This action might not be possible to undo. Are you sure you want to continue?

**Unicode Nearly Plain-Text Encoding of Mathematics
**

Murray Sargent III

Office Authoring Services, Microsoft Corporation 28-Aug-06 1. Introduction ............................................................................................................ 2 2. Encoding Simple Math Expressions ...................................................................... 3 2.1 Fractions .......................................................................................................... 4 2.2 Subscripts and Superscripts........................................................................... 6 2.3 Use of the Blank (Space) Character ............................................................... 7 3. Encoding Other Math Expressions ........................................................................ 8 3.1 Delimiters ........................................................................................................ 8 3.2 Literal Operators ........................................................................................... 10 3.3 Prescripts and Above/Below Scripts .......................................................... 11 3.4 n-ary Operators ............................................................................................. 11 3.5 Mathematical Functions ............................................................................... 12 3.6 Square Roots and Radicals ........................................................................... 13 3.7 Enclosures ..................................................................................................... 13 3.8 Stretchy Characters....................................................................................... 15 3.9 Matrices ......................................................................................................... 16 3.10 Accent Operators ....................................................................................... 16 3.11 Differential, Exponential, and Imaginary Symbols ................................. 17 3.12 Unicode Subscripts and Superscripts ...................................................... 17 3.13 Concatenation Operators .......................................................................... 18 3.14 Comma, Period, and Colon ........................................................................ 18 3.15 Space Characters ....................................................................................... 19 3.16 Ordinary Text Inside Math Zones ............................................................ 20 3.17 Phantoms and Smashes ............................................................................ 21 3.18 Arbitrary Groupings .................................................................................. 21 3.19 Equation Arrays ......................................................................................... 22 3.20 Math Zones ................................................................................................. 22 3.21 Equation Numbers .................................................................................... 22 3.22 Linear Format Characters and Operands ................................................ 23 3.23 Math Features Not In Linear Format ....................................................... 26 4. Input Methods ...................................................................................................... 26 4.1 ASCII Character Translations ....................................................................... 26 4.2 Math Keyboards ............................................................................................ 27 4.3 Hexadecimal Input ........................................................................................ 28 4.4 Pull-Down Menus, Toolbars, Right-Mouse Menus ..................................... 28

Unicode Technical Note 28

1

Unicode Nearly Plain Text Encoding of Mathematics

4.5 Macros ............................................................................................................ 28 4.6 Linear Format Math Autocorrect List.......................................................... 29 4.7 Handwritten Input ........................................................................................ 29 5. Recognizing Mathematical Expressions ............................................................. 29 6. Using the Linear Format in Programming Languages ...................................... 31 6.1 Advantages of Linear Format in Programs ................................................. 32 6.2 Comparison of Programming Notations ..................................................... 33 6.3 Export to TeX ................................................................................................. 35 7. Conclusions ........................................................................................................... 36 Acknowledgements ..................................................................................................... 37 Appendix A. Linear Format Grammar ....................................................................... 37 Appendix B. Character Keywords and Properties .................................................... 39 References .................................................................................................................... 47

1. Introduction

Getting computers to understand human languages is important in increasing the utility of computers. Natural-language translation, speech recognition and generation, and programming are typical ways in which such machine comprehension plays a role. The better this comprehension, the more useful the computer, and hence there has been considerable current effort devoted to these areas since the early 1960s. Ironically one truly international human language that tends to be neglected in this connection is mathematics itself. With a few conventions, Unicode1 can encode many mathematical expressions in readable nearly plain text. Technically this f “ao o lr r r ikm mg e a ah d t tt u” iy p sm f a l ; h u a format is linear, but it can be displayed in built-up e sr n el c o” e f The t“ h n e e y . presentation form. To distinguish the two kinds of formats in this paper, we refer to the nearly plain-text format as the linear format and to the built-up presentation format as the built-up format. This linear format can be used with heuristics based on the Unicode math properties to recognize mathematical expressions without the aid of explicit math-on/off commands. The recognition iec s do f be a y’ c U i n l i i t a t d s 2 Alternatively, the linear format can be strong support for mathematical symbols. used in “ e tl u with on-off characters m sll s a ec b r t xo y either h pn t z lt h o ir e nc o ” i e e yd as used in TeX or with a character format attribute in a rich-text environment. Use of math zones is desirable, since the recognition heuristics are not infallible. The linear format is more compact and easy to read than [La]TeX,3,4 or MathML.5 However unlike those formats, iattempt to include all typographt d o e s n ’ t ical embellishments. Instead we feel itt some embellishments in ’o s h ua sn e d f l u e l the higher-level layer that handles rich text properties like text and background colors, font size, footnotes, comments, hyperlinks, etc. In principle one can extend the 2

Unicode Technical Note 28

Unicode Nearly Plain Text Encoding of Mathematics

notation to include the properties of the higher-level layer, but at the cost of reduced readability. Hence embedded in a rich-text environment, the linear format can faithfully represent rich mathematical text, whereas embedded in a plain-text environment it lacks most rich-text properties and some mathematical typographical properties. The linear format is primarily concerned with presentation, but it has some semantic features that might seem to be only content oriented, e.g., n-aryands and function-apply arguments (see Secs. 3.4 and 3.5). These have been included to aid in displaying built-up functions with proper typography, but they also help to interoperate with math-oriented programs. Most mathematical expressions can be represented unambiguously in the linear format, from which they can be exported to [La]TeX, MathML, C++, and symbolic manipulation programs. The linear format borrows notation from TeX for matl an lom eo h o tdv aa r, e b d t s mi n m j oh w aao a c ne e tlt tt ’m lh la is t sl e n i c tl e t t a he e e c it an e.g., for matrices. A variety of syntax choices can be used for a linear format. The choices made in this paper favor a number of criteria: efficient input of mathematical formulae, sufficient generality to support high-quality mathematical typography, the ability to round trip elegant mathematical text at least in a rich-text environment, and a format that resembles a real mathematical notation. Obviously compromises between these goals had to be made. The linear format is useful for 1) inputting mathematical expressions,6 2) displaying mathematics by text engines that cannot display a built-up format, and 3) computer programs. For more general storage and interchange of math expressions between math-aware programs, MathML and other higher-level languages are preferred. Section 2 motivates and illustrates the linear format for math using the fraction, subscripts, and superscripts along with a discussion of how the ASCII space U+0020 is used to build up one construct at a time. Section 3 summarizes the usage of the other constructs along with their relative precedences, which are used to simplify the notation. Section 4 discusses input methods. Section 5 gives ways to recognize mathematical expressions embedded in ordinary text. Section 6 explains how Unicode plain text can be helpful in programming languages. Section 7 gives conclusions. The appendices present a simplified linear-format grammar and a partial list of operators.

**2. Encoding Simple Math Expressions
**

Given U s p as to ASCII, how much nt o t2 relative i o rh c nte o g fm d s ra e u mi ’ p s r o t c better can a plain-text encoding of mathematical expressions look using Unicode? The most well-known ASCII encoding of such expressions is that of TeX, so we use it

Unicode Technical Note 28

3

Unicode Nearly Plain Text Encoding of Mathematics

for comparison. MathML is more verbose than TeX and some of the comparisons atNaXm cea p ao d p a i n p st i h l n c l w i g e s te y et Tn u h n t l s eo c s d o l t ’esc i . n s ns i wn h e e engineering communities, a casual glance at its representations of mathematical expressions reveals that they do not look very much like the expressions they repr not easy to make algebraic calculations by hand d i e s e n t . I t ’ s in rg e T c e t X l ’ y s u s notation. With Unicode, one can represent mathematical expressions more readably, and the resulting nearly plain text can often be used with few or no modifications for such calculations. This capability is considerably enhanced by using the linear format in a system that can also display and edit the mathematics in built-up form. The present section introduces the linear format with fractions, subscripts, and superscripts. It concludes with a subsection on how the ASCII space character U+0020 is used to build up one construct at a time. This is a key idea that makes the linear format ideal for inputting mathematical formulae. In general where syntax and semantic choices were made, input convenience was given high priority.

2.1

Fractions

One way to specify a fraction linearly is LaT e X ’ s \frac{numerator}{denominator}. The { } are not printed when the fraction is built up. These simple rules immediately ga h ii a vn t e ti a x unambiguous, but looks quite different from the corres“t p” lt es ponding mathematical notation, thereby making it harder to read. Instead we define a simple operand to consist of all consecutive letters and decimal digits, i.e., a span of alphanumeric characters, those belonging to the Lx and Nd General Categories (see The Unicode Standard 4.0,1 Table 4-2. General Category). As such, a simple numerator or denominator is terminated by most nonalphanumeric characters, including, for example, arithmetic operators, the blank (U+0020), and Unicode characters in the ranges U+2200— U+2500— and U+23FF, U+27FF, U+2900— The fraction operator is given by the usual solidus / (U+002F). U+2AFF. So the simple built-up fraction 𝑎 𝑏 𝑐 . 𝑑 appears in linear format as abc/d. To force a display of a normal-size linear fraction, one can use \/ (backslash followed by slash). For more complicated operands (such as those that include operators), parentheses ( ), brackets [ ], or braces { } can be used to enclose the desired character combinations. If parentheses are used and the outermost parentheses are preceded and followed by operators, those parentheses are not displayed in built-up form, since usually one does not want to see such parentheses. So the plain text (a + c)/d displays as 𝑎 + 𝑐 . 𝑑

4

Unicode Technical Note 28

this approach leads to plain text that is easier to read than LaT. while TeX requires { }’ eloutermost parentheses. and the circled l1 a 5 s (\ldiv) h ” U + 2 slash ⊘ \ndiv) builds up a small numeric fraction (although characters (U+2298. Contrast this with the MathML version. \frac{a + c}{d}. which (with no parentheses) reads as <mfrac> <mrow> <mi>a</mi> <mo>+</mo> <mi>c</mi> </mrow> <mi>d</mi> </mfrac> Three built-unna e s 0 p v ab ra 4 fa rlas 4 r r ee ch ai a toU ca vh n + tt i “ l ii l f oo a s : t” i 2 (which one might input by typing \sdiv) builds up to a skewed fraction. ho T ef o d the fi os rp c a within parentheses. For example. 3. e . 𝑑 A really neat feature of this notation is that the plain text is. the binomial theorem s ‘ ¦ ’ (\atop). 𝑑 𝑏 + 𝑓 𝑑 𝑒 + 𝑓 + 𝑐 + 𝑓 𝑒 𝑒 The same notational syntax i a h aw s “ i fi u s c rt s t h ah ea i cn dcs to fk l i o” io r wk n e fraction bar. in turn. parentheses are not needed. The stack is used to create binomial coefficients and the stack operator i For example. in fact. one encloses them. often a legitimate mathematical notation in its own right. The three kinds of built-up fractions are illustrated by 𝑎 𝑎 𝑎 𝑑 𝑏 𝑏 + 𝑐 𝑐 + . which then become the outermost parentheses. ((a + c))/d displays as 𝑎 + 𝑐 .4 for a discussion of the n-aryand “ g l u e ” operator ▒ ) (a + b)^n = ▒ k) a^k b^(n-k). e X ’ s . since in many cases. so it is relatively easy to read.Unicode Nearly Plain Text Encoding of Mathematics In practice. g . 𝑛 𝑎 + 𝑏 = 𝑛 𝑘 =0 𝑛 𝑘 𝑛 − 𝑘 𝑎 𝑏 𝑘 in linear format reads as (see Sec. . to hn e “ d i v i s i s 2 builds up to a potentially large linear fraction. s ty . other than digits can be used as well). ∑ (n ¦ _ ( k=0)^n Unicode Technical Note 28 5 .

For example. This refinement obviates the need for many overriding parentheses. 2. Compound subscripts and superscripts include expressions within parentheses. Parentheses are needed for constructs such as a subscripted superscript like 𝑏 𝑏 𝑐 𝑎 is given by a^(b_c). as well as the invisible comma.e. The summation limits use the subscript/superscript notation discussed in the next subsection. thereby yielding a more readable linear-format text (see Sec. since a^b_c displays as 𝑎 a_c^b). In TeX. then the comma or period appears on line. 𝜈 . a pair of subscripts. Typically the base is just a single math italic character like the a in these examples. which (as does 𝑐 up program is responsible for figuring out what the subscript or superscript base is. case). As an example of a slightly more complicated example. But it could be a bracketed expression or the name of a mathematical function like sin as in sin^2 x. Similarly. if a subscript operand is followed directly by a comma or a period that is. So 𝛿 is written as 𝛿 In addition it is _(𝜇 + 𝜈 ).5 for more discussion of this (see Sec. 𝜇 +𝜈 worthwhile to treat two more operators. ^3𝛽where _𝛿 1 𝜌 2.2 Subscripts and Superscripts S as ai e q a u nc br ’ i l b dr i be e . which renders as sin2 𝑥 3. which works using right-to-left associativity. So a^b means 𝑎 enhancement for . a_b_c stands for 𝑎 . A simple subscript operand consists of the string of one or more characters with the General Categories Lx (alphabetic) and Nd (decimal digits). In Indic and other cluster-oriented scripts the base is by default the cluster preceding the subscript or superscript operator. Another kind of compound subscript is a subscripted subscript.g. e. followed by whitespace. we introduce a subscript by a subscript operator. The build. A nice a text processing system with build-up capabilities is to display the _ as a small subscript down arrow and the ^ as a small superscript up arrow. which we display as the ASCII underscore _ as in TeX. i. 𝑏 which we display as the ASCII ^ as in TeX. such as 𝛿 𝜇 𝜈 is written as 𝛿 _𝜇 superscripts are introduced by a superscript operator. is treated as the operator that terminates the subscript.. and curly braces. in order to convey the semantics of these build-up operators in a math context. 𝛿which can be written with the linear format 𝑊 1 𝜍 Unicode 1 𝜌 1 𝜍 2 numeric subscripts are used. square brackets. Specifically. in special ways. It can also be an operator. one types 6 Unicode Technical Note 28 . 3.. the comma and the period. consider the expression 3𝛽 𝑊 .14 for more discussion of comma and period). as in the examples +1 and =2. i Specifically. However a comma or period followed by an alphanumeric is treated as part of the subscript. s s tt u sr c us t tte r p a r t la i e ri hl d p r ec y u b t s i p k e r te . in turn.Unicode Nearly Plain Text Encoding of Mathematics where (n ¦ k) is the binomial coefficient for the combinations of n items grouped k at a time. Similarly a^b^c stands for 𝑏 𝑐 𝑐 𝑏 𝑎 .

TeX becomes cumbersome for longer equations such as 2𝛽 1𝛽 ′ 𝛼 𝑈 𝑈 2 − 𝛼 1 𝛿 2 𝜌 1 𝜍 2 3𝛽 3𝛽 1 𝜌 1 ′ 𝑊 𝑈 2 = + 𝑑 0𝛽 𝛼 2 𝛿 𝜌 𝛿 𝜍 1 𝜌 1 1 2 1 8𝜋 𝑈 𝛼 1 𝜌 𝜍 1 2 A linear-format version of this reads as W 3 ^ ^αU -α _β 3 22▒ [ β δ = β ∫_α’^ U+ δ/ 1 8 d1ρ α1 2 _’ δ2 1ρ _ 1 1ς 1ρ π^ ( 2^ 1 2 U )0 _/ β ρU ] 1 2^ β _ 1ς ρ 2^ ς 1 while the standard TeX version reads as $$W_{\delta_1\rho_1\sigma_2}^{3\beta} = U_{\delta_1\rho_1}^{3\beta} + {1 \over 8\pi^2} \int_{\alpha_1}^{\alpha_2} d\a l p h a _ 2 ’ \left[ {U_{\delta_1\rho_1}^{2\beta} . As a practical matter.\a l p h a _ 2 ’ U_{\rho_1\sigma_2}^{1\beta} \over U_{\rho_1\sigma_2}^{0\beta}} \right] $$ . ₂ h ³ i ) w l e reads as $$\alpha_2^3 \over \beta_2^3 + \gamma_2^3$$· The linear-format text is a legitimate mathematical expression. since 2 { }3 $ β o }_{δ r superscripts. while the TeX version bears no resemblance to a mathematical expression.Unicode Nearly Plain Text Encoding of Mathematics $W^{3\beta}_{\delta_1\rho_1\sigma_2}$ The TeX version looks simpler using Unicodey m{ f b l }_{δ o oy r l $ _1 ts W h . ^ en 3 sa β m e ρ $ 1ρ Unicode has a full set of decimal subscripts and _ςW 1ς _ ^ 2}$. so the major simplification is that fewer brackets are needed. For the ratio 3 𝛼 2 3 3 𝛽 2 + 𝛾 2 the linear-format text can read as 𝛼+ 𝛾 the standard TeX version ₂ ₂ ³ ³ /( 𝛽. However the space character is very useful for delimiting the operands of the linear-format nota- Unicode Technical Note 28 7 .3 Use of the Blank (Space) Character The ASCII space character U+0020 is rarely needed for explicit spacing of builtup text since the spacing around operators is should be provided automatically by the math display engine (Sec. 3. numeric subscripts are typically entered using an underscore and the number followed by a space or an operator. 2.15 discusses this automatic spacing).

and parentheses ( ) represent themselves in the Unicode plain text. et a’ ri fs ot r hat m s wn r l d i s ht f no t or keywords do ree erkn o r l at d e e i t a la a n b ti n se a ia ho s’ p m h er t m d sult in spacing. subscripts and superscripts in the linear format and gives a feel for that format. Encoding Other Math Expressions The previous section describes how we encode fractions. A delimited pair need not consist of the same k eF l t i l} and one il o e o t o ni r. two spaces are needed if spaces are used for build up. f w iIn i us s d refer to such characters as delimiters.14). Delimiters are called fences in MathML. In a nested subscript/superscript expression. Additional spacing characters are discussed in Sec. it is eliminated upon build up.n n a l Ac T e ie e g n a in TeX. the space builds up one script at a time. 3. If a space precedes the numerator of a fraction. and colon in various contexts (see Sec. For example. The current section describes how we encode other mathematical constructs using this approach and ends with a more formal discussion of the linear format.3). The closing delimiter can have a subscript and/or a superscript. Similarly a_1 b_2 builds up as a1b2 with no intervening space. o [e d iei p a w s t x’ e n i oe a f n d t fr m i w c d. p e m s t s n h s h sees this usage in some mathematical documents. period. itta h b t e e wα t a y o s i hrh f a g pm o se e a hc l l o e t se ee h l d l s \alpha. One displayed use for spaces is in overriding the algorithm that decides that an asir + nl a e m u at oao s bnr r − . the space is eliminated since it may be necessary to delimit the start of the numerator.5) or before above/below scripts (see Sec. Another example is that a space following the denominator of a fraction is eliminated.e a iay l i f be g r oi s f y t u y p e uo o/ e ub a rrw p n o k y dc I l . since it causes the fraction to build up. it is eliminated since it delimits the start of those constructs. the space character is also used to delimit control words like \alpha and does not appear in built-u d b e a p i eX n ff t ’ d of w s t r r eu h m e es e . to build up a^b^c to ab . and a word processing system capable of displaying built-up formulas should be able to enlarge tt h e general we ha a t er th mo ’ e t n im o dn . blanks are invariably eliminated in built-up display. When the space plays this role. h o s e i t ia eo p cdn h s y s e o t r stda r no y p d S a a s ba ai p l t i e n cs a s o d bd ep c re ir a en e r ’ . 3. Some other operator like + builds up the whole expression.1 Delimiters Brackets [ ]. c 3. since the operands are unambiguously terminated by such operators. 3.15. 8 Unicode Technical Note 28 . 3. Similarly if a space is used before a function-apply construct (see Sec. 3.e l used to obtain the correct spacing around comma. braces { }.Unicode Nearly Plain Text Encoding of Mathematics tion. In TeX. So if you type \a w pta ieee l ea αcn np p dc .

The evenness of its count at any given bracket nesting level typically determines whether the vertical bar is a close |. TeX uses \left and \right for this purpose instead of \open and \close. Some parsed without the clarifying parentheses by noting that a vertical bar | directly following an operator is an open |.19 (&x" 0 @ − &x" for a discussion of the equation-array operator █ ). −0 𝑥 if 𝑥 < which has the linear format “ |x| = {█ if "x ≥ if "x < 0)┤" (see Sec. e “ hh o x ] ie r p a + b[ a a“ ” a blank folr e s ” s r├]a+b┤[ s . but the flexibility can be worth the compromise especially when interoperating with ordinarily built-up text such as in a WYSIWYG math system. l nr o i m o followed by a normal delimiter without interpreting the pair as a single delimiter. The usage of open and close delimiters in the linear format is admittedly a compromise between the explicit nature of TeX and the desire for a legitimate math notation. The open/close delimiters can be used to overrule the normal open/close character of delimiters as in the admittedly strange. But the example |a|b− the clarifying pac|d| needs rentheses since it can be interpreted as either (|a|b) |a(|b− usual − ( c|d|) or c|)d|. such as this one. the next a close | (unless following an operator). the built-up expression ||x| .|y|| can have the linear format |(|x|− cases. Box drawings characters are used for tc iur t a a hl t se o s t e se en b mi oer tt e aa p d b ei u tl e e e y es hc n l c ale eh /i a o s hl m ’ k y d mac racters and they are readily available in fonts. follow the├ by a space to force the open-delimiter interpretation. defining the corresponding end of a delimited expression. tf i w o n cl a hi t . We use the latter since they apply to right-to-left mathematics used in many Arabic locales as well as to the usual left-to-right mathematics. the special keywords \open and \close can be used. the next an open |. Absolute values are represented by the ASCII vertical bar | (U+007C).Unicode Nearly Plain Text Encoding of Mathematics These choices suffice for most cases of interest. A common use of t“ a as hc t ia i s so ien s ” . but nevertheless sometimes used. But to allow for use of a delimiter without a matching delimiter and to overrule the open/close character of delimiters. respectively. If a├ needs to be treated as an empty open delimiter when it appears before a delimiter like | or ]. If used before any ct hh aa rt ai cs tn e’ r t a delimiter of the opposite sense. such te hq eu 𝑥 if 𝑥 ≥ 0 |𝑥 |= . Nested absolute values can be handled unambiguously by discarding the outermost parentheses within an absolute value. the first appearance is considered to be an open | (unless subscripted or superscripted). These translate to the boxdrawings characters├ and ┤.18 on how to make arbitrary groupings. For example. The Unicode Technical Note 28 9 . See also Sec. Specifically. can be |y|)|. 3. and so forth. 3. he n m Note that l oo tes ae b o psen tnl e w eer”o o i i n di Ta p t n oe “hlee g r iei w dt a c i n l l a i m t s s . the open/close delimiter acts as an invisible delimiter.

10 Unicode Technical Note 28 . with no attempt by the build-up software to match a corresponding right square bracket. even a char ‘ dh a + ie c ’ s s in linear format as in built-up form t. but it supports the vertical bar separator \vbar. s c. in expressions like {𝑥 0}. 3. etc. This separator also grows in size to match the | 𝑓 𝑥 = surrounding brackets and is spaced as a relational. The vertical bar separator grows in size to match the size of the surrounding brackets. integral. parentheses. Such quoted operators are automatically included in the current operand. One case using U+007C as a separator that can be deciphered is that of the form (a|b). it is treated as a script base character. And one might want to interpret the | in (a|b) as an open delimiter w h o seh edeie n er e t. In fact. But (a|b|c) interprets the vertical bars as the absolute value.2 Literal Operators Certain operators like brackets. 𝜓 where the vertical bars can be input using the ASCII vertical bar. Another case where we treat | as a close delimiter is if it is followed by a space (U+0020). plays a role in the linear format in that it terminates an operand. The Unicode norm delimiter U+2016 (‖ has the same open/close or \norm) definitions as the absolute value character | e ty r x i cd ct o t e’ no p a sb tl i e tw d h ae a se s a delimiter. If a | is followed by a subscript and/or a superscript and has no corresponding open |. the quantum mechanical density operator ρ the definition has 𝜌 𝜓 = 𝑃 𝜓 𝜓 . Another common separator is the \mid character ∣ commonly used (U+2223). t cd d w(mI h o i e hi as ) rn l into ar g i l ’ h p s sc i t ye r tpl t e o .a d o at T os hrh o en ro t formalize the comma separators of function arguments (MathML does). have special meaning in the linear-format notation. i. This handles the important case of the bra vector in Dirac notation. superscript. (├|b). We tried using the ASCII | (U+007C) for this purpose too. So \[ is displayed e po “er lr w ia h tt e .e. Delimiters c aa i hf e a vtn e r n n e r t lm ’ a ssh i a l e w m et s p i ... subscript. For example. but the resulting ambiguities are insurmountable in general.Unicode Nearly Plain Text Encoding of Mathematics algorithm gives the former. m et f cede the | by├.e. one can type |(a|b− c|d)|. so if one wants the latter without the inner parentheses. i. as an ordinary left square bracket. p a ew lm r ha e l iy glyph i cs k h t e (aside from a possible size reduction).. Its built-up size should be the height of the integral sign in the current display/inline mode. To remove the linear-format role of such an operator. which is represented by the box drawings light vertical character│(U+2502). not a delimiter. where a and b are mathematical expressions. braces. we precede it b ro i y arc tl”h h o fthe backslash \ is handy.

So in display mode. Pr.. lim inf. 3. the belowscript examples are shown here in-line with the script below.8 for some examples.10). lim𝑛 can also 𝑎 → ∞ 𝑛 be represented by lim_(n→∞ ) a_n. When inline. the n-aryand is the integrand. For this (_c^b)a creates the prescripted b variable c a. When entered with ┬ and ┴. they remain below and above scripts in-line.4 n-ary Operators n-ary operators like integral. Hence the expression lim𝑛 can be represented by lim┬ (n→∞ the operations det. max. since the control word typically looks like English. lim sup. this is done by following the sub/superscripted n-ary operator by the naryand concatenation operator \naryand (▒ ) which is U+2592. Below scripts and above scripts are represented in general by the line drawing operators \below (┬) and \above (┴). and sup are common. as are the limits for n-ary operators. gcd. subscripts and superscripts that precede their base. min. inf. summation and product are sub/superscripted or above/below operators that have a ue d i tm“” n h e rr r i n y ta r t a hl d : ne at rh g n-ao g . their below scripts are also accessible by the usual subscript operator _. In the ol t e f t ga i su i rn c . lf ad p i t t p s ut o h h e p ud e ’ y linear format. which is a little easier to type than lim┬(n→∞ ) a_n. 3.and abovescripts entered with _ and ^ are shown as subscripts and superscripts. Using a single character has the advantage of being globalized. 3. respectively. 3. 𝑎 ) a_n. although a control word like \open may be used to input the character. Users can define other control words that look like words in other languages just so long as they map into the appropriate operator characters. If an above/below operator or a subscript/superscript operator is preceded by an operator. that is. A slight exception to the single-character operator rule occurs for accent operators that are applied to two or more characters (see Sec. and for the summation. below. For these the accent combining mark may be preceded by a no-break space for the sake of readability.3 Prescripts and Above/Below Scripts A special parenthesized syntax is used to form prescripts. ordinarily this choice is only for display-mode math. Variables can have both prescripts and postscripts (ordinary sub/superscripts). i t ’ s the summand. respectively.t Fe . Since → ∞ 𝑛 lim. Another advantage of using operator characters rather than control words is that the build-up processing is simplified and therefore faster.Unicode Nearly Plain Text Encoding of Mathematics Linear format operators always consist of a single Unicode character. Although for illustration purposes. See Sec. For both t i mrs i e yc a o ses pa n s e n n-aryands. The operand that Unicode Technical Note 28 11 . that operator becomes the base.

3. I ib wi ic y n t l t t ln a ’e h h e A ds t e e F p dd o m I u p ie f w v t ts o i i l ts i o r l hio na o nnl b operator U+2061 (\funcapply).16). ” Sometimes one wants to control the positions of the limit expressions explicitly as T i e (upper limit above. when it To delimit more complicated n-aryands without using parentheses or brackets of some kind. This is a special binary operator and the operand that follows it is the function argument. To this end. if the n-ary operator is followed by the digit 1. this operator 12 Unicode Technical Note 28 . In converting to built-up form.5 Mathematical Functions Mioa m c “ l aans et s d tls t to i b h fs i i s ” e eu ug c l s m n co f i h at ti c c hn u k o r r nn one u i recognized as such and not italicized. they are displayed as superscript and subscript. the number can be one of the first four of the following along with neither or one of the last two nLimitsDefault nLimitsUnderOver nLimitsSubSup nUpperLimitAsSuperScript fDontGrowWithContent fGrowWithContent 0 1 2 3 64 128 3. the linear-format expression ∫x/(x^2+a^2) has the built up form _ 0 ^a▒ xⅆ 𝑎 0 𝑥 𝑑 𝑥 2 + 𝑎 2 𝑥 where xⅆ x/(x^2+a^2) is the integrand and ⅆ is the Unicode differential character U+2146.18) delimiters. This ao yeh lo is one-half b lt s li o of xdx ii e in m av nk t 0 sa te e t ie e “ r sd nTa b m bc e f t s g r s q u a r e d .Unicode Nearly Plain Text Encoding of Mathematics follows this operator becomes the n-aryand. use the \begin \end (〖〗see Sec. Notice that the ⅆ automaticharacter cally leads to a small space between the 𝑥𝑑 default displays as a and the 𝑥 and by math-italic 𝑑appears in a math zone. Since \n nse alias \of can be used. which disappear on build up. 3. the outermost parentheses of a n-aryand are not removed on buildup. since parentheses are commonly used to delimit compound n-aryands. d m t the i ov s i i so works as an alias for \funcapply in math function contexts (see Sec. the limit expressions are displayed above and below the n-ary operator and if followed by the digit 2. As such they are treated as ordinary text (see Sec. 3. This ala ’t n rt i a y tn m ah t e neu . lower below) and \nolimits (upper nX u’ s \limits i n g s limit as superscript and lower as subscript) control words. Unlike with the fraction numerator and denominator.5). More completely. For example.

le i y mt i apply contexts. since parentheses are commonly used to delimit function arguments. 3.7 Enclosures To enclose an expression in a rectangle one uses the rectangle operator ▭ (U+25AD) followed by the operand representing the expression. For example. In general. This syntax is simi2 lar to that for the square root. e.g. For example ▭ ^2) displays as 𝐸 The (𝐸 = 𝑚 𝑐 = 𝑚 𝑐 . w y o l io hn 3. can g c tsi vei i o se s t t “ v n e. use the〖 〗delimiters (\begin \end) which disappear on build up. m uli i p pm n t l o ”. i. “tin lhs t i ere m sp b sp r y u ae a pcs ”e e . Anything following the closing parenthesis is not 𝑛 part of the radicand. ∙ \cbrt). 3. \of autocorrects to ▒ ( U+2592— see Sec.6 Square Roots and Radicals Square. If the Function Apply operator is immediately followed by a subscript or superscript expression. If b r e i c m f a. ie e T e v cT s d n h i eo h m bc e t nn i oy s t id a t s l h e u li n e i n s s aa e k sine of 2x equals twice the sine of x times the cosine of x” = 2 sin 𝑥 .. h et t n n s t e dc h ’ t ah i u o es o For example.e.. 3. sin 𝑥 sin 𝑥 get sin 𝜔 where 𝑡 the 𝑏 means × 𝑏 .4). If a function name has a space in it. For example. sin 2𝑥 cos 𝑥 . same approach is used to put an overbar above an expression. and √ \qdrt).15. but context \naryand. use the operator ▁ (U+2581). which display as 𝑎 𝑏+ 𝑑 ( a+b) 𝑏 . 3 are √ and ∙ abc.Unicode Nearly Plain Text Encoding of Mathematics transforms its operands into a two-argument object that renders with the proper spacing for mathematical functions. 𝑎 + . (U+221A. Since \f i ov\of can be used in functionu s se nn t n c’ i a at n m pt t e ph u . √ (c+d). ed no-break space (U+00A0) as described in Sec. 0 𝑡 is part of argument. the nt d p b r e h i r ye √ r ae as ( o ls ni n&a). namely follow the overbar operator ¯ (U+00AF) by the desired operand. (𝑛displays as 𝑎 &𝑎 + 𝑏 ) + 𝑏 . For an underbar. that expression should be applied to the function name and the Function Apply operator moved passed the modified name to bind the operand that follows as the function argument. o in e l ts t x i r re p k a ed c e s o n where a is the complete radicand. To − 𝜔 . and 𝑐 𝑐 . respectively. the function sin2 x falls into this category. Examples (U+221C. These operators include the operand that follows. To delimit a more complicated arguments without using parentheses or brackets of some kind. Unicode Technical Note 28 13 . If an ordinary ASCII space were u ubf f s lutu ed i e c d id “t . or enclose the argument in〖 〗delimiters. re yr o pu r a u a n n rn g c s n c t i cu k e dl e s t m e d t dn e i e t . (U+221B. cube. one can use sin\funcapply(\omega-\omega_0)t. Unlike with the fraction numerator and denominator. the outermost parentheses of the second operand of the function-apply operator are not removed on buildup. and quartic roots can be represented by expressions started by the corresponding Unicode radical characters √ \sqrt).

and ellipse can be encoded as for the rectangle operator but using appropriate Unicode characters (not yet chosen here). which are represented as discussed in the previous Section. size style. but at least the choice can be preserved in the linear format. and enclosure forms defined by the MathML <menclose> element. horizontal. and diagonal strikeouts. circle.Unicode Nearly Plain Text Encoding of Mathematics In general the rectangle function can represent any combination of borders. spacing category. one Size can be a option and any flags in the following table: nAlignBaseline nAlignCenter nSpaceDefault nSpaceUnary nSpaceBinary nSpaceRelational nSpaceSkip nSpaceOrd nSpaceDifferential nSizeDefault nSizeText nSizeScript nSizeScriptScript 0 1 0 4 8 12 16 20 24 0 32 64 96 14 Unicode Technical Note 28 . where U+25A1 and 𝑛 combination of one Align option. is ▭ (𝑛 &𝑥 where 𝑛 consisting of any combination of the following flags: is a mask fBoxHideTop fBoxHideBottom fBoxHideLeft fBoxHideRight fBoxStrikeH fBoxStrikeV fBoxStrikeTLBR fBoxStrikeBLTR 1 2 4 8 16 32 64 128 It is anticipated that the enclosure format number n is chosen via some kind of friendly user interface. long division. This is defined by □(𝑛 □ is &𝑥 ). one Space option. and other properties. Other enclosures such as rounded box. Note that the overbar function can also be given by ▭ the underbar by (2&𝑥 ) and ▭ (8&𝑥 ). except for roots. An abstract box can be put around an expression x to change alignment. actuarial. The general syntax for enclosing an expression 𝑥 ). vertical.

The horizontal stretchable brackets are given in the following table U+23DC ⏜ U+23DD ⏝ U+23DE ⏞ U+23DF ⏟ U+23E0 ⏠ U+23B4 ⎴ U+23B5 ⎵ There are many other characters that can stretch horizontally to fit text. respectively. 3.8 Stretchy Characters In addition to overbars and underbars. stretchable brackets are used in mathematical text. Illustrating the linear format for these four cases with the stretchy character → and the text 𝑎have + 𝑏 . such as various horizontal arrows. Here the subscript and superscript operators are used for convenient keyboard entry (and compatibility with TeX). we (𝑎 𝑎 + 𝑏 + 𝑏 )┴ → (𝑎 𝑎 + 𝑏 + 𝑏 )┬ → → ┴ 𝑎 + 𝑏 →) ┬ (𝑎 + 𝑏 𝑎 +𝑏 >0 𝑎 +𝑏 Unicode Technical Note 28 15 . respectively.3’ s b elow/abovescript operators. F ld a r s oe en a r. There are four configurations: a stretch character above or below a baseline text. r d c ehb “e x er o” a “ cv a m u eer pn ”re t a ba 𝑘 times 𝑥 +⋯ + 𝑥 𝑥 + 𝑦 + 𝑧 The linear formats for these are ⏞ (x+⋯"times") and ⏟ +x)^(k (x+y+z)_(>0). and text above or below a baseline stretched character. one can also use Sec.Unicode Nearly Plain Text Encoding of Mathematics fBreakable fXPositioning fXSpacing 128 256 512 3.

the following table \hat \check \tilde \acute U+0302 U+030C U+0303 U+0301 𝑎 𝑎 𝑎 𝑎 16 Unicode Technical Note 28 .12). where n is the maximum number of elements in a row and m is the number of rows. which can be typed in using the ASCII apostrophe '. Here for convenience.9 Matrices M r n ort a xa retyo m t eda sT e r p bi me l i r y n i’ a c e a va . This causes exp1 to be aligned over exp n-1. to build up an n×m matrix array. 3.10 Accent Operators ′ Mathematics often has accented characters.. but in a math zone it gets translated into U+2217. except for the last row which is terminated by the closing paren. Simple primed characters like 𝑎 are represented by the character followed by the Unicode prime U+2032. Unicode has multiple prime characters that render with somewhat different spacing than concatenations of U+2032. The combining marks are found in the Unicode ranges U+0300— U+036F and U+20D0 – The most common math accents are summarized in U+20FF. In addition. As an example. the user has to include it in a superscript or subscript expression. etc. The primes are special in that they need to be superscripted with appropriate use of heavier glyph variants (see Sec.n e s ner n e s e a t t i Xy e o ls pression of the form ■ (exp1 [& exp2] n-1 [& expn] … @… exp … ) where ■ is the matrix character U+25A0 and @ is used to terminate rows. The matrix is constructed with enough columns to accommodate the row with the largest number of entries. which is placed on the math axis as the +. Double primed characters have two Unicode primes. a*2 has the linear format version a^*2 or a^(*2). the asterisk is treated as an operand character if it follows a subscript or superscript operator. with rows having fewer entries given sufficient null entries to keep the table n×m. The ASCII asterisk is raised in ordinary text. Other kinds of accented characters can be represented by Unicode combining mark sequences. For example. etc. ■ &𝑑 as (𝑎 ) displays &𝑏 @𝑐 𝑎 𝑏 𝑐 𝑑 3. To make it a superscript or subscript.Unicode Nearly Plain Text Encoding of Mathematics 3.

U+00B9. respectively. The linear format treats these symbols as operand characters. respectively. U+2070— and a similar set of subscripts (U+2080— that U+207F) U+208F) should be rendered the same way that scripts of the corresponding script nesting level would be rendered. ⅈ . ⅈ .11 Differential. d. as defined. spans of which combine into the corresponding superscripts or subscripts when built up. They have the ⅆ . Since this construct looks funny when rendered by plain-text programs. i. ⅇ . and Imaginary Symbols Unicode contains a number of special double-struck math italic symbols that are useful for both typographical and semantic purposes. meanings of differential. imaginary unit. Special cases of this notation include overscoring (use U+0305) and underscoring (use U+0332) mathematical expressions. ⅆ . but in regular US technical publications. ⅇ . For example. In US patent applications these characters should be rendered as ⅅⅉ . 3. they are sometimes rendered as upright characters. In European technical publications. 3. (𝑎 + 𝑏 as 𝑎 built ) ̂ + 𝑏 renders when up. See Sec. these characters can be treated as high-precedence operators. and j (ⅅ ⅉ . Exponential.12 Unicode Subscripts and Superscripts Unicode contains a small set of mostly numeric superscripts (U+00B2. but the display routines should provide the appropriate glyphs and spacings.4 for an example of an integral using ⅆ . 3. a no-break space (U+00A0) can appear in between the parentheses and the combining mark. These are U+2145— U+2149 for double-struck D. differential.Unicode Nearly Plain Text Encoding of Mathematics \grave \dot \ddot \dddot \bar \vec U+0300 U+0307 U+0308 U+20DB U+0304 U+20D7 𝑎 𝑎 𝑎 𝑎 𝑎 𝑎 If a combining mark should be applied to more than one character or to an expression. and imaginary unit. these quantities can be rendered as math italic. To perform this translation. U+00B3. . natural exponent. that character or expression should be enclosed in parentheses and followed by the combining mark. The combining marks are treated by a mathematics renderer as operators that translate into special accent built-up functions with the proper spacing for mathematical variables. Since numeric subscripts and super- Unicode Technical Note 28 17 . Furthermore the D and d start a differential expression and should have appropriate spacing for differentials. e.

In either punctuation case the comma is displayed with a small space following it.g. For this if the period is followed by an ASCII digit and 1) is at the start of a math zone. When it follows a variable. Comma: when surrounded by ASCII digits render with ordinary text spacing. the comma is rendered as a clause separator (a relatively large space follows the comma).13 Concatenation Operators A i o n nr ln r c o” because they l g s a pso named r oat e e p re r m ee n a a r “a t i a ct o n t oi s o are concatenated with their surrounding text in built-up form. Period: when surrounded by ASCII digits render with ordinary text spacing. 𝑎 be converted into a su. 3. i ha e t it e ’ li n s en s w tg t o rb a r a e nt nt h sw w l dard built-up scripts in built-up format and the Unicode scripts in linear format... then the period should be classified as a decimal point. 𝑎 display ′ ^𝑐 should ′ 𝑐 as 𝑎 both the prime and the c are in the same superscript argument. period.14 Comma. the user rarely needs to add explicit space characters. where 3. use numbers like . 3. and Colon The comma. e. An extended decimal-point heuristic useful in calculator scenarios allows one to omit a leading 0.Unicode Nearly Plain Text Encoding of Mathematics scripts are very common in mathematics.15 along with the mathematical spacing tables of the font.. Else treat as punctuation with or without an ASCII blank following it. or 3) follows any operator except for closers and punctuation. e. it should perscript function with a as the base and the prime as the superscripo ti .5. mI t ’ s a l s portant to merge the prime into a superscript that follows.g. In either punctuation case the period is displayed with a small space following it. Since the spacing around operators is well-defined in this way. e. If two spaces follow. . The prime U+2032 and related multiple prime characters should also be treated as superscript operators. 2) follows a built-up math object start character or end-of-argument character. a/. With this algorithm.g. In addition a concatenation operator has two effects: 1) it terminates whatever operand precedes it. and 2) it implies appropriate surrounding space as discussed in Sec.3 displays as 18 Unicode Technical Note 28 . The ASCII apostrophe can be used to in′ put the prime. Else treat as punctuation with or without an ASCII blank following it. and colon have context sensitve spacing requirements that can be represented in the linear format. No clause separator option exists for the period. Period. Display routines should use an appropriate glyph variant to render the superscripted prime.

In mathematical typography.15 Space Characters Unicode contains numerous space characters with various widths and properties. 2. 3.3. Unlike the ASCII space. If an ASCII space i d m “ s l ui pm ” inf” (U+0020) were used a l u b f f fi l u t u tm d i h n e ” il e c r . “o mn rn ri “ n a b o smc t e f u io h i t p nn e n h ” f.Unicode Nearly Plain Text Encoding of Mathematics 𝑎 . sh no c c s e tu e u l (U+200B) to be treated as an operand character and not to cause build up of the preceding operator.3 Colon: <space> ‘ :’ as Unicode RATIO U+2236 with relationis displayed al spacing. The no-break space is also treated as an operand character so that linear format combinations like “a l n can be recognized as single operands. spaces act as concatenation operators and cause build up of higherp erems r -width space r o t dB e e e p h eu f z c ea t i fr e rt h t o o d a p e’ t e t r . 205F. md “t t i pu li ht lp io e w y o ”. the other spaces are not removed on build up. Various space widths are defined in the following table. r g e o” t p “ rf h al l t a t m i i u Unicode Technical Note 28 19 . These characters can be useful in tweaking the spacing in mathematical expressions. Spaces of interest include the no-break space (U+00A0) and the spaces U+2000— 202F. ‘ a leading space is displayed as itself with punctua:’ without tion spacing. which is removed when causing build up as discussed in Sec. the widths of spaces are usually given in integral multiples of an eighteenth of an em. which includes the corresponding MathML names having these widths by default Space 0 em 1/18 em 2/18 em 3/18 em 4/18 em 5/18 em 6/18 em 7/18 em 9/18 em 18/18 em Unicode U+200B U+200A U+200A U+200A U+2006 U+205F U+2005 U+2004 U+2004 U+200A U+2002 U+2003 U+00A0 MathML name zero-width space veryverythinmathspace verythinmathspace thinmathspace mediummathspace thickmathspace verythickmathspace veryverythickmathspace ensp emsp no-break space \ensp \emsp \nbsp \thinsp \medsp \thicksp \vthicksp Autocorrect \zwsp \hairsp In general. U+200B. The em space is given by U+2003. The no-break space (U+00A0) is used when two words need to be separated by a blank. but remain on the same line together.

). For example. \zwsp) is handy for use as a null argument.16 Ordinary Text Inside Math Zones Sometimes one wants ordinary text inside a function argument or in a math zone as in the formula distance rate = time . the the binary property in this context. To prevent this kerning. while the + has + 𝑏 . not math text. in the expression 𝑎 letters a and b have the ord (ordinary) property. nary operators. subformulas (e. which then displays unkerned as 𝒱 .g. 170 of The TeXbook and Appendix F of the MathML 2. the text can be enclosed inside ASCII double quotes. To embed such text inside functions or in general in a math zone.0 specification) ord 0 0 4 5 0 0 3 unary 0 0 4 5 0 0 3 binary 4 4 0 0 0 4 0 rel 5 0 0 0 0 5 3 open 0 0 4 5 0 0 3 close 0 0 0 0 0 0 3 punct 0 0 0 0 0 0 3 ord unary binary rel open close punct For the combinations described by this simple table. The zero-width space (U+200B. For example. expression with an over brace). most spacing is automatically implied by the properties of the characters. Accordingly for the text level there is 4/18th em between the a and the + and between the + and the b. all script-level spacings are 0. clause separators. A more table could include properties like math functions (trigonometric functions. So the formula above would read in linear format as 20 Unicode Technical Note 28 . the expression 𝒱 the subscript 𝑎 shows 𝑏 automatically kerned in 𝑎 𝑏 under the overhang of the 𝒱 . ellipsis. one can insert a \zwsp before the subscript. etc.g. Similarly there is 5/18th em between the = and the surrounding letters in the equation 𝑎 = 𝑏 complete . binary with no spacing (e. but a more complete table would have some nonzero values. For such cases. factorial. differentials. and invisible function apply.. /). 𝑎 𝑏 3. The following table shows examples of how many 1/18ths of an em size are automatically inserted between a character with the row property followed by a character with the column property for text-level expressions (see also p. tall delimiters.Unicode Nearly Plain Text Encoding of Mathematics In math zones.. the alphabetic characters should not be converted to math alphabetic characters and the typography should be that of ordinary text.

T “r xeAb = ½𝑒 o cd t i C g.c. In the linear format. Instead if you paste a math object or text into an ordinary text region. Another example is 𝑖 𝜃 sin 𝜃+ c. The elimination of outermost parentheses for arguments of fractions. i. subscripts.e l z p l e O tl e el th b d i c de c ter them. six special cases are defined as in the following table Autocorrect \phantom \hphantom \vphantom \smash \asmash \dsmash LF op ⟡ U+27E1 U+2B04 ⇳ U+21F3 ⬈ U+2B0D ⬆ U+2B06 ⬇ U+2B07 Op name width ascent descent ink white concave-sided diamond w a d no white left-right arrow w 0 0 no white up-down arrow a d 0 no black up-down arrow w 0 0 yes black up arrow w d 0 yes black down arrow w a 0 yes The general case is given by \phantom(n&<operand>). you split the region into two such regions with the math object and/or text in between. a n id t” r c t o ta y l h u hs t s Sl e oe c n e wI o e q h e e a rh s ue c i n i a p or ’ t do v a tw s a t s s e e i wl h w o a s s i i e im f. Alternatively ordinary text inside a math zone can be specified using a character-format property.17 Phantoms and Smashes Sometimes one wants to obtain horizontal and/or vertical spacings that differ from the normal values. Unicode Technical Note 28 21 .18 Arbitrary Groupings The left/right white lenticular brackets〖 and 〗(U+3016 and U+3017) can be used to delimit an arbitrary expression without displaying these brackets on build up. 3. In [La]TeX this can be accomplished using phantoms to introduce extra space or smashes to zero out space. Note that no math object or math text can be nested inside an ordinary text region.Unicode Nearly Plain Text Encoding of Mathematics "rate"="distance"/"time". If you want to include a double quote inside such text. where n is any combination of the following flags: fPhantomShow fPhantomZeroWidth fPhantomZeroAscent fPhantomZeroDescent fPhantomTransparent 1 2 4 8 16 3. This property is exported to plain text started and ended with the ASCII double quote. t I e . insert \".

21 Equation Numbers Equation numbers are often used with equations presented in display mode. For example █ (E=mc^2#(30)) or more simply just E=mc^2#(30) renders as 2 (30) 𝐸 = 𝑚 𝑐 22 Unicode Technical Note 28 .o t n m sr iHn h d a. r aim including a possible l It r n a a f o a ’t y a f gt h m n tr i z oy ha n o d pep a n eap s h e terminating period. with an implied spacer at the start of the line. The linear format uses U+23A8 to start a math zone and U+23AC to end it. 3. these brackets should be displayed using a math font rather than an East Asian font.19 Equation Arrays To align one equation relative to another vertically. the user can execute a command to build up math zones defined by these math-zone delimiters. i (e a n Tr useful convention for syse es mX s o’ $ d $$ e s ) u tems that mark math zones is that a paragraph consisting of a math zone is in disp . When importing plain text. and would not ordinarily be used in documents. enter the equation followed by a # (U+0023) followed by the desired equation number text. To represent an equation number flushed right of the equation in the linear format. such as 10𝑥 2 + 3𝑦 = 3𝑥 4 + 13𝑦 = which has the linear format █ (10&x+&3&y=2@3&x+&13&y=4). they are nb i w r o n s ol f s af e i t . They are designed for communication between display drivers and display hardware when building up large curly braces. respectively.20 Math Zones Section 5 discusses heuristic methods to identify the start and end of math zones in plain text. 3. 3. While the approaches given are surprisingly successful. where █ is U+2588. then inline rendering is used. These are the left and right curly bracket middle pieces. but the white lenticular brackets can handle any remaining cases. Note that although t w h a e y r t e o ’ s n o specify display versus inls v . one can use an equation array. Here the meaning of the ampersands alternate between align and spacer. So every odd & is an alignment point and every even & is a place where space may be added to align the equations.Unicode Nearly Plain Text Encoding of Mathematics and superscripts solves such grouping problems nicely in most cases.a n ee e etib f n k s nh ’l a c n t d z dt l e oa l i e t oe t e s o preserve this information in the linear format. Note that in math zones. This convention is used in AmSTeX.

1 Table 3.i ! e ac .1 List of the most common operators ordered by increasing precedence CR ( [ { |├〖 ) ] } | ┤〗 &│ Space “ * .14 gives a generalization of this last case). While such notions are very familiar to mathematically oriented people.2 and 3. 2-3. period and comma are treated as operators with the same precedence as plus. To reveal which characters are operators. 3. Operator Summary.1. Most notably. Operands in subscripts.Unicode Nearly Plain Text Encoding of Mathematics 3. the space (U+0020) is an important operator in the plain-text encoding of mathematics since it can be used to terminate operands as discussed in Sec. A small but common list of operators is given in Table 3. 2) the bracket characters described in Sec. some of the symbols that we define as operators might surprise one at first. etc. operator-aware editors could be instructed to display operators with a different color or some other attribute. . In other contexts.1— 3. operands include bracketed expressions and mixtures of such expressions and other operand characters. Unicode minus (U+2212) or plus if either starts a sub/superscript operand. fractions. roots.22 Linear Format Characters and Operands The linear format divides the roughly 100. Operand characters include some nonalp i r i )t ( d hc s n . are defined in part in terms of operators and operator precedence.000 assigned Unicode characters into three categories: 1) operand characters such as alphanumerics. superscripts. f e n id nh s i x p pb ua u n c o ry mr c i l i e a ec a (m n c n rt s ∞ a t e a ht a e y o) f operand. ad m ’n S l n a aed Cl d n is dI pd f u b o e c tr y r ASCII) digits r oh r Af i meo o yu r eI ( u -width (Sec.2— 2. and 3) other operator characters such as those described in Secs. 2. 3. In addition. Hence f(x) can be an operand. More specific definitions of operands are given in the linear-format syntax of Appendix A. boxes.19. = −• + × · ▒ / ¦ ∫ ∑ ∏ _^ □▭ ▁ √ √∙ ¯ Combining marks Unicode Technical Note 28 23 .

Sub/superscript expressions are terminated. So typically the first appearance is considered to be an open |. exp1¦exp2 base^exp1 base_exp1 base_exp1^exp2 (_exp1^exp2)base base┴exp1 24 Unicode Technical Note 28 . Similarly the plain-text expression α + β / γ means 𝛽 𝛽 𝛼 + 𝛼 + not 𝛾 𝛾 Precedence can be overruled using parentheses.5. Numerator and denominator expressions are terminated by operators such as /*]) and blank (can be overruled by enclosing in parentheses). Prescript the subscript exp1 and superscript exp2 to the base base. Sub/superscript operators associate right to left. Subscript expression exp1 to the base base. The subscripts 0 –exist as Unicode sym9+-() bols. We tried using the ASCII U+007C for this too. which streamlines the interpretation of operands. so (α + the latter. not 2. 3. exp1/exp2 Create a built-up fraction with numerator exp1 and denominator exp2. The vertical bar appearing on the same level as & is considered to be a vertical bar separator and is given by the box drawings light vertical character (U+2502).( ) exist as Unicode symbols. Subscript expression exp1 and superscript expression exp2 to the base base. Similar to fraction. β )/γ gives The following gives a list of the syntax for a variety of mathematical constructs (see Appendix A for a more complete grammar). but the resulting ambiguities were insurmountable. The operators are grouped above in order of increasing precedence. in arithmetic. the next an open |. operators have precedence. Sometimes called a stack. but no fraction bar is displayed. For example. with equal precedence values on the same line. As in arithmetic. The choice is disambiguated by the evenness of its count at any given bracket nesting level or other considerations (see Sec. Above/below script operators associate right to left. by /*]) and blank. The subscripts 0 – 9 + . Display expression exp1 centered above the base base. 3+1/2 = 3.Unicode Nearly Plain Text Encoding of Mathematics where CR = U+000D. The super9+-() scripts 0 –exist as Unicode symbols. Superscript expression exp1 to the base base. Note that the ASCII vertical bar | (U+007C) shows up both as an opening bracket and as a closing bracket. and so forth.1). for example. the next a close |.

∏ 1^exp2▒ Product from exp1 to exp2 with multiplicand exp3.. 2-3. a rp t s t it s’t x s l n p he n te rands may be used to space formulas in addition to the built-in spacing provided by a math display engine. for example. expn-1 [& expn] … ]) Note that U plethora of mathematical operators2 fill out the capabilities of n i c o d e ’ s the approach in representing mathematical expressions in the linear format. See Sec. Surround exp1 with built-up brackets. For such operands. but may need to be overruled. Blanks are discussed in greater detail in Sec. Overbar above exp1. exp1) that would otherwise combine with the following operand. Similarly for { }. These Unicode Technical Note 28 25 . Similarly for { } and ( ). ∫_exp1^exp2▒ Integral from exp1 to exp2 with integrand exp3. _exp1 and exp3 ^exp2 are optional. ∑_exp1^exp2▒ Summation from exp1 to exp2 with summand exp3. Cube root of exp1. Abstract box around exp1.Unicode Nearly Plain Text Encoding of Mathematics base┬exp1 [exp1] [exp1]^exp2 □exp1 ▭ exp1 ▁ exp1 ¯ exp1 √exp1 ∙ exp1 √ exp1 √(exp1&exp2) Display expression exp1 centered below the base base. To terminate an operand (shown above as. _exp1 and _exp exp3 ^exp2 are optional. Fourth root of exp1. not the ASCII underline character U+005F). insert a blank (U+0020). Surround exp1 with built-up brackets followed by superscripted exp2 (moved up high enough). (exp1 [& exp2] … exp1 over exp n-1. Rectangle around exp1. exp1th root of exp2. to build up an array (see Appendix [@ Align … A for complete syntax). _exp1 and ^exp2 exp3 are optional.1 for generalizations. This blank d hh ps B tmeo oe r b l d i e wn e ua on s uhsi kna n peo u t t o o we i. To form a compound operand. the outermost parentheses are removed. Square root of exp1. parentheses can be used as described for the fraction above. | |. etc. 3. Precedence simplifies the text representing formulas. Underbar under exp1 (underbar operator is U+2581. ( ).

while explicit multiplication by asterisk and raised dot has a precedence equal to that of plus.23 Math Features Not In Linear Format A number of math features have been reserved for a higher level instead of being included in the linear format. but in math zones the minus sign and prime can be entered using these ASCII counterparts. 3.g. The ASCII math symbols are easy to find. Note that a particularly valuable use of the nearly plain-text math format in general is for inputting formulas into technical documents or programs. 3. e. Math features currently missing from the linear format include: Arbitrary position tweaks of built-up functions and their arguments (spacing tweaks and phantoms are included: see Secs. Hence to obtain a full featured mathematical representation with the linear format requires that the linear format be embedded in an appropriate rich-text environment. + .1 ASCII Character Translations From syntax and typographical points of view. it differs occasionally from the latter. Note that for proper typography. The primes in most fonts are chosen to look approximately 26 Unicode Technical Note 28 .Unicode Nearly Plain Text Encoding of Mathematics operands occur for fraction numerators and denominators. So even though the analysis is similar to that for arithmetic expressions. and arguments of functions like square root. To handle these cases and to provide convenient entry of many other symbols. the prime should have a large glyph variant that when superscripted looks correct. This compromise is made for the sake of readability of the linear format.15 and 3. but often need to be used as themselves. it is useful to give some discussion of input methods. In contrast. the backslash (\).17) User-defined soft breaks (U+00AD and U+000B?) 4./ * [ ] ( ) { }. Input Methods In view of the large number of characters used in mathematics. subscript and superscript expressions. the direct input of tagged formats like MathML is very cumbersome. one can use an escape character. followed by the desired operator or its autocorrect name. In addition rich-text properties are missing such as text and background color and font characteristics other than the standard Unicode math styles. the Unicode minus sign (U+2212) is displayed instead of the ASCII hyphen-minus (U+002D) and the prime (U+2032) is used instead of the ASCII apostrophe (U+0027).. 4. Parentheses appearing in other contexts are always displayed in built-up format. A curious aspect of the notation is that implied multiplication by juxtaposing two variable letters has very high precedence (just below that of diacritics).

Unicode Nearly Plain Text Encoding of Mathematics le u nern er i r t ’ td p t k s t p es l o e c h r dz c m a ie o e am e s t d is d n e u .2 Math Keyboards A special math shift facility for keyboard entry could bring up proper math symbols. since expressions like x < h a op u i l n d t n o ’ t − b are common. such as sub/superscript digits using the left Alt key.e.o i a a n d ! it min o e = ne asf t a icn a t t ie e m g sf n t a c torial. Left Alt CapsLock could lock into the Unicode Technical Note 28 27 . As such they are not e t “ a s n i ” n should ” a italicized and are rendered with normal typography.. Ctrl+z. Also <. but when used as mathematical variables. Accordingly a user might want to make italic the default alphabet in a math context. reserving the right to overrule this default when necessary. you can undo the transon n automatic ’ t i lation by typing. If yl ok ue da translation when entering math. A more elegant approach in math zones is to translate letters deemed to be standalone to the appropriate math alphabetic characters (in the range U+1D400– or in the Letterlike Block U+1D7FF U+2100— Letter combinations corresponding to standard function names U+213F). 4.sm←. the left Alt key could access the most common mathematical characters and Greek letters. For example. Suffice it to say that intelligent input algorithms can dramatically simplify the entry of mathematical symbols and expressions. for example. i. Other post-entry enhancements include mappings like !! +-+ :: := <= >= << >> /= ~= -> ‼ U+203C ± U+00B1 ∓ U+2213 ∷ U+2237 ≔ U+2254 ≤ U+2264 ≥ U+2265 ≪ U+226A ≫ U+226B ≠ U+2260 ≅ U+2245 → U+2192 I oone u t t ct o mo! s h o ’ d a≠s d m e s i p. and the right Ctrl key could access script characters and other mathematical symbols. such letters are traditionally italicized in print. not mathematical typography. the right Alt key could access italic characters plus a variety of arrows. ln i d k “ be represented by ASCII alphabetics. Similarly it is easier to type ASCII letters than italic letters. The numeric keypad offers locations for a variety of symbols. od i p b t h ia r y v p e ee t g well with other superscripts. The values chosen can be displayed on an on-screen keyboard.

select the desired code so that tddh al h ha h iea r de en e g ia ne c c p h ae i ned r e lr n t T c e x c e do o c a r ’ i . This approach is noticeably faster than using menus and is particularly attractive to those with some familiarity with TeX. or directly into applications. which is the highest character in the 17 planes of Unicode.3 Hexadecimal Input A handy hex-to-Unicode entry method can be used to insert Unicode characters in general an hil l e c d a p. that is. Toggling back to the hex code is v lih a t i ni e f g t t es ’ r o oa r gl m y r u c il da u f t hsy ok si wa i h ee eg fu a c h t t ur n rf e p s t e f clear or for looking up the character properties in the Unicode Standard.e e nc c t m t sc u d range up to the value 0x10FFFF.Unicode Nearly Plain Text Encoding of Mathematics left-Alt symbol set. 4. typing \alpha inserts α the appropriate autocorrect entry if is present. The Alt+x is a toggle. several orders of magnitude more than Unicode can handle! 4. Toolbars. etc. making corrections as need be. that is. 4. Pretty soon one realizes that this approach rapidly approaches literally billions of combinations. Rightmouse menus are quite useful since they provide easy access to context-sensitive options. Right-Mouse Menus Pull-down menus and toolbars are popular methods for handling large character sets. If the hex code is preceded by one or more hexadecimal digits. On-screen keyboards and symbol boxes are valuable for entry of mathematical expressions and of Unicode text in general. Other possibilities involve the NumLock and ScrollLock keys in combinations with the left/right Ctrl/Alt keys. 28 Unicode Technical Note 28 . This approach yields what one might call a “ sticky” shift. type it once to convert a hex code to a character and type it again to convert the character back to a hex code. o a r m a aB n cs a tra e h tet s t a h r ci y r c na l s t r c i c pa su a ry e ’ hexadecimal code (in ASCII). The hexadecimal code is replaced by the corresponding Unicode character.5 Macros The autocorrect and keyboard macro features of some word processing systems provide other ways of entering mathematical characters for people familiar with TeX.4 Pull-Down Menus. which is an array of symbols either chosen by the user or displaying the characters in a font. and then types Alt+x. A related approach is the symbol box. Symbols in symbol boxes can be dragged and dropped onto key combinations on the on-screen keyboard(s). For example. but they tend to be slower than keyboard approaches if you know the right keys to type. Multiple tabs can organize the symbol selections according to subject matter.

These are Unicode Technical Note 28 29 . such as \a for α . This section describes some of the heuristics that can distill the structure out of the plain text.7 Handwritten Input Particularly for PDAs and Tablet PCs. Recognizing Mathematical Expressions Plain-text linearly formatted mis b s aai e i tl n us h e ss f ex c d r mp a “ a en a t s c o r eo ” simple documentation purposes.a s a t ib tl ’ oho g e u n mt c l h z -up mathematical expressions directly. wcs c s wo i oa p si r pse s r lan r dt t fo d e o e be r e r b a s y l a e oi c i pi t r gt h r a e a ni . 4. handwritten input is attractive provided t r g biu dFph i n l he r t e t ie er i h hn z t r’ t i ag e o t hn s n rr d h ag a d e i e e n. This also allows localization of the control word list.6 Linear Format Math Autocorrect List The linear format math autocorrect list includes most of those defined in Appendix F of The TeXbook. plus a number of others useful for inputting the linear format as shown in the following table Control word \int \prod \funcapply \rect \open \above \underbar \underbrace \begin \phantom \hphantom \asmash \matrix Character Control word ∫) ( \sum U + 2 2 2 B ∏F ( \naryand U + 2 2 0 ) (U+2061) \of ▭ (U+25AD) \box ├ (U+251C) \close ┴ (U+2534) \below ▁ (U+2581) \overbar ︸(U+23DF) \overbrace 〖 (U+3016) \end ⟡ (U+27E1) \smash (U+2B04) \vphantom ⬆ (U+2B06) \dsmash ■ (U+25A0) \eqarray Character ∑1 ( U + 2 2 1 ) ▒ (U+2592) ▒ (U+2592) □ (U+25A1) ┤ (U+2524) ┬ (U+252C) ¯ (U+00AF) ︷(U+23DE) 〗 (U+3017) ⬈ (U+2B0D) ⇳ (U+21F3) ⬇ (U+2B07) █ (U+2588) Appendix B contains a default set of keywords containing both The TeXbook keywords and the linear-format keywords Users can define their own control words for convenience or preference. Use in more elegant documentation and in programming languages requires knowledge of the underlying mathematical structure. which requires less typing than the official TeX control word \alpha. like \alpha for α . Note that if explicit math-zone-on and math-zone-off characters are desired.20 specifies that U+23A8 starts a math zone and U+23AC ends it. 5.Unicode Nearly Plain Text Encoding of Mathematics 4. Sec. 2.

Unicode Nearly Plain Text Encoding of Mathematics the left and right curly bracket middle pieces. but that are not e iI ebe many error messages n tf s y o ch o o m n l in u i g on e ts e s $ l at t e ’e $ k s dsa w. and as such should be italicized in print. obviating the need to declare them explicitly as such. If they are not. Many mathematical expressions identify themselves as mathematical. identify the characters immediately surrounding them as parts of math expressions. Resyncing is automatic. but they lead to the most popular choices. Special commands discussed at the end of this section can be used to overrule these choices. the symbol combining marks (U+20D0 – the math alphanumerics U+20FF). symbols in the range U+2200 through (U+2044) U+22FF. (U+1D400 – U+1D7FF). If a letter pair fails to appear in a list of common English and European two-letter words. one can still have some recognition heuristics as well as the opportunity to italicize appropriate variables. respectively. It is possible to use a number of heuristics for identifying mathematical expressions and treating them accordingly. The basic idea is that math characters identify themselves as such and potentially identify their surrounding characters as math characters as well. Specifically ASCII letter pairs surrounded by whitespace are often mathematical expressions. while in TeX one basically has to start up again from the omission in question. A math style would connect in a straightforward way to appropriate MathML tags. An advantage of recognizing mathematical expressions without math-on and math-off syntax is that it is much more tolerant to user errors of this sort. Furthermore. and in general. into that form. The user could then override cases that were tagged incorrectly. the fraction ⁄ and ASCII slashes. they are considered parts of math expressions. Unicode characters with the mathematics property. e because TeX interprets subsequent text in the wrong mode. They are designed to be used by display programs in building up large curly braces. it is treated as a mathematical expression and italicized. Many Unicode characters are not mathematical in nature and suggest that their neighbors are not parts of mathematical expressions. Ultimately the approach could be used as an autoformat style wizard that tags expressions with a rich-text math style whose state is revealed to the user by a toolbar button. Strings of characters containing no whitespace but containing one or more unambiguous mathematical characters are generally treated as mathematical expres30 Unicode Technical Note 28 . v a . These heuristics are not foolproof. One well-known TeX problem is T it ele Xtc ’ t s o i d n e a b i t y expressions that are clearly mathematical. For example. If Latin letter mathematical variables are already given in one of the math alphabets. namely in recognizing and converting the mathematical literature that is not yet available in an object-oriented machine-readable form. and would not ordinarily be used in documents. this approach could be useful in an important related endeavor.

often used as subscripts (see the program in Sec.. r e g e t t i ng In general. modern computer languages are From this badly lacking. Similarly. we start with our Unicode linear format.h I reexamining t ’ s w o this premise. and streamlined documentation. e. mt e e r s nc p k a nin a Beraal dsttb *s aFsaal * r so bk r s k e a n ’ (s a ia sd na ’b t e r s iaak. even when they clearly appear inside mathematical expressions. The literal operator introduced earlier (\) causes the operator that follows it to be treated as an nonbuildup operator. reduced coding times. Using the Linear Format in Programming Languages Id. Certain two-. as well as ln. Arithmetic expressions in Fortran and other current high-level languages still do not look like mathematical formulas and considerable human coding effort is needed to translate formulas into th r creF s t h i e otxou en h us arp ie e n Fpar r cn t ol ’ r m o s e r . cosh.Unicode Nearly Plain Text Encoding of Mathematics sions. and four-letter words inside such expressions should not be italicized. mathematical expressions that the algorithms treat as ordinary text can be sandwiched between math-on and math-off symbols or by an ordinary text attribute if they need to be embedded in the math zone.s i ami r c pl p b a e . Whereas true mathematical notation clearly used to be beyond the capabilities of machine recognition. t c sd r r ’ e a p u )e 7“ trand Russell once said a good notation has a subtlety and suggestiveness which at t e ae ad t w a i im a han o m t o le p tu e s si r ea l s e t e afi b m e l t n cn e am i e k l k c v … rtd eo o substitute for thought. such as in documenting the syntax itself. etc. 6. we’ a lot closer now. In studying computers we have been taught that this ideal is unattainable. also should not be italicized. Unicode Technical Note 28 31 . Although ultimately it would be desirable to be able to teach computers how to understand all mathematical expressions.” point of view. three-. mathematics has a very wide variety of notations. none of which look like the arithmetic expressions of programming languages. Words or abbreviations. but they only went part way.g. r R d pu n lt s A t u a t 1eo N h tg h 9 a F n ie e e 5u O ar r m 0t R m c l i ’ o T e mn ds eh f h e oa g after FORmula TRANslation. Using real mathematical expressions in computer programs would be far superior in terms of readability. These include trigonometric function names like sin and cos. and that one must be content with the arithmetic expression as it is or some other non-mathematical notation such as TeX’r st . program maintenance. in the numerator of a fraction. This allows the printing of characters without modification that by default are considered to be mathematical and thereby subject to a changed display. 6). Special cases will always be needed.

Such machine comprehension would greatly facilitate future computations as well as the conversion of the existing paper literature and hand written input into machine usable form. they can be stored in pure-ASCII program files accepted by standard compilers and symbolic manipulation programs like Maple. instead of the asterisk. The concept is portable to any environment that supports Unicode. this enlightened choice of notation can immediately change nearly useless or nonexistent documentation into excellent documentation. script. 3) In addition to providing useful tools for the present. and Macsyma.1 Advantages of Linear Format in Programs In raw form. Note that this is a goal of MathML as well. Two levels of implementation are envisaged: scalar and vector. 2) The use of the same notation in programs and the associated journal articles and books leads to an unprecedented level of self documentation. italics. which is a legitimate mathematical 32 Unicode Technical Note 28 .Unicode Nearly Plain Text Encoding of Mathematics 6. and it takes advantage of the fact that high-l a a nn e g n a“ v e d ce e s F cs l l oec l i rpa a k tt e n e r a” g C a u p c( ”vc ds st h“ . they can be printed or displayed in traditional built-up form. To keep auxiliary code to a minimum. With use of the heuristics described above. but attained in a relatively complex way using specialized tools. ea t x m a ”r ln o t b r a s)b ae o a n p te cn l c d eh uc d s t “ t t es d e e $i r _ ey ca se e e ys in a fashion similar to TeX. these expressions look very like traditional mathematical expressions. either numerically or analytically. This dramatically reduces coding time and errors. In addition. the vector implementation requires an object-oriented language such as C++. The advantages of using the plain-text linear format are at least threefold: 1) many formulas in document files can be programmed simply by copying them into a program file and inserting appropriate multiplication dots. since many programmers document their programs poorly or not at all. The scalar multiply operator is represented by a raised dot. a legitimate mathematical symbol. Hence formulas can be at once printable in manuscripts and computable. Scalar operations can be performed on traditional compilers such as those for C and Fortran. On disk. the built-in C preprocessor allows niceties such as aliasing the asterisk ith a raised dot. and various mathematical symbols like the square root. The translation between Unicode symbols and the ASCII names needed by ASCII-based compilers and symbolic manipulation programs can be carried out via table-lookup (on writing to disk) and hashing (on reading from disk) techniques. The idea here is that regular programming languages can have expressions containing standard arithmetic operations and special characters. these proposed initial steps should help us figure out how to accomplish the ultimate goal of teaching computers to understand and use arbitrary mathematical expressions. In fact. Mathematica. such as Greek.

alphacoh = 0. The Java and C# languages allow direct use of Unicode variable names. Delta).01) alphacoh = -half*alpha0*I2*pow(gamma/gammap. } alpha1 = alphainc + alphacoh. betap2 = upsilon*(upsilon + gamma*I2sF). 6. 3). beta = sqrt(betap2).2 Comparison of Programming Notations To get an idea as to the differences between the standard way of programming mathematical formulas and the proposed way. else { Gamma = 1/T1 + gamma1.(1+gamma/gammap)*(gammap – upsilon)/ (gammap + upsilon)). alphainc = alpha0*(1-(gamma*gamma*I2/gammap)/(gammap + upsilon)). upsilon = cmplx(gamma+gamma1. I2sF = (I2/T1)/cmplx(Gamma. Delta). } Unicode Technical Note 28 33 . if (!gamma1 && fabs(Delta*T1) < 0. which is a major step in the right direction.5*gamma*alpha0*(I2sF*(gamma + upsilon) /(gammap*gammap – betap2)) *((1+gamma/beta)*(beta – upsilon)/(beta + upsilon) . compare the following versions of a C++ routine entitled IHBMWM (inhomogeneously broadened multiwave mixing) void IHBMWM(void) { gammap = gamma*sqrt(1 + I2).Unicode Nearly Plain Text Encoding of Mathematics symbol for multiplication. Compatibility with unenlightened ASCII-only compilers can be done via an ASCII representation of Unicode characters.

vectors can be m t s ro uhy uo i ou ce o lgn st t r n t e g se s e e i t d.01) 1) < else { 3 𝛼 • 𝛾 . ’ a p h oa pbs s l e t d o tn t i r . 0 •’ 2• Γ 𝛾 = 1/𝑇 1 + 1. In built-up form. but C++ does impose some serious restrictions based on its limited operator table.5 •pow(𝛾 _coh = −𝐼 /𝛾 𝛼 . e uc d n ru o ed tl risk to overload in C++. 2ℱ 2 /𝑇 • 2 𝛽 + 𝛾 = 𝜐• •𝐼 (𝜐 ).r . 2ℱ 2 𝛽 = 𝛽 . = 𝛾 • + 𝛾 Δ 1 + 𝑖 𝛼0 • 𝛾 ’ _inc = 𝛼 • (1 −𝐼+ 𝜐 (𝛾 ’ • )/(𝛾 2 /𝛾 )). 𝐼 1 )/(Γ = (𝐼+ 𝑖 Δ ). inc = 𝛼 𝛾 0 ’ if (! 𝛾 • 𝑇 1 || fabs(Δ 0.5 𝛼 /𝛾 coh = −𝐼 . For example. 1 coh = . namely void IHBMWM(void) { 𝛾 + 𝐼 = 𝛾2 .5 • ’ 0• ′ 𝛾 • ’ 𝛽 𝛽 +𝜐 𝛾 ’ +𝜐 𝐼 ℱ 𝛾 +𝜐 𝛾 𝛽 − 𝜐 𝛾𝜐 𝛾 ’ − 34 Unicode Technical Note 28 . =Γ 2ℱ +𝑖 • Δ 2 𝛽 + 𝛾 = 𝜐• •𝐼 (𝜐 ).Unicode Nearly Plain Text Encoding of Mathematics void IHBMWM(void) { 𝛾 + 𝐼 = 𝛾 2 ). 𝛽 = 𝛽 2 𝛼 𝛾 𝛾 1 + 𝛽 − • •− 𝛼 2 • + 𝛾 . 3). (1 )• ’ (𝛾 + ’ } 𝛼 𝛼 1 = 𝛼 . if (! 𝛾 • 𝑇 1 || fabs(Δ 0. 𝐼 2 /𝑇 𝐼1 . 2 𝛼 𝛾 ℱ ’ = .5 • 2 (𝛾 • • + 𝜐 − 𝛼 (𝐼 𝛾 ’ 𝛽 coh 0 • )/(𝛾 )) × ((1 + 𝛾− 𝜐𝛾 − 𝜐 /𝛽𝜐) − ’ )/(𝛾 )• + (𝛽 )/(𝛽 + /𝛾 𝜐)). 2ℱ 2. inc + coh } The above function runs fine with current C++ compilers. • 1 𝜐 + 𝑖 = 𝛾 • + 𝛾 Δ . the function looks even more like mathematics. 1 𝛾 • 𝛾 • 𝐼 ’ 2 /𝛾 𝛼•+𝜐 1 −. • (1 𝜐 .01) 1) < 𝛼 . 0• 2 •’ else { 𝛤 𝛾 = 1/𝑇 1 + 1.

and the output is displayed in standard mathematical notation. So some refinement of the ways that comments are handled would be very desirable. To code a formula. Questions remain such as to whether subscript expressions in the Unicode plain text should be treated as part of program-variable names. at least. it takes time and is restricted to a binary subset of the rationals with limited (although usually adequate) precision. or whether they should be translated to subscript expressions in the target programming language. With current programs. just as adjacent ASCII letters are part of a variable name in current programming languages. Perhaps intelligent algorithms will be developed that decide when multiplication should be performed and insert the asterisks optimally. the C++ function on the preceding page compiles without modification. and large parentheses. complete with fraction bars. For example. This same syntax can be used in the linear format. while numerically. but has a modified set of requirements. comments are distilled out with distinct syntax. it would be straightforward to automatically insert an asterisk (indicating multiplication) between adjacent symbols. inserts appropriate raised dots for multiplication and compiles. Similarly. it seems wiser to consider adjacent symbols as part of a single variable name. multiplication is infinitely precise and infinitely fast. compiler comment syntax is not particularly pretty. 5 and 8 of Meystre and Sargent9]. No change of variable names is needed. inc + coh } The ability to use the second and third versions of the function was built into the Word Processor8 circa 1988.Unicode Nearly Plain Text Encoding of Mathematics } 𝛼 𝛼 1 = 𝛼 . However here there is a major difference between mathematics and computation: symbolically. pastes it into a program file. In this connection. With it we already came much closer to true formula translation on input. 6. square roots. The code appears nearly the same as the formulas in print [see Chaps. Call that 70% of true formula translation! In this way. Consequently for the moment. rather than have the user do it. ruled boxes around comments and vertical dividing lines between code and comments are noticeably more readable.3 Export to TeX Export to TeX is similar to export to programming languages. although it is interesting to think about submitting a mathematical document to a preprocessor that can recognize and separate out programs for a compiler. one copies it from a technical document. it would be nice to have a vertical window-pane facility with synchronous window-pane scrolling and the ability to display C code in the left pane and the corresponding // comments in the right pane. Then if one wants to see the Unicode Technical Note 28 35 . Lines of code can be previewed in built-up format.

An effective use of the heuristics would be by an autoformatting wizard that delimits what it thinks are math zones with on/off codes or a cha36 Unicode Technical Note 28 . On the other hand. A simple operand consists of a span of non-operators. TeX should be recompiled to use Unicode. The heuristics described above go a long way in determining what is mathematics and what is natural language. The heuristics given for recognizing mathematical expressions work well. mathematical expressions can usually be represented with a readable Unicode nearly plain-text format. and numerical computation. Conclusions We have shown how with a few additions to Unicode. The choices made in this paper favor efficient input of mathematical formulae. es u oau d a e t i . sufficient generality to support high-quality mathematical typography. and a format that resembles a real mathematical notation. The text consists of combinations of operators and operands. It would be nice to have dedicated Unicode characters for this purpose particularly when the math zones have been reliably determined. A variety of syntax choices could be used for a linear format. and identifiers. the export method consists of identifying the mathematical expres etsc l ss nh . as for the program translations. Accordingly. a definition that substantially reduces the number of parenthesis-override pairs and thereby increases the readability of the plain text. Alternatively one can use LaT open/close approach. md t yr t t h s se nh With TeX. but they are not infallible. Obviously compromises between these goals had to be made. operators have precedence values that control the association of operands with operators unless overruled by parentheses. In addition one needs to tag numbers. one widens the right pane accordingly. but we have repeatedly pointed out how ugly this solution is. including in technical document preparation particularly for input purposes. 7.nd o n do s T b d c d s dh u $i w cl n e in the plain text. the text surrounding the mathematics is part and parcel of the technm T i i th e i ee t s h e i c nXs t e s n a t n ’ gte c l . e X ’ s \[ … \] Export to MathML also requires knowing the start and end of a math zone.i s i ceT la o lm h sr n o i sm t s s n p ba a i $ e on nn ’ a e dg ey r lated to and from the standard TeX ASCII names via table lookup and hashing. Heuristics can be applied to Unicode plain text to recognize what parts of a document are mathematical expressions. which we call the linear format. The built-up functions all have corresponding MathML entities. symbolic manipulation. ac e m en n e . to view lines w c o e e ’e i h f /n t w t a c /t g a h r o c nt y m a d o ei.Unicode Nearly Plain Text Encoding of Mathematics comments. To simplify the notation. Better yet. the ability to round trip elegant mathematical text at least in a rich-text environment. This allows the Unicode plain text to be used in a variety of ways. operators.

José Oglesby. Once the math zones are identified unequivocally. Richard Lawrence. Sergey Genkin. The user could then overrule any incorrect choices. Isao Yamauchi. ’ Unicode math horizontal bracket Unicode integrals. Appendix A. Jason Rajtar. char space αC AI SI nASCII αa nt Mh α tr nh Oe α n diacritic opArray opClose opCloser opDecimal opHbracket opNary opOpen opOpener opOver opBuildup other diacriticbase ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← Unicode character ASCII space (U+0020) ASCII A-Z a-z ASCII 0-9 Unicode math alphanumeric (U+1D400 – with some U+1D7FF Letterlike symbols U+2102 – U+2134) Unicode alphanumeric not including α a nor nASCII nt Mh α a | α tr n t nh M h Oe Unicode combining mark ‘ & ’ | V T | ‘ ■’ ‘| )‘ ’⌍ |’ ‘ ] ’ | ‘ } ’ opClose | “ \c l o s e ” ‘ . and other nary ops ‘| (‘ ’⌌ |’ ‘ [ ’ | ‘ { ’ opOpen | “ \o p e n ” ‘” / ’ | “ \a t o p ‘’ ‘ ’ | opOpen | opClose | _ | □’opArray ’‘ ‘ |∙ | ‘’ ‘ ^| / ’ √’ | ’| ‘ √ |‘ | | opNary | opOver | opHbracket | opDecimal char – {α+ nASCII + diacritic + opBuildup + CR} n α| nASCII | n ‘ ( ’ exp ‘ ) ’ Unicode Technical Note 28 37 . 10.Unicode Nearly Plain Text Encoding of Mathematics racter-format attribute. Donald Knuth. compilers. Jennifer Michelstein. ’ | ‘ . summation. Victor Kozyrev. export to MathML. Barbara Beeton. Ron Whitney. Yuriko Rosnow. Andrei Burago. Yi Zhang. Linear Format Grammar This grammar is somewhat simplified compared to the model in the text. Jinsong Yu. product. Ken Whistler. and other consumers of mathematical expressions is straightforward. and Eliyezer Kohen. Acknowledgements This work has benefitted from discussions with many people. Geraldine Wade. Earlier related work is listed in Ref. notably PS Technical Word Processor users. Ethan Bernstein. Said Abou-Hallawa. Ross Mills. Asmus Freytag. Sergey Malkin.

Unicode Nearly Plain Text Encoding of Mathematics diacritics atom atoms digits number expBracket ← ← ← ← ← ← ← ← ← ← ← ← diacritic | diacritics diacritic α| diacriticbase diacritics n atom | atoms atom nASCII | digits nASCII digits | digits opDecimal digits opOpener exp opCloser ‘ | | ’ exp ‘ | | ’ ‘ | ’ exp ‘ | ’ α C| w dA I AI o αC SI r SI word | word nASCII | α a | number | other | expBracket | nt Mh opNary operand | ‘ ∞ ’ | | “ ‘ -∞ -’ operand ” scriptbase ‘ ‘ | _ ^ ’ ’ soperand soperand scriptbase ‘ ‘ ^ _ ’ ’ soperand soperand scriptbase ‘ _ ’ soperand scriptbase ‘ ^ ’ soperand expSubsup | expSubscript | expSuperscript atoms | expBracket | number entity | entity ‘ ! | expScript !! ’” || entity “ function factor | operand factor ‘ □’ operand opHbracket operand ‘operand √ ’ ‘ ∙ ’ operand ‘ √ ’ operand “ √(”‘ ‘ operand & ) ’’ operand sqrt | cubert | fourthrt | nthrt | box | hbrack operand | fraction numerator opOver operand exp | row ‘ & ’ exp row | rows ‘ @ ’ row “rows ‘ \a ) r ’ r a y ( ” fraction | operand | array element | exp other element word scriptbase soperand expSubsup expSubscript ← expSuperscript ← expScript ← entity factor operand box hbrack sqrt cubert fourthrt nthrt function numerator fraction row rows array element exp ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← 38 Unicode Technical Note 28 .

their target characters and codes along with spacing and linear-format build-up properties. A full keyword consists of a backslash followed by a keyword in the table.Unicode Nearly Plain Text Encoding of Mathematics Appendix B. Keyword above acute aleph alpha amalg angle aoint approx asmash ast asymp atop Bar bar because begin below beta beth bot bigcap bigcup bigodot bigoplus bigotimes bigsqcup biguplus bigvee bigwedge bowtie Glyph Code ┴ U+2534 ́ U+0301 ℵ U+2135 Α U+03B1 ∐ U+2210 Spacing ordinary ordinary ordinary ordinary ordinary relational ordinary relational ordinary binary relational ordinary ordinary ordinary relational open ordinary ordinary ordinary relational ordinary ordinary ordinary ordinary ordinary ordinary ordinary ordinary ordinary relational LF Property subsup upper accent operand operand nary normal nary normal encl phantom normal normal divide accent accent normal open subsup lower operand operand normal nary nary nary nary nary nary nary nary nary normal ∠ U+2220 ∳ U+2233 ≈ U+2248 ⬆ U+2B06 ∗ U+2217 ≍ U+224D ¦ U+00A6 ̿ U+033F ̅ U+0305 ∵ U+2235 〖 U+3016 ┬ U+252C Β U+03B2 ℶ U+2136 ⊥ U+22A5 ⋂ U+22C2 ⋃ U+22C2 ⨀ U+2A00 ⨁ U+2A01 ⨂ U+2A02 ⨄ U+2A06 ⨃ U+2A04 ⋁ U+22C1 ⋀ U+22C0 ⋈ U+22C8 Unicode Technical Note 28 39 . Character Keywords and Properties The following table gives the default math keywords.

Unicode Nearly Plain Text Encoding of Mathematics bot box bra breve bullet cap cbrt cdot cdots check chi circ close clubsuit coint cong cup daleth dashv Dd dd ddddot dddot ddot ddots degree Delta delta diamond diamondsuit div dot doteq dots Downarrow ⊥ U+22A5 □ U+25A1 U+27E8 ̆ U+0306 ∙ U+2219 ∩ U+2229 ∙ U+221B ⋅ U+22C5 ⋯ U+22EF ̌ U+030C χ U+03C7 ∘ U+2218 ┤ U+2524 ♣U+2663 ∲ U+2232 ≅ U+2245 ∪ U+222A ℸ U+2138 ⊣ U+22A3 ⅅ U+2145 ⅆ U+2146 ⃜ U+20DC relational ordinary open ordinary binary binary open binary ordinary ordinary ordinary binary ordinary ordinary ordinary relational binary ordinary relational differential normal encl box open accent normal normal encl root normal normal accent operand normal close normal nary normal normal operand stretch horz operand operand accent accent accent normal operand operand operand normal normal normal accent normal normal normal differential ordinary U+20DB ordinary ̈ U+0308 ordinary ⋱ U+22F1 relational ° U+00B0 ordinary Δ U+0394 ordinary Δ U+03B4 ordinary ⋄ U+22C4 binary ♢ U+2662 ordinary binary ordinary relational ordinary relational ÷ U+00F7 ̇ U+0307 ≐ U+2250 … U+2026 ⇓ U+21D3 40 Unicode Technical Note 28 .

Unicode Nearly Plain Text Encoding of Mathematics downarrow dsmash ee ell emptyset emsp end ensp epsilon eqarray eqno equiv eta exists forall funcapply Gamma gamma ge geq gets gg gimel grave hairsp hat hbar heartsuit hookleftarrow hookrightarrow hphantom hvec ii iiiint iiint ↓ U+2193 ⬇ U+2B07 ⅇ U+2147 ℓ U+2113 ∅ U+2205 U+2003 〗 U+3017 U+2002 ϵ U+03F5 █ U+2588 # ≡ U+0023 U+2261 Η U+03B7 ∃ U+2203 ∀ U+2200 U+2061 Γ U+0393 Γ U+03B3 ≥ U+2265 ≥ U+2265 ← U+2190 ≫ U+226B ℷ U+2137 ̀ U+0300 U+200A ̂ U+0302 ℏ U+210F ♡ U+2661 ↩ U+21A9 ↪ U+21AA ⬉ U+2B04 ⃑ U+20D1 ⅈ U+2148 2A0C U+2A0C ∭ U+222D relational ordinary ordinary ordinary unary skip close skip ordinary ordinary ordinary relational ordinary unary unary binary ordinary ordinary relational relational ordinary relational ordinary ordinary skip ordinary ordinary ordinary relational relational ordinary ordinary ordinary ordinary ordinary normal encl phantom operand operand operand normal close normal operand encl eqarray marker normal operand normal normal subsupFA operand operand normal normal stretch horiz normal operand accent normal accent operand normal stretch horiz stretch horiz encl phantom accent operand nary nary Unicode Technical Note 28 41 .

Unicode Nearly Plain Text Encoding of Mathematics iint Im imath in inc infty int iota jj jmath kappa ket Lambda lambda langle lbrace lbrack lceil ldiv ldots le Leftarrow leftarrow leftharpoondown leftharpoonup Leftrightarrow leftrightarrow leq lfloor ll Longleftarrow longleftarrow Longleftrightarrow longleftrightarrow Longrightarrow ∬ U+222C ℑ U+2111 ı U+0131 ∈ U+2208 ∆ U+2206 ∞ U+221E ∫ U+222B ι U+03B9 ⅉ U+2149 0237 U+0237 κ U+03BA U+27E9 Λ U+039B λ U+03BB U+27E8 { U+007B [ U+005B ⌈ U+2308 ∕ U+2215 … U+2026 ≤ U+2264 ⇐ U+21D0 ← U+2190 ↽ U+21BD ↼ U+21BC ⇔ U+21D4 ↔ U+2194 ≤ U+2264 ⌊ U+230A ≪ U+226A ⟸ U+27F8 ⟵ U+27F5 ⟺ U+27FA ⟷ U+27F7 ⟹ U+27F9 ordinary ordinary ordinary relational unary ordinary ordinary ordinary ordinary ordinary ordinary close ordinary ordinary open open open open binary ordinary relational relational relational relational relational relational relational relational open relational relational relational relational relational relational nary operand operand normal operand operand nary operand operand operand operand close operand operand open open open open divide normal normal stretch horiz stretch horiz stretch horiz stretch horiz stretch horiz stretch horiz normal open normal normal normal normal normal normal 42 Unicode Technical Note 28 .

Unicode Nearly Plain Text Encoding of Mathematics longrightarrow mapsto matrix medsp mid models mp mu nabla naryand nbsp ndiv ne nearrow neq ni norm nu nwarrow odot of oiiint oiint oint Omega omega ominus open oplus oslash otimes over overbar overbrace overparen ⟶ U+27F6 ↦ U+21A6 ■ U+25A0 U+205F ∣ U+2223 ⊨ U+22A8 ∓ U+2213 Μ U+03BC ∇ U+2207 ▒ U+2592 U+00A0 ⊘ U+2298 ≠ U+2260 ↗ U+2197 ≠ U+2260 ∋ U+220B ‖ U+2016 ν U+03BD ↖ U+2196 ⊙ U+2299 ▒ U+2592 relational relational ordinary Ordinary relational relational unary/binary ordinary unary ordinary skip binary relational relational relational relational ordinary ordinary relational binary normal stretch horiz encl matrix normal list delims stretch horz unary/binary operand operand normal normal divide normal normal normal normal open/close operand normal normal normal nary nary nary operand operand normal open normal normal normal divide encl overbar stretch over stretch over ordinary ∰ U+2230 ordinary ∯ U+222F ordinary ∮ U+222E ordinary Ω U+03A9 ordinary ω U+03C9 ordinary ⊖ U+2296 binary ├ U+251C ordinary ⊕ U+2295 binary ⊘ U+2298 binary ⊗ U+2297 binary / U+002F ¯ U+00AF ⏞ U+23DE ⏜ U+23DC binarynsp ordinary ordinary ordinary Unicode Technical Note 28 43 .

Unicode Nearly Plain Text Encoding of Mathematics parallel partial phantom Phi phi Pi pi pm pppprime ppprime pprime prec preceq prime prod propto Psi psi qdrt rangle ratio rbrace rbrack rceil rddots Re rect rfloor rho Rightarrow rightarrow rightharpoondown rightharpoonup sdiv searrow ∥ U+2225 ∂ U+2202 ⟡ U+27E1 Φ U+03A6 ϕ U+03D5 Π U+03A0 π U+03C0 ± U+00B1 ⁗ U+2057 ‴ U+2034 ″ U+2033 ≺ U+227A ≼ U+227C ′ U+2032 ∏ U+220F ∝ U+221D Ψ U+03A8 Ψ U+03C8 √ U+221C U+27E9 ∶ U+2236 relational unary ordinary ordinary ordinary ordinary ordinary unary/binary ordinary ordinary ordinary relational relational ordinary ordinary relational ordinary ordinary open close normal operand encl phantom operand operand operand operand unary/binary Unisubsup Unisubsup Unisubsup normal normal Unisubsup nary normal operand operand encl root close normal close close close normal operand encl rect close operand stretch horiz stretch horiz stretch horiz stretch horiz divide normal relational } U+007D close ] U+005D close ⌉ U+2309 close ⋰ U+22F0 relational ℜ U+211C ordinary ▭ U+25AD ordinary ⌋ U+230B close ρ U+03C1 ordinary ⇒ U+21D2 relational → U+2192 relational ⇁ U+21C1 relational ⇀ U+21C0 relational ⁄ U+2044 ↙ U+2198 binarynsp relational 44 Unicode Technical Note 28 .

Unicode Nearly Plain Text Encoding of Mathematics setminus Sigma sigma sim simeq smash spadesuit sqcap sqcup sqrt sqsubseteq sqsuperseteq star subset subseteq succ succeq sum superset superseteq swarrow tau therefore Theta theta thicksp thinsp tilde times to top tvec underbar underbrace underparen ∖ U+2216 Σ U+03A3 Σ U+03C3 ∼ U+223C ≃ U+2243 ⬈ U+2B0D ♠U+2660 ⊓ U+2293 ⊔ U+2294 √ U+221A ⊑ U+2291 ⊒ U+2292 ⋆ U+22C6 ⊂ U+2282 ⊆ U+2286 ≻ U+227B ≽ U+227D ∑ U+2211 ⊃ U+2283 ⊇ U+2287 ↘ U+2199 Τ U+03C4 ∴ U+2234 Θ U+0398 θ U+03B8 U+2005 U+2006 ̃ U+0303 × U+00D7 → U+2192 ⊤ U+22A4 ⃡ U+20E1 ▁ U+2581 ⏟ U+23DF ⏝ U+23DD binary ordinary ordinary relational relational ordinary ordinary binary binary open relational relational binary relational relational relational relational ordinary relational relational relational ordinary relational ordinary ordinary skip skip ordinary binarynsp relational relational ordinary ordinary ordinary ordinary normal operand operand normal normal encl phantom normal normal normal encl root normal normal normal normal normal normal normal nary normal normal normal operand normal operand operand normal normal accent normal stretch horiz normal accent encl underbar stretch under stretch under Unicode Technical Note 28 45 .

Unicode Nearly Plain Text Encoding of Mathematics Uparrow uparrow Updownarrow updownarrow uplus Upsilon upsilon varepsilon varphi varpi varrho varsigma vartheta vbar vdash vdots vec vee Vert vert vphantom vthicksp wedge wp wr Xi xi zeta zwnj zwsp ⇑ U+21D1 ↑ U+2191 ⇕ U+21D5 ↕U+2195 ⊎ U+228E Υ U+03A5 υ U+03C5 ε U+03B5 Φ U+03C6 ϖ U+03D6 ϱ U+03F1 σ U+03C2 ϑ U+03D1 │ U+2502 ⊢ U+22A2 ⋮ U+22EE U+20D7 ∨ U+2228 ‖ U+2016 | U+007C ⇳ U+21F3 U+2004 ∧ U+2227 ℘ U+2118 ≀ U+2240 Ξ U+039E ξ U+03BE ζ U+03B6 U+200C U+200B relational relational relational relational binary ordinary ordinary ordinary ordinary ordinary ordinary ordinary ordinary ordinary relational relational ordinary binary ordinary ordinary relational skip binary ordinary binary ordinary ordinary ordinary ordinary ordinary normal normal normal normal normal operand operand operand operand operand operand operand operand list delims stretch horz normal accent normal open/close open/close encl phantom normal normal operand normal operand operand operand normal normal 46 Unicode Technical Note 28 .

7.unicode.0. Donald E. San Jose. Murray Sargent III and Angel L. Editing and Display of Mathematics using Unicode. Murray Sargent III. Barbara Beeton. Diaz. Unicode Support for Mathematics. 26th Internationalization and Unicode Conference. Sept (2000). in his Introduction to Tractatus Logico-Philosophicus by Lugwig Wittgenstein. (Reading.w3. Routledge and Kegan Paul.Unicode Nearly Plain Text Encoding of Mathematics References 1. 8. Inc. San Jose. The TeXbook. Unicode. and Mathematics. Unicode Nearly Plain-Text Encoding of Mathematics. Sept (1995). Elements of Quantum Optics. Unicode Technical Note 28 47 .kfs. Sept (2002).org/versions/Unicode4. Unicode Technical Report #o r e 2d t m 5e f a “ u Mi U p as n p t” i o h. 29th Internationalization and Unicode Conference. California. Asmus Freytag. March (2000). Holland. Unicode Plain Text Encoding of Mathematics. Massachusetts: Addison-Wesley 1984) 5. San Jose.org/reports/tr25 3. California. 17th International Unicode Conference. Scroll Systems.0.unicode. London 1922 (also currently available at http://www. c Sr t o c http://www. MathML and Unicode. Addison-Wesley. San Jose. Murray Sargent III. Some of these ideas were discussed in the following presentations: M. PS Technical Word Processor. the linear format is used for keyboard entry of mathematical expressions in Microsoft Word 2007 and the Microsoft Math Calculator. Bertrand Russell. LaTeX: A Document Preparation System. Sargent III (1991). Meystre and M. Rich Text. 15th International Unicode Conference. March (2006). California. Murray Sargent III. 9. MA. 16th International Unicode Conference. Leslie Lamport. 2nd edition (Addison-Wesley. (Reading. San Jose. This WP used a nonUnicode version of the plain-text math notation. 7th International Unicode Conference. Sept (2004). Knuth. (1989).org/~jonathan/witt/tlph. Version 4. User’ s Guide & Reference Manual. San Francisco.0/ 2. P. ISBN 0321-18578-1) or online as http://www. Sargent III. The Unicode Standard.html). California. California. Murray Sargent III. Sept (1999). Unicode Support for Mathematics. 1994.0 (Second Edition) http://www.org/TR/2003/REC-MathML2-20031021/ . 6. For example. Murray Sargent III. 22nd International Unicode Conference. This document was prepared using Microsoft Word 2007 with Cambria and Cambria Math fonts. 2003. Amsterdam. California. Murray Sargent III. Mathematical Markup Language (MathML) Version 2. SpringerVerlag 10. ISBN 1-201-52983-1) 4.

Sign up to vote on this title

UsefulNot useful- Grade 5 Sunshine Math
- Math Operation
- IEEExplore Compatible Template
- Template for Paper BJRBE
- Template Ubl Icel2013
- Wild Card
- IICBEE Template Conference or Journal
- Paredit Outline
- 2014_04_msw_a4_format.doc
- IEEE Format
- Template
- Paper Format
- AAST Template
- EBCDIC
- ASEENE14 Student Paper Template1-TS
- Brackets
- Sets
- Math Errors
- 11.2 11.3 Eng101SP15 Argument Readings Discussion OtherPunctuation
- An Introduction to Programming in Emacs Lisp, Ed 2.05 (R. J. Chassell) [FREE SOF. FOUND., Jan 5 2001]
- emacs-lisp-intro.
- punctuation-marks.pdf
- ebae2002
- Research Paper Template
- WHO Style-guide Medical Translation
- Eng 101sC Ellipsis Brackets Slash Hyphens
- Windows Speech Recognition Commands
- ScientificWordLectures.pdf
- null
- A Quick and Dirty Guide to Latex
- UTN28-PlainTextMath-v2

Are you sure?

This action might not be possible to undo. Are you sure you want to continue?

We've moved you to where you read on your other device.

Get the full title to continue

Get the full title to continue reading from where you left off, or restart the preview.

scribd