You are on page 1of 30

An introduction to X

E
T
E
X
Jonathan Kew
SIL International
June 15, 2005
A
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
An introduction to X
E
T
E
X
What is X
E
T
E
X?
T
E
X typeseing engine
including e-T
E
X extensions
Supporting the Unicode charaer set
inherently multilingual/multiscript typeseuing system
greatly simplines language support at macro level
Using modern font technologies
TrueType, OpenType (all fonts supported by platform)
With smart rendering support
Apple Advanced Typography
OpenType Layout features
for typographic features and complex scripts
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Multilingual typeseing with T
E
X
Text input
escape sequences for non-ASCII characers
multiple 8-bit codepages
preprocessors for complex scripts
Font support
fonts limited to 256 glyphs
custom-encoded fonts with qecinc glyph sets
All tied together via complex T
E
X macros
dimcult to understand and extend
dimcult to integrate with other packages
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Traditional T
E
X input conventions
Input text is ASCII (or 8-bit codepage)
Source text Typeset output Notes
\'{a} typical accent command
\c{c}
\aa
--- ligature in typical T
E
X fonts
$\alpha$ math mode symbol
{\dnacchaa} " using custompreprocessor
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Towards a cleaner solution
Unicode: all required charaers directly represented
no need for escape sequences to access characers not
included in the current codepage
no need to switch between codepages according to the
language/script being typeset
characers rendered via standard access codes
Charaer/glyph model and modern font rendering
technologies
complex script handling moved out of the domain of the
text data stream
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Typeseing Unicode text with X
E
T
E
X
Accented charaers
\halign{#\hfil\quad&
#\hfil\cr
dan&dan\cr
dubok&dubok\cr
dabe&ak\cr
din&dabe\cr
Din&din\cr
ak&Din\cr
Evropa&Evropa\cr}
dan dan
dubok dubok
dabe ak
din dabe
Din din
ak Din
Evropa Evropa
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Typeseing Unicode text with X
E
T
E
X
CJK ideographs
\font\han="STSong"at16pt
\font\rom="Gentium"at8pt
\def\hc#1#2{\vtop{\hbox{\han#1}
\hbox{\kern10pt\rom#2}}}
\vtop{\hc{}{ka-ku}
\hc{}{motto-mo}
\hc{}{sai-go}
\hc{}{hatara-ku}
\hc{}{umi}}

ka-ku

motto-mo

sai-go

hatara-ku

umi
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Typeseing Unicode text with X
E
T
E
X
Complex scripts
\c1
\s
\p
\v1
.
\v2
.

\v3
..

!" $%& *+, ./ $245

7+848!: *+, ;<

.>+? .+@
AB.C 4D$E, >F GH%& !IC .!JB 4K
!F ./ $E, !F

MN$@ >B O+? $&
>C RST ./ *BU8

!? !J@ 4+W
.!J+@ !Z !H5 >& .!JZ !H5 A8
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Charaer codes
Basic charaer codes are 16-bit
representing Unicode in the UTF-16 encoding form
(except when using legacy custom-encoded fonts)
Extended T
E
X primitives
\char, \chardef accept numbers up to 65,536
4-digit hex notation using ^^^^abcd
\char"5609^^^^6167 =
What about Unicode charaers beyond Plane 0?
handled using surrogates (standard UTF-16)
adequate for rendering
does not allowfull per-characer programmability
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Extended T
E
X code tables
Per-charaer code tables \catcode, \lccode, \uccode,
\sfcode enlarged
plain X
E
T
E
X format initializes these tables based on
Unicode characer set
\lowercase{DIN}
din
\uppercase{Esieyamaklmaenuvwoavla}
ESI EYAMA KL MAE NUVWO A V LA
\catcode`\=\active\def{...}
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Input encodings
By default, input read as Unicode (UTF-8 or
UTF-16)
encoding formautomatically detected
Non-Unicode input text
legacy codepages supported via ICUconverters
set codepage of current input nle:
\XeTeXinputencoding"charset-name"
set initial codepage for newly-opened input nles:
\XeTeXdefaultencoding"charset-name"
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Unicode support
Hyphenation paerns
Extended for 16-bit charaers
Standard hyphenation les are encoding-ecic
modined to load correctly under X
E
T
E
X
Simple hyphenation for scripts such as Devanagari
text is simple characer data, no macros, acive chars, etc.
%breakbeforeorafteranyindependentvowel
11
11
11
%breakafteranydependentvowel,butneverbefore
21
21
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Font support in X
E
T
E
X
Host platform fonts
Use any font installed on the host computer
\font command extended to accept real font names
\font\rm="TrebuchetMS"at16pt\rmHelloWorld!
Hello World!
\font\it="TimesItalic"at16pt\itHelloWorld!
Hello World!
\font\ch="AppleChancery"at16pt\chHelloWorld!
Helo World!
No TFMles required!
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Font support in X
E
T
E
X
Output device support
Output driver inherently has access to the same
fonts as the typeseing engine
Generate PDF as default output
there is actually an extended DVI (.xdv) intermediate
Fonts automatically embedded and subseed
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Font support in X
E
T
E
X
Support for traditional T
E
X fonts
TFMles still supported
required for math fonts
implies non-Unicode data, using characer codes 0255
only
PDF back-end supports Type 1 fonts
uses .pfb nles in the texmf tree, just like dvips
no support for bitmap fonts
currently no .vf support
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Font support in X
E
T
E
X
Font mappings
Traditional T
E
X keyboarding praices
typical input:
``\TeX''---atypesettingsystem
generates: ``T
E
X''---a typeseuing system
Font mapping for compatibility
;TECkitmappingforTeXinputconventions
U+002DU+002D<>U+2013;--->endash
U+002DU+002DU+002D<>U+2014;---->emdash
U+0027<>U+2019;'->rightsinglequote
U+0027U+0027<>U+201D;''->rightdoublequote
U+0022>U+201D;"->rightdoublequote
generates: T
E
Xa typeseuing system
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Font support in X
E
T
E
X
More fun with font mappings
\def\SampleText{Unicode-

,\\
,\\
,\\
.}
\font\gen="Gentium"
\gen\SampleText
\bigskip
\font\gentrans="Gentium:
mapping=cyr-lat-iso9"
\gentrans\SampleText
Unicode -
,
,
,
.
Unicode - to unikal'nyj
kod dl lbogo simvola,
nezavisimo ot platformy,
nezavisimo ot programmy,
nezavisimo ot zyka.
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Typographic features
AAT font features
Custom AAT features accessed via \font command
\font\x="AppleChancery"at16pt\xThequickbrown
foxjumpsoverthelazydog.
Te quick brown fox jumps over te lazy dog.
\font\x="AppleChancery:LetterCase=SmallCaps;Design
Complexity=SimpleDesignLevel"at16pt\xThequick
THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG.
\font\x="AppleChancery:DesignComplexity=Flourishes
SetA"at16pt\xThequickbrownfoxjumpsover
Te Quick brown fox jumps over te lazy dog.
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Typographic features
OpenType: language and script
Fonts may support multiple languages with diering
behavior
\font\Doulos="DoulosSIL/ICU"
\font\DoulosViet="DoulosSIL/ICU:language=VIT"
Unicode cung cp
mt con s duy
nht cho mi k t
Unicode cung c'p
mt con s( duy
nh't cho m)i k t
\font\Brioso="BriosoPro"
\font\BriosoTrk="BriosoPro:language=TRK"
gelen rmalar
tarafndan
gelen firmalar
tarafndan
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Typographic features
OpenType: language and script
Complex Asian scripts require ecic shaping
engines
\font\x="Code2000:script=arab"\x

\font\x="Code2000:script=deva"\x

2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Typographic features
OpenType: optional features
Font ecication may include feature tags
\font\x="BriosoPro"\xHelloWorld!0123456789
Hello World! 0123456789
\font\x="BriosoPro:+smcp"
Hiiic Wcvii: e..,,_.-s|
\font\x="BriosoPro:+sups"
H'' W'! '''`
\font\x="BriosoProItalic:+onum"
He!o World! ...,,,.;|
\font\x="BriosoProItalic:+swsh,+zero"
+e.o or/`! .123456789
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Typographic features
OpenType: optical sizing
OpenType optical families automatically choose
correct face for the size used
Brioso Pro at 7, 10, 18, 24pt sizes:
seven ten eighteen twenty-four
Can override with /S= modier on font name
BriosoPro/S=7 Brioso Pro Caption
BriosoPro/S=10 Brioso Pro Text
BriosoPro/S=18 Brioso Pro Subhead
BriosoPro/S=24 Brioso Pro Display
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Asian-language linebreaking
Line-break positions
Line breaking without word spaces
T
E
X normally breaks lines at glue arising fromspaces
Chinese, Japanese, 1ai, etc. do not use word spaces
, , 5 , . 5
, F . F , Unicode , ,
, encoding F , F .
Use ICU line-break: \XeTeXlinebreaklocale"th"
, , 5 , . 5
, F .
F, Unicode , , , encoding F ,
F .
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Asian-language linebreaking
Justication
Text without spaces is dicult to justify, as well as to
line-break
Ragged-right seing is one solution

Alternatively, use \XeTeXlinebreakskip to introduce


glue at each potential break

2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Built-in graphics support
QuickTime image support
Many graphic le formats directly supported
TIFF, JPEG, PNG, PCX, BMP, PICT, GIF, TGA,
Photoshop,
\setbox0=\hbox{\XeTeXpicfile"mypic.jpg"}
Optional keywords to modify image
scaled, xscaled, yscaled, width, height, rotated
Image width and height available to T
E
X engine
Can use via L
A
T
E
X and ConT
E
Xt commands
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
Built-in graphics support
PDF documents
Beware: QuickTime graphic importer accepts PDF
but renders as raster image at screen resolution!
Use alternative command for true PDF inclusion
\XeTeXpdffile"xetex-intro-slides.pdf"page1
scaled400
An introduction to X
E
T
E
X
Jonathan Kew
SIL International
June 15, 2005
A
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
L
A
T
E
X and ConT
E
Xt support
fontec.sty by Will Robertson
Simple ecication of native OS X fonts in L
A
T
E
X
Integrates X
E
T
E
X font access with L
A
T
E
X commands
seuing the default document fonts
\usepackage{fontspec}
\setromanfont{AdobeGaramondPro}
\setmonofont[Scale=0.8]{AndaleMono}
on-the-ny font and feature changes
27{\addfontfeature{VerticalPosition=Superior}th}
27
Welcometo{\addfontfeature{LetterCase=SmallCaps}
Practical\TeX}
Welcome to Puc::cI T
E
X
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
L
A
T
E
X and ConT
E
Xt support
xunicode.sty by Ross Moore
Support for standard L
A
T
E
X input of many ecial
charaers when using Unicode fonts
accent commands, named characers, etc., mapped to
Unicode values for font access
does not handle dashes, quotes (use tex-text font
mapping)
Allows many non-Unicode L
A
T
E
X documents to be
processed using Unicode fonts
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
L
A
T
E
X and ConT
E
Xt support
Using ConT
E
Xt with X
E
T
E
X
Reportedly works fairly readily, but not
pre-congured out of the box
see http://www.contextgarden.net/XeTeX
Use X
E
T
E
X font names and features in ConT
E
Xt
typescripts and other font denitions
see http://www.contextgarden.net/Fonts_in_XeTeX
\definedfont["HoeflerText:
mapping=tex-text;
StyleOptions=EngravedText;
LetterCase=AllCapitals;
color=229966"at24pt]
BigTitle
BIG TITL&
2 Praical T
E
X Conference Chapel Hill, NC, June 2005
For more information
Questionsand answers?
Contact information
mailto:jonathan_kew@sil.org
X
E
T
E
X web site and mailing list
http://scripts.sil.org/xetex
http://tug.org/mailman/listinfo/xetex
A
? "" Unicode
(/)? to je Unicode? ?
Unicode; ? ? Hva er Unicode?
?
Unicode? Unicode ? ?
svn://scripts.sil.org/xetex/TRUNK
2 Praical T
E
X Conference Chapel Hill, NC, June 2005

You might also like