
9/27/13

Why PDF reading order is irrelevant to accessibility | Talking PDF

The Place for PDF Information

Each PDF Page Is a Painting


Posted on November 8, 2010 by Duff Johnson in Talking PDF, User Accessibility

Why PDF reading order is irrelevant to accessibility

Introduction


This article attempts to explain the concept of reading order in PDF files. Why is this necessary?

1. End users are often frustrated by inconsistent and sometimes illegible results when reading PDF files on mobile devices, searching for PDF content online, or using assistive technology (AT).
2. Content authors and managers tasked with ensuring accessibility or Section 508 compliance in PDF documents often focus on objects rather than tags, thus missing the mark.
3. Software developers are (understandably) confused by reading order as presented in the current PDF Reference (ISO 32000), the technical description of PDF.

Many have come to use the term "reading order" as functionally synonymous with the logical order imposed by tags, but this interpretation is incorrect.

I've tried to make this article comprehensible and useful to all; you'll be the judge of my success. A technical annex is included at the end for those who want to see what reading order really means in PDF. Feel free to check my credentials at the end of the article.

The PDF Paintbrush


When you create a PDF, you're painting a picture. Your paintbrush is the combination of the software used to create the source document and the software you've chosen to convert that source document into the universal electronic document format we all know as PDF.

Like the painter's brush strokes, each character, each line and each image is fundamentally independent, but they can interact with each other to produce particular visual effects. On the PDF page, objects are connected by a coordinate system and not a lot else. There's no logical, semantic connection between the letters comprising a word; characters simply happen at a series of locations on the rendered page.

As originally designed, PDF is fundamentally a system for painting objects onto a page, plus a whole lot of other features we aren't talking about right now! There's no innate concept of words, sentences, paragraphs, columns, headings, images, tables, lists, footnotes, or any of the semantic structures that distinguish a document from a meaningless heap of letters, shapes and colors. PDF is fundamentally about how the document appears on the page, not how it looks when abstracted from the page.

When a PDF includes instructions to paint more than one object in the same spot (it happens all the time), the items stack on top of each other, with the last item painted appearing on the top of the stack. Unlike watercolors, each brush stroke only appears to blend with the others if one or more of them is semi-transparent.

Another example: a PDF creator may choose to paint all the Times-Roman text on the page first, then come back and paint the text that appears in other fonts. Since it's a painting, the order doesn't really matter, any more than it matters whether Monet painted his water lilies from left to right, from right to left, or from the inside out, for that matter.

If we think that these objects have meaning, that's because we impose semantics on the objects as we read. If you encounter a word that starts at the end of one column and ends at the top of the next, your mind stitches the two together without conscious thought. Likewise, if you see a line of 16 point text followed by a paragraph of 12 point text, you naturally assume the 16 point text was a heading.

OK, it's all very well to paint a picture, but what if we want to copy and paste the text, or reflow it for display on a mobile phone? What if the consumer is actually a search engine trying to index the document? What if the user is blind or otherwise disabled, and requires special Assistive Technology (AT) devices to read and to operate the computer?
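The painting model described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "page" is a small character grid, every object is opaque, and the `paint` function and its `(x, y, text)` tuples are invented for this sketch rather than taken from any PDF library.

```python
def paint(objects, width=12, height=3):
    """Paint (x, y, text) objects onto a character grid.

    Objects are painted in list order, so a later object covers
    whatever an earlier one left in the same cell.
    """
    grid = [[" "] * width for _ in range(height)]
    for x, y, text in objects:
        for offset, ch in enumerate(text):
            grid[y][x + offset] = ch
    return ["".join(row) for row in grid]


words = [(0, 0, "The"), (4, 0, "quick"), (0, 1, "brown"), (6, 1, "fox")]

# Objects that do not overlap: any paint order gives the same picture.
assert paint(words) == paint(list(reversed(words)))

# Objects that do overlap: the last one painted ends up on top.
print(paint([(0, 0, "AAAA"), (2, 0, "BB")])[0].rstrip())  # AABB
print(paint([(2, 0, "BB"), (0, 0, "AAAA")])[0].rstrip())  # AAAA
```

Nothing in the grid records which characters belong to the same word, which is the point: the picture is complete, but the semantics are not there.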

Universal Accessibility
What does it mean to say that an electronic document is "accessible"? If a document's contents are structured and organized such that the meaning of the document is available to every consumer, then we can say that the document is accessible. It's not about file format. Word, HTML, PDF, Excel, Flash: they all have capabilities and limitations as file formats for electronic documents. In most cases, each format can be made accessible, but it never happens by accident. Accessibility requires intention, and the difficulty of achieving real accessibility tends to vary as a function of the complexity of the content.

HTML is Different, not Better


In conventional HTML, reading order and logical order are inherently aligned. HTML tags carry all the semantic information (<P>, <H1>, <H2>, etc.). If the goal is accessibility, what more could you want, right? HTML (especially with CSS) has its own accessibility challenges, but at the end of the day, HTML is just text. PDF, at least technically, is not nearly so easy. On the other hand, tagged PDF is an accessible vehicle for just about any document, regardless of source. If you can print it, you can make a PDF. Pretty much any PDF may be tagged to become an accessible PDF. That's hard to beat.

In the PDF format, accessibility is assured by adding tags: markers that identify the correct order of objects and the semantics of the document. Tags strongly resemble the HTML tags on which they were modeled. What's the "correct" order? There may be more than one; after all, there's no "correct" way to read a newspaper. The idea of correct order is simply that whichever order the author selects for their PDF, it must make sense. It's not OK, for example, for the tags to mix two separate articles together simply because the columns of text are adjacent, but it's perfectly legitimate to do so in the reading order (as the example in the technical annex makes clear).
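To make the idea of tags a little more concrete, here is a minimal sketch of a tag tree for a page holding two separate articles. The `Element` class, the `logical_order` function and the headlines are all invented for illustration; only the tag names (Document, Art, H1, P) are standard structure types. A real tagged PDF stores this as a structure tree pointing at marked content on the page, which this sketch does not attempt to model.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A simplified, hypothetical stand-in for a PDF structure element (a tag)."""
    tag: str
    text: str = ""
    kids: list = field(default_factory=list)


def logical_order(element):
    """Walk the tag tree depth-first, yielding (tag, text) for elements with text."""
    if element.text:
        yield element.tag, element.text
    for kid in element.kids:
        yield from logical_order(kid)


# Two articles that might sit in adjacent columns on the painted page.
page = Element("Document", kids=[
    Element("Art", kids=[
        Element("H1", "Council approves budget"),
        Element("P", "The vote was unanimous."),
    ]),
    Element("Art", kids=[
        Element("H1", "Rain expected Friday"),
        Element("P", "Bring an umbrella."),
    ]),
])

for tag, text in logical_order(page):
    print(f"<{tag}> {text}")
```

Swapping the two Art elements would give a different but equally sensible order; interleaving their paragraphs would not. That is the sense in which more than one order can be "correct".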

Conclusion
PDF tags, and PDF tags alone, define the logical order of the document's content, and thus, its accessibility. To the extent a PDF is tagged, it might be accessible. To determine whether it is, in fact, accessible, the tags need to be checked and, if necessary, corrected to ensure correct logical order and usage.


Users seeking to ensure their PDFs are accessible should focus on the tags. The reading order of the content on the PDF page just isn't a factor in accessibility, as we demonstrate below.

Technical Annex: What Reading Order in PDF really means


The term "reading order" might lead one to think that it is relevant to accessibility, but it's not, notwithstanding the confusing representation of the issue in ISO 32000-1:2008, Section 14.8.2.3. In PDF, reading order refers simply to the order in which the computer reads the file. It has nothing whatsoever to do with logical order, the sequence people use, which is defined in PDF by tags. Section 14.8.2.3 will be modified in a new part of ISO 32000 to clear up this confusion over the significance of reading order when re-using PDF page content for accessibility or other purposes. You can buy an official copy of ISO 32000-1:2008 directly from ISO, or download an authorized copy for free from Adobe Systems.

Demonstration
PDF is capable of extraordinary complexity, sophistication and accuracy in rendering content. From typography to transparencies, from alpha channel to z-order, the range of possibilities in generating the file's reading order is effectively infinite, even for the same content! The following image represents an example of content as rendered on a PDF page. Simple though it is, this example nonetheless demonstrates how reading order and logical order are utterly distinct in a PDF file.

What follows is one possible example of actual PDF code for the above text. This code has been dramatically simplified to make things as clear as possible. The strings in parentheses are the rendered text (see the image above), in the order they occur in the PDF's reading order.

    q 1 0 0 -1 0 432 cm
    0 g 0 G
    BT
    14 0 0 -14 72 84 Tm /F1.0 1 Tf (The quick) Tj
    14 0 0 -14 147.6 84 Tm (the lazy) Tj
    14 0 0 -14 72 100 Tm (brown fox) Tj
    14 0 0 -14 147.6 100 Tm (dog.) Tj
    14 0 0 -14 72 116 Tm (jumps over) Tj
    ET
    Q

Of course, the reading order in this case is semantically incorrect, because the PDF creation software painted each line of text across the page, crossing the columns as it did so. Nonetheless, this example is 100% legitimate PDF, as per ISO 32000-1:2008.

If the example code given above included container information (not included to make the example more readable to non-developers) and tags, it would conform to the forthcoming ISO 14289-1 (PDF/Universal Accessibility), even though the reading order makes no sense.

If a PDF viewer cannot consume tags, you'll get your text in the above order. That's NTDE (Not The Desired Effect), as we like to say.

If the PDF is correctly tagged and the viewing software supports tags for content extraction and reuse, the text would appear in correct logical order and with appropriate semantics (in this case, a simple paragraph) as follows:

    <p>The quick brown fox jumps over the lazy dog.</p>

And that's why we can safely and responsibly ignore reading order when considering accessibility in PDF. If you are unhappy with your results extracting content for reuse, using assistive technology, or otherwise consuming PDFs, be sure your software supports tagged PDF.
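For developers, the difference between the two orders can be reproduced with a short Python sketch. It is deliberately naive: the regular expression only understands the simplified stream shown in this annex, and the `P_KIDS` list is a hypothetical stand-in for the tags (a real tagged PDF links a structure element to marked content on the page rather than to list positions).

```python
import re

# The simplified content stream from the example in this annex.
STREAM = """
q 1 0 0 -1 0 432 cm
0 g 0 G
BT
14 0 0 -14 72 84 Tm /F1.0 1 Tf (The quick) Tj
14 0 0 -14 147.6 84 Tm (the lazy) Tj
14 0 0 -14 72 100 Tm (brown fox) Tj
14 0 0 -14 147.6 100 Tm (dog.) Tj
14 0 0 -14 72 116 Tm (jumps over) Tj
ET
Q
"""

# Capture the x and y of each text matrix (Tm) and the string shown by Tj.
SHOW_TEXT = re.compile(
    r"(-?[\d.]+)\s+(-?[\d.]+)\s+Tm\s+(?:/\S+\s+[\d.]+\s+Tf\s+)?\((.*?)\)\s*Tj"
)


def runs_in_stream_order(stream):
    """Return (x, y, text) for each text run, in the order the file paints them."""
    return [(float(x), float(y), text) for x, y, text in SHOW_TEXT.findall(stream)]


runs = runs_in_stream_order(STREAM)

# Reading order: simply the order in which the runs occur in the stream.
reading_order = " ".join(text for _, _, text in runs)

# Logical order: imposed from outside the stream. P_KIDS is a made-up
# stand-in for a single <P> tag listing its content in the intended order.
P_KIDS = [0, 2, 4, 1, 3]
logical_order = " ".join(runs[i][2] for i in P_KIDS)

print(reading_order)  # The quick the lazy brown fox dog. jumps over
print(logical_order)  # The quick brown fox jumps over the lazy dog.
```

The stream itself never changes between the two print statements; only the information layered on top of it does.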

Key Takeaways
1. A PDF's accessibility is determined by its tags, not by its reading order.
2. If the PDF has no tags, or the tags are incorrect, that PDF is not accessible or reliably reusable.
3. If the creation, viewing or extraction software cannot create or use PDF tags (as appropriate), that software doesn't support accessible PDF.
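As a rough first triage based on these takeaways, a script can at least ask whether a PDF declares tags at all. The sketch below assumes the document catalog has already been read into a plain Python dict by whatever PDF library you use; `tagging_status` and that dict representation are assumptions made for illustration. It looks for the two catalog entries a Tagged PDF carries under ISO 32000-1 (a MarkInfo dictionary with Marked set to true, and a StructTreeRoot) and says nothing about whether the tags are correct.

```python
def tagging_status(catalog):
    """Classify a document catalog (a plain dict) by whether tags are declared.

    This only checks for the presence of tags. Whether the tags are
    correct still has to be verified separately.
    """
    mark_info = catalog.get("/MarkInfo") or {}
    declares_tags = mark_info.get("/Marked") is True
    has_structure = "/StructTreeRoot" in catalog
    if declares_tags and has_structure:
        return "tagged: tags still need to be checked"
    return "untagged: not accessible"


print(tagging_status({"/Type": "/Catalog"}))
print(tagging_status({"/MarkInfo": {"/Marked": True}, "/StructTreeRoot": {}}))
```

A "tagged" result is only the starting point: the tags themselves still have to be checked for correct logical order and usage.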

Credentials
There's considerable misinformation regarding PDF accessibility. The resulting confusion is evident on a number of government websites and even in Adobe's Acrobat Professional software. As elsewhere in the accessibility world, opinions are often strongly held and fiercely defended. For this reason, it seems like a good idea to establish my credentials for this discussion.

ISO 14289-1 isn't published yet, but I can tell you now that it doesn't even mention PDF reading order. I'm not offering an opinion here; these are simply the facts.

I've been in the business of making PDF files accessible since Acrobat 5 was released in 2000, longer than anyone else except the developers at Adobe Systems who created the technology in the first place. Since 2005, I've chaired AIIM's PDF/UA (Universal Accessibility) committee through scores of meetings as we drafted the forthcoming International Standard for accessible PDF, which is planned for publication in 2011 as ISO 14289-1. I'm also a long-time educator on PDF accessibility in articles, blog posts and seminars around the world.

(Learn more about Appligent Document Solutions' efforts on behalf of ISO standards for PDF.)

By Duff Johnson

Assistive Technology, Logical Order, Reading Order, Universal Accessibility

talkingpdf.org/each-pdf-page-is-a-painting/
