TREE AUTOMATA AND TREE GRAMMARS

by Joost Engelfriet

DAIMI FN-10, April 1975
Institute of Mathematics, University of Aarhus
Department of Computer Science
Ny Munkegade, 8000 Aarhus C, Denmark
Phone 06-1283 55

Warning: these lecture notes were written in the fall of 1974 and have not been updated since!

To appreciate the theory of tree automata and tree grammars one should already be motivated by the goals and results of formal language theory. In particular one should be interested in "derivation trees". A derivation tree models the grammatical structure of a sentence in a (context-free) language. By considering only the bottom of the tree, the sentence can be recovered from the tree.

The first idea in tree language theory is to generalize the notion of a finite automaton working on strings to that of a finite automaton operating on trees. It turns out that a large part of the theory of regular languages can rather easily be generalized to a theory of regular tree languages. Moreover, since a regular tree language is (almost) the same as the set of derivation trees of some context-free language, one obtains results about context-free languages by "taking the bottom" of results about regular tree languages.

The second idea in tree language theory is to generalize the notion of a generalized sequential machine (that is, a finite automaton with output) to that of a finite state tree transducer. Tree transducers are more complicated than string transducers since they are equipped with the basic capabilities of copying, deleting and reordering (of subtrees). The part of (tree) language theory that is concerned with translation of languages is mainly motivated by compiler writing (and, to a lesser extent, by natural linguistics). When considering bottoms of trees, finite state transducers are essentially the same as syntax-directed translation schemes.
Results in this part of tree language theory treat the composition and decomposition of tree transformations, and the properties of those tree languages that can be obtained by finite state transformation of regular tree languages (or, taking bottoms, those languages that can be obtained by syntax-directed translation of context-free languages).

Thirdly there are, of course, many other ideas in tree language theory. In the literature one can find, for instance, context-free tree grammars, recognition of subsets of arbitrary algebras, tree walking automata, hierarchies of tree languages (obtained by iterating old ideas), decomposition of tree automata, Lindenmayer tree grammars, etc.

These lectures will be divided in the following five parts: (1) and (2) contain preliminaries, (3), (4) and (5) are the main parts.

(1) Introduction.
(2) Some basic definitions.
(3) Recognizable (= regular) tree languages.
(4) Finite state tree transformations.
(5) Whatever there is more to consider.

Part (5) is not contained in these notes; instead, some notes on the literature are given.

1. Introduction

Our basic data type is the kind of tree used to express the grammatical structure of strings in a context-free language.

(1.1) Example. Consider the context-free grammar $G = (N, \Sigma, R, S)$ with nonterminals $N = \{S, A, D\}$, terminals $\Sigma = \{a, b, d\}$, initial nonterminal $S$ and the set of rules $R$, consisting of the rules $S \to AD$, $A \to aAb$, $A \to bAa$, $A \to AA$, $A \to \lambda$, $D \to Ddd$ and $D \to d$ (we use $\lambda$ to denote the empty string). The word $baabddd \in \Sigma^*$ can be generated by $G$ and has the following derivation tree (see [Sal, II.6], [A&U, 0.5 and 2.4.1]):

[Figure: derivation tree of baabddd with root S; in the bracket notation of Example 2.7 this is the tree S[A[A[bA[e]a]A[aA[e]b]]D[D[d]dd]].]

Note that we use $e$ as a symbol standing for the empty word $\lambda$.
The string baabddd is called the "yield" or "result" of the derivation tree.

Thus, in graph terminology, our trees are finite (finite number of nodes and branches), directed (the branches are "growing downwards"), rooted (there is one node, the root, with no branches entering it), ordered (the branches leaving a node are ordered from left to right) and labeled (the nodes are labeled with symbols from some alphabet).

The following intuitive terminology will be used:
- the rank (or out-degree) of a node is the number of branches leaving it (note that the in-degree of a node is always 1, except for the root, which has in-degree 0);
- a leaf is a node with rank 0;
- the top of a tree is its root;
- the bottom (or frontier) of a tree is the set (or sequence) of its leaves;
- the yield (or result, or frontier) of a tree is the string obtained by writing the labels of its leaves (except the label $e$) from left to right;
- a path through a tree is a sequence of nodes connected by branches ("leading downwards"); the length of the path is the number of its nodes minus one (that is, the number of its branches);
- the height (or depth) of a tree is the length of the longest path from the top to the bottom;
- if there is a path of length $\ge 1$ (of length $= 1$) from node $a$ to node $b$, then $b$ is a descendant (direct descendant) of $a$ and $a$ is an ancestor (direct ancestor) of $b$;
- a sub-tree of a tree is a tree determined by a node together with all its descendants; a direct sub-tree is a sub-tree determined by a direct descendant of the root of the tree; note that each tree is uniquely determined by the label of its root and the (possibly empty) sequence of its direct sub-trees;
- the phrases "bottom-up", "bottom-to-top" and "frontier-to-root" are used to indicate the upward direction, while the phrases "top-down", "top-to-bottom" and "root-to-frontier" are used to indicate the downward direction.

In derivation trees of context-free grammars each symbol may only label nodes of certain ranks.
For instance, in the above example, $a$, $b$, $d$ and $e$ may only label leaves (nodes of rank 0), $A$ labels nodes with ranks 1, 2 and 3, $S$ labels nodes with rank 2 and $D$ nodes of rank 1 and 3 (these numbers being the lengths of the right hand sides of rules). Therefore, given some alphabet, we require the specification of a finite number of ranks for each symbol in the alphabet, and we restrict attention to those trees in which nodes of rank $k$ are labeled by symbols of rank $k$.

2. Some basic definitions.

The mathematical definition of a tree may be given in several, equivalent, ways. We will define a tree as a special kind of string (others call this string a representation of the tree, see [A&U, 0.5.7]). Before doing so, let us define ranked alphabets.

(2.1) Definition. An alphabet $\Sigma$ is said to be ranked if for each nonnegative integer $k$ a subset $\Sigma_k$ of $\Sigma$ is specified, such that $\Sigma_k$ is nonempty for a finite number of $k$'s only, and such that $\Sigma = \bigcup_{k \ge 0} \Sigma_k$ (†). Moreover, we shall require that, for all $k \ge 1$, $\Sigma_k \cap \Sigma_0 = \emptyset$ (this is not an essential restriction). If $a \in \Sigma_k$, then we say that $a$ has rank $k$ (note that $a$ may have more than one rank). Usually we define a specific ranked alphabet $\Sigma$ by specifying those $\Sigma_k$ that are nonempty.

(†) To be more precise one should define a ranked alphabet as a pair $(\Sigma, f)$, where $\Sigma$ is an alphabet and $f$ is a mapping from $\mathbb{N}$ into $\mathcal{P}(\Sigma)$ such that $\exists n\, \forall k \ge n: f(k) = \emptyset$, and then denote $f(k)$ by $\Sigma_k$ and $(\Sigma, f)$ by $\Sigma$.

(2.2) Example. The alphabet $\Sigma = \{a, b, +, -\}$ is made into a ranked alphabet by specifying $\Sigma_0 = \{a, b\}$, $\Sigma_1 = \{-\}$ and $\Sigma_2 = \{+, -\}$. (Think of negation and subtraction.)

(2.3) Remark. Throughout our discussions we shall use the symbol $e$ as a special symbol, intuitively representing $\lambda$. Wherever $e$ belongs to a ranked alphabet, it is of rank 0.

Operations on ranked alphabets should be defined as, for instance, in the following definition.

(2.4) Definition. Let $\Sigma$ and $\Delta$ be ranked alphabets. The union of $\Sigma$ and $\Delta$, denoted by $\Sigma \cup \Delta$, is defined by $(\Sigma \cup \Delta)_k = \Sigma_k \cup \Delta_k$, for all $k \ge 0$.
We say that $\Sigma$ and $\Delta$ are equal, denoted by $\Sigma = \Delta$, if, for all $k \ge 0$, $\Sigma_k = \Delta_k$.

We now define the notion of tree. Let "[" and "]" be two symbols which are never elements of a ranked alphabet.

(2.5) Definition. Given a ranked alphabet $\Sigma$, the set of trees over $\Sigma$, denoted by $T_\Sigma$, is the language over the alphabet $\Sigma \cup \{[, ]\}$ defined inductively as follows.
(i) If $a \in \Sigma_0$, then $a \in T_\Sigma$.
(ii) For $k \ge 1$, if $a \in \Sigma_k$ and $t_1, t_2, \dots, t_k \in T_\Sigma$, then $a[t_1 t_2 \cdots t_k] \in T_\Sigma$.

Intuitively, $a$ is a tree with one node labeled "a", and $a[t_1 t_2 \cdots t_k]$ is the tree

[Figure: a root labeled a with the direct sub-trees t_1, t_2, ..., t_k below it, from left to right.]

(2.6) Example. Consider the ranked alphabet of Example 2.2. Then $+[-[a-[b]]a]$ is a tree over this alphabet, intuitively "representing" the tree

[Figure: root + with two sub-trees; the left one has root - with sub-trees a and -[b], the right one is the leaf a.]

which in its turn "represents" the expression $(a - (-b)) + a$ (note that the "official" tree is the prefix notation of this expression).

(2.7) Example. Consider the ranked alphabet $\Delta$, where $\Delta_0 = \{a, b, d, e\}$, $\Delta_1 = \Delta_3 = \{A, D\}$ and $\Delta_2 = \{A, S\}$. A picture of the tree $S[A[A[bA[e]a]A[aA[e]b]]D[D[d]dd]]$ in $T_\Delta$ is given in Example 1.1.

(2.8) Exercise. Take some ranked alphabet $\Sigma$ and show that $T_\Sigma$ is a context-free language over $\Sigma \cup \{[, ]\}$.

Our aim will be to study several ways of constructively representing sets of trees and relations between trees. The basic terminology is the following.

(2.9) Definition. Let $\Sigma$ be a ranked alphabet. A tree language over $\Sigma$ is any subset of $T_\Sigma$.

(2.10) Definition. Let $\Sigma$ and $\Delta$ be ranked alphabets. A tree transformation from $T_\Sigma$ into $T_\Delta$ is any subset of $T_\Sigma \times T_\Delta$.

(2.11) Exercise. Show that the context-free grammar $G = (N, \Sigma, R, S)$ with $N = \{S\}$, $\Sigma = \{a, b, [, ]\}$ and $R = \{S \to b[aS],\ S \to a\}$ generates a tree language over $\Delta$, where $\Delta_0 = \{a\}$ and $\Delta_2 = \{b\}$.

The above definition of "tree" (Definition 2.5) gives rise to the following principles of proof by induction and definition by induction for trees. (Note that each tree is, uniquely, either in $\Sigma_0$ or of the form $a[t_1 \cdots t_k]$.)

(2.12) Principle of proof by induction (or recursion) on trees. Let $P$ be a property of trees (over $\Sigma$).
If
(i) all elements of $\Sigma_0$ have property $P$, and
(ii) for each $k \ge 1$ and each $a \in \Sigma_k$, if $t_1, \dots, t_k$ have property $P$, then $a[t_1 \cdots t_k]$ has property $P$,
then all trees in $T_\Sigma$ have property $P$.

(2.13) Principle of definition by induction (or recursion) on trees. Suppose we want to associate a value $h(t)$ with each tree $t$ in $T_\Sigma$. Then it suffices to define $h(a)$ for all $a \in \Sigma_0$, and to show how to compute the value $h(a[t_1 \cdots t_k])$ from the values $h(t_1), \dots, h(t_k)$. More formally expressed: given a set $O$ of objects, and
(i) for each $a \in \Sigma_0$, an object $o_a \in O$, and
(ii) for each $k \ge 1$ and each $a \in \Sigma_k$, a mapping $f_a^k : O^k \to O$,
there is exactly one mapping $h : T_\Sigma \to O$ such that
(i) $h(a) = o_a$ for all $a \in \Sigma_0$, and
(ii) $h(a[t_1 \cdots t_k]) = f_a^k(h(t_1), \dots, h(t_k))$ for all $k \ge 1$, $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$.

(2.14) Example. Let $\Sigma_0 = \{e\}$ and $\Sigma_1 = \{1\}$. The trees in $T_\Sigma$ are in an obvious one-to-one correspondence with the natural numbers. The above principles are the usual induction principles for these numbers.

To illustrate the use of the induction principles we give the following useful definitions.

(2.15) Definition. The mapping yield from $T_\Sigma$ into $\Sigma_0^*$ is defined inductively as follows.
(i) For $a \in \Sigma_0$, $\mathrm{yield}(a) = a$ if $a \ne e$, and $\mathrm{yield}(a) = \lambda$ if $a = e$.
(ii) For $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $\mathrm{yield}(a[t_1 \cdots t_k]) = \mathrm{yield}(t_1) \cdot \mathrm{yield}(t_2) \cdots \mathrm{yield}(t_k)$, that is, the concatenation of $\mathrm{yield}(t_1), \dots, \mathrm{yield}(t_k)$.
Moreover, for a tree language $L \subseteq T_\Sigma$, we define $\mathrm{yield}(L) = \{\mathrm{yield}(t) \mid t \in L\}$. We shall sometimes abbreviate "yield" by "y".

(2.16) Definition. The mapping height from $T_\Sigma$ into $\mathbb{N}$ is defined recursively as follows.
(i) For $a \in \Sigma_0$, $\mathrm{height}(a) = 0$.
(ii) For $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $\mathrm{height}(a[t_1 \cdots t_k]) = \max_{1 \le i \le k} \mathrm{height}(t_i) + 1$.

(2.17) Example. As an example of a proof by induction on trees we show that, if $e \notin \Sigma_0$ and $\Sigma_1 = \emptyset$, then, for all $t \in T_\Sigma$, $\mathrm{height}(t) < |\mathrm{yield}(t)|$.

Proof. For $a \in \Sigma_0$, $\mathrm{height}(a) = 0$ and $|\mathrm{yield}(a)| = |a| = 1$ (since $a \ne e$). Now let $a \in \Sigma_k$ ($k \ge 2$) and assume (induction hypothesis) that $\mathrm{height}(t_i) < |\mathrm{yield}(t_i)|$ for $1 \le i \le k$. Then

$|\mathrm{yield}(a[t_1 \cdots t_k])| = \sum_{i=1}^{k} |\mathrm{yield}(t_i)|$ (Def. 2.15(ii))
$\ge \sum_{i=1}^{k} (\mathrm{height}(t_i) + 1)$ (ind. hypothesis)
$\ge \max_{1 \le i \le k} \mathrm{height}(t_i) + 2$ ($k \ge 2$ and height $\ge 0$)
$> \mathrm{height}(a[t_1 \cdots t_k])$ (Def. 2.16(ii)).

(2.18) Exercise. Define a (string) homomorphism $h$ from $(\Sigma \cup \{[, ]\})^*$ into $\Sigma_0^*$ such that, for all $t \in T_\Sigma$, $h(t) = \mathrm{yield}(t)$.

(2.19) Exercise. Give a recursive definition of the notion of "sub-tree", for instance as a mapping $\mathrm{subtree} : T_\Sigma \to \mathcal{P}(T_\Sigma)$ such that $\mathrm{subtree}(t)$ is the set of all subtrees of $t$. Give also an alternative definition of "sub-tree" in a more string-like fashion.

(2.20) Exercise. Let $\mathrm{path}(t)$ denote the set of all paths from the top of $t$ to its bottom. Think of a formal definition for "path".

The generalization of formal language theory to a formal tree language theory will come about by viewing a string as a special kind of tree and taking the obvious generalizations. To be able to view strings as trees we "turn them 90 degrees" to a vertical position, as follows.

(2.21) Definition. A ranked alphabet $\Sigma$ is monadic if
(i) $\Sigma_0 = \{e\}$, and
(ii) for $k \ge 2$, $\Sigma_k = \emptyset$.
The elements of $T_\Sigma$ are called monadic trees.

Thus a monadic ranked alphabet $\Sigma$ is fully determined by the alphabet $\Sigma_1$. Monadic trees obviously can be made to correspond to the strings in $\Sigma_1^*$. There are two ways to do this, depending on whether we read top-down or bottom-up:

$f_{td} : T_\Sigma \to \Sigma_1^*$ is defined by
(i) $f_{td}(e) = \lambda$,
(ii) $f_{td}(a[t]) = a \cdot f_{td}(t)$ for $a \in \Sigma_1$ and $t \in T_\Sigma$,

and $f_{bu} : T_\Sigma \to \Sigma_1^*$ is defined by
(i) $f_{bu}(e) = \lambda$,
(ii) $f_{bu}(a[t]) = f_{bu}(t) \cdot a$ for $a \in \Sigma_1$ and $t \in T_\Sigma$.

(Obviously both $f_{td}$ and $f_{bu}$ are bijections.) Accordingly, when generalizing a string concept to trees, we often have the choice between a top-down and a bottom-up generalization.

(2.22) Example. The string alphabet $\Delta = \{a, b, c\}$ corresponds to the monadic alphabet $\Sigma$ with $\Sigma_0 = \{e\}$ and $\Sigma_1 = \Delta$. The tree

[Figure: a vertical chain of nodes labeled a, b, c, b, e from top to bottom.]

in $T_\Sigma$ corresponds either to the string $abcb$ in $\Delta^*$ (top-down), or to the string $bcba$ in $\Delta^*$ (bottom-up). Note that, due to our "prefix definition" of trees (Definition 2.5), the above tree
looks "top-down like" in its official form $a[b[c[b[e]]]]$. Obviously this is not essential.

Let us consider some basic operations on trees. A basic operation on strings is right-concatenation with one symbol (that is, for each symbol $a$ in the alphabet there is an operation $rc_a$ such that, for each string $w$, $rc_a(w) = wa$). Every string can uniquely be built up from the empty string by these basic operations (consider the way you write and read!). Generalizing bottom-up, the corresponding basic operations on trees, here called "top concatenation", are the following.

(2.23) Definition. For each $a \in \Sigma_k$ ($k \ge 1$) we define the operation of top concatenation with $a$, denoted by $tc_a^k$, to be the mapping from $T_\Sigma^k$ into $T_\Sigma$ such that, for all $t_1, \dots, t_k \in T_\Sigma$, $tc_a^k(t_1, \dots, t_k) = a[t_1 \cdots t_k]$. Moreover, for tree languages $L_1, \dots, L_k$, we define $tc_a^k(L_1, \dots, L_k) = \{a[t_1 \cdots t_k] \mid t_i \in L_i \text{ for all } 1 \le i \le k\}$.

Note that every tree can uniquely be built up from the elements of $\Sigma_0$ by repeated top concatenation.

The next basic operation on strings is concatenation. When viewed monadically, concatenation corresponds to substituting one vertical string into the $e$ of the other vertical string. In the general case, we take one tree and substitute a tree into each leaf of the original tree, such that different trees may be substituted into leaves with different labels. Thus we obtain the following basic operation on trees.

(2.24) Definition. Let $n \ge 1$, $a_1, \dots, a_n \in \Sigma_0$ all different, and $s_1, \dots, s_n \in T_\Sigma$. For $t \in T_\Sigma$, the tree concatenation of $t$ with $s_1, \dots, s_n$ at $a_1, \dots, a_n$, denoted by $t\langle a_1 \leftarrow s_1, \dots, a_n \leftarrow s_n\rangle$, is defined recursively as follows.
(i) For $a \in \Sigma_0$, $a\langle a_1 \leftarrow s_1, \dots, a_n \leftarrow s_n\rangle = s_i$ if $a = a_i$, and $= a$ otherwise.
(ii) For $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $a[t_1 \cdots t_k]\langle\dots\rangle = a[t_1\langle\dots\rangle \cdots t_k\langle\dots\rangle]$, where $\langle\dots\rangle$ abbreviates $\langle a_1 \leftarrow s_1, \dots, a_n \leftarrow s_n\rangle$.
If, in particular, $n = 1$, then, for each $a \in \Sigma_0$ and $t, s \in T_\Sigma$, $t\langle a \leftarrow s\rangle$ is also denoted by $t \cdot_a s$.

(2.25) Example. Let $\Delta_0 = \{x, y, c\}$, $\Delta_2 = \{b\}$ and $\Delta_3 = \{a\}$. If $t = a[b[xy]xc]$, then $t\langle x \leftarrow b[cx],\ y \leftarrow c\rangle = a[b[b[cx]c]b[cx]c]$.

(2.26) Exercise. Check that in the monadic case tree concatenation corresponds to string concatenation.
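Tree concatenation (Definition 2.24) is a direct recursion on the structure of the tree, so it can be tried out mechanically. The following is a minimal sketch, not part of the original notes: it assumes one-character labels, represents a tree as a nested Python tuple, and the helper names `parse`, `show` and `concat` are made up for this illustration.

```python
# A tree is a nested tuple (label, child_1, ..., child_k); a leaf is (label,).
# Assumption of this sketch: every label is a single character.

def parse(s):
    """Read the string form a[t1...tk] of Definition 2.5 into a nested tuple."""
    def tree(i):
        label = s[i]
        i += 1
        children = []
        if i < len(s) and s[i] == "[":
            i += 1                      # skip "["
            while s[i] != "]":
                child, i = tree(i)
                children.append(child)
            i += 1                      # skip "]"
        return (label, *children), i

    t, end = tree(0)
    if end != len(s):
        raise ValueError("trailing input after the tree")
    return t


def show(t):
    """Write a nested tuple back in the string form a[t1...tk]."""
    label, *children = t
    if not children:
        return label
    return label + "[" + "".join(show(c) for c in children) + "]"


def concat(t, subst):
    """Tree concatenation t<a1 <- s1, ..., an <- sn> of Definition 2.24.

    subst maps rank-0 labels a_i to the trees s_i.
    """
    label, *children = t
    if not children:                    # clause (i): a leaf
        return subst.get(label, t)
    # clause (ii): substitute independently in every direct sub-tree
    return (label, *(concat(c, subst) for c in children))


# Example 2.25: t = a[b[xy]xc] with x <- b[cx] and y <- c.
example = concat(parse("a[b[xy]xc]"), {"x": parse("b[cx]"), "y": parse("c")})
print(show(example))
```

In the monadic case (Exercise 2.26) the same function concatenates vertical strings: `concat(parse("a[b[e]]"), {"e": parse("c[e]")})` yields the tree `a[b[c[e]]]`.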
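For tree languages tree concatenation is defined analogously.

(2.27) Definition. Let $n \ge 1$, $a_1, \dots, a_n \in \Sigma_0$ all different, and $L, L_1, \dots, L_n \subseteq T_\Sigma$. We define the tree concatenation of $L$ with $L_1, \dots, L_n$ at $a_1, \dots, a_n$, denoted by $L\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle$, as follows.
(i) For $a \in \Sigma_0$, $a\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle = L_i$ if $a = a_i$, and $= a$ otherwise (†).
(ii) For $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $a[t_1 \cdots t_k]\langle\dots\rangle = a[t_1\langle\dots\rangle \cdots t_k\langle\dots\rangle]$ (††).
(iii) For $L \subseteq T_\Sigma$, $L\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle = \bigcup_{t \in L} t\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle$.
If, in particular, $n = 1$, then, for each $a \in \Sigma_0$ and each $L_1, L_2 \subseteq T_\Sigma$, we denote $L_1\langle a \leftarrow L_2\rangle$ by $L_1 \cdot_a L_2$.

(†) As usual, given a word $w$, we use $w$ also to denote the language $\{w\}$.
(††) For tree languages $M_1, \dots, M_k$ we also write $a[M_1 \cdots M_k]$ to denote $tc_a^k(M_1, \dots, M_k)$. This notation is fully justified since $a[M_1 \cdots M_k]$ is the (string) concatenation of the languages $a$, $[$, $M_1, \dots, M_k$ and $]$.

(2.28) Remarks. (1) Obviously, if $L, L_1, \dots, L_n$ are singletons, then Definition 2.27 is the same as Definition 2.24. (2) Note that tree concatenation, as defined above, is "nondeterministic" in the sense that, for instance, to obtain $t\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle$, different elements of $L_1$ may be substituted at different occurrences of $a_1$ in $t$. "Deterministic" tree concatenation of $L$ with $L_1, \dots, L_n$ at $a_1, \dots, a_n$ could be defined as $\{t\langle a_1 \leftarrow s_1, \dots, a_n \leftarrow s_n\rangle \mid t \in L \text{ and } s_i \in L_i \text{ for all } 1 \le i \le n\}$.

[...]

(i) for a symbol $a$, $a\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle = L_i$ if $a = a_i$, and $= a$ otherwise;
(ii) for a string $w$ and a symbol $a$, $wa\langle\dots\rangle = w\langle\dots\rangle \cdot a\langle\dots\rangle$;
(iii) for a language $L$, $L\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle = \bigcup_{w \in L} w\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle$.
If $L_1, \dots, L_n$ are singletons, then the substitution is called a homomorphism.

(2.30) Exercise. Let $n \ge 1$, $a_1, \dots, a_n \in \Sigma_0$ all different, $a_i \ne e$ for all $1 \le i \le n$, and $L, L_1, \dots, L_n \subseteq T_\Sigma$. Prove that $\mathrm{yield}(L\langle a_1 \leftarrow L_1, \dots, a_n \leftarrow L_n\rangle) = \mathrm{yield}(L)\langle a_1 \leftarrow \mathrm{yield}(L_1), \dots, a_n \leftarrow \mathrm{yield}(L_n)\rangle$.

(2.31) Exercise. [...] where $a_1, \dots, a_n \in \Sigma_0$ and $L, L_1, \dots, L_n$ are tree languages over $\Sigma$ (and thus, string languages over $\Sigma \cup \{[, ]\}$).

(2.32) Exercise. Define the notion of associativity for tree concatenation, and show that tree concatenation is associative. Show that, in general, "deterministic tree concatenation" is not associative (cf. Remark 2.28(2)).

We shall need the following special case of tree concatenation.

(2.33) Definition.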
Let $\Sigma$ be a ranked alphabet and let $S$ be a set of symbols or a tree language. Then the set of trees indexed by $S$, denoted by $T_\Sigma(S)$, is defined inductively as follows.
(i) $S \cup \Sigma_0 \subseteq T_\Sigma(S)$.
(ii) If $k \ge 1$, $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma(S)$, then $a[t_1 \cdots t_k] \in T_\Sigma(S)$.
Note that $T_\Sigma(\emptyset) = T_\Sigma$.

Thus, if $S$ is a set of symbols, then $T_\Sigma(S) = T_{\Sigma \cup S}$, where the elements of $S$ are assumed to have rank 0. If $S$ is a tree language over a ranked alphabet $\Delta$, then $T_\Sigma(S)$ is a tree language over the ranked alphabet $\Sigma \cup \Delta$.

(2.34) Exercise. Show that, for any $a \in \Sigma_0$, $T_\Sigma(S) = T_\Sigma \cdot_a (S \cup \{a\})$.

We close this section with two general remarks.

(2.35) Remark. Definition 2.5 of a tree is of course rather arbitrary. Other, equally useful, ways of defining trees as a special kind of strings are obtained by replacing $a[t_1 \cdots t_k]$ in Definition 2.5 by $[t_1 \cdots t_k]a$, or $a\,t_1 \cdots t_k]$, or $[a\,t_1 \cdots t_k]$, or $a\,t_1 \cdots t_k$ (only in the case that each symbol has exactly one rank), or $[_a\, t_1 \cdots t_k]$ (where $[_a$ is a new symbol for each $a$), or $a[t_1, t_2, \dots, t_k]$ (where "," is a new symbol).

(2.36) Remark on the general philosophy in tree language theory. The general philosophy looks like this:
(1) take vertical string language theory (cf. Definition 2.21),
(2) generalize it to tree language theory, and
(3) map this into horizontal string language theory via the yield operation (Definition 2.15).
The fourth part of the philosophy is
(4) tree language theory is a specific part of string language theory.

[Figure: diagram illustrating the four parts of the philosophy.]

Example:
(1) (vertical) string concatenation,
(2) tree concatenation,
(3) (horizontal) string substitution (see Exercise 2.30),
(4) (2) is a special case of (3) (see Exercise 2.31).

3. Recognizable tree languages.

3.1. Finite tree automata and regular tree grammars

Let us first consider the usual finite automaton on strings.
A deterministic finite automaton is a structure $M = (Q, \Sigma, \delta, q_0, F)$, where $Q$ is the set of states, $\Sigma$ the input alphabet, $q_0$ is the initial state, $F$ is the set of final states and $\delta$ is a family $\{\delta_a\}_{a \in \Sigma}$, where $\delta_a : Q \to Q$ is the transition function for the input $a$.

There are several ways to describe the functioning of $M$ and the language it recognizes. One of them (see for instance [Sal, I.4]) is to describe explicitly the sequence of steps taken by the automaton while processing some input string. This point of view will be considered in Part (4). Another way is to give a recursive definition of the effect of an input string on the state of $M$. Since a recursive definition is in particular suitable for generalization to trees, let us consider one in detail. We define a function $\hat\delta : \Sigma^* \to Q$ such that, for $w \in \Sigma^*$, $\hat\delta(w)$ is intuitively the state $M$ reaches after processing $w$, starting from the initial state $q_0$:
(i) $\hat\delta(\lambda) = q_0$,
(ii) for $w \in \Sigma^*$ and $a \in \Sigma$, $\hat\delta(wa) = \delta_a(\hat\delta(w))$.
The language recognized by $M$ is $L(M) = \{w \in \Sigma^* \mid \hat\delta(w) \in F\}$.

When considering this definition of $\hat\delta$ for "bottom-up" monadic trees (see Definition 2.21), one easily arrives at the following generalization to the tree case. There should be a start state for each element of $\Sigma_0$. The finite tree automaton starts at all leaves ("at the same time", "in parallel") and processes the tree in a bottom-up fashion. The automaton arrives at each node of rank $k$ with a sequence of $k$ states (one state for each direct subtree of the node), and the transition function $\delta_a^k$ of the label $a$ of that node is a mapping $\delta_a^k : Q^k \to Q$ which, from that sequence of $k$ states, determines the state at that node. A tree is recognized iff the tree automaton is in a final state at the root of the tree. Formally:

(3.1) Definition. A deterministic bottom-up finite tree automaton is a structure $M = (Q, \Sigma, \delta, s, F)$, where $Q$ is a finite set (of states), $\Sigma$ is a ranked alphabet (of input symbols), $\delta$ is a family $\{\delta_a^k\}_{k \ge 1,\, a \in \Sigma_k}$ of mappings
$\delta_a^k : Q^k \to Q$ (the transition function for $a \in \Sigma_k$), $s$ is a family $\{s_a\}_{a \in \Sigma_0}$ of states $s_a \in Q$ (the initial state for $a \in \Sigma_0$), and $F$ is a subset of $Q$ (the set of final states). The mapping $\hat\delta : T_\Sigma \to Q$ is defined recursively as follows:
(i) for $a \in \Sigma_0$, $\hat\delta(a) = s_a$,
(ii) for $k \ge 1$, $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $\hat\delta(a[t_1 \cdots t_k]) = \delta_a^k(\hat\delta(t_1), \dots, \hat\delta(t_k))$.
The tree language recognized by $M$, denoted by $L(M)$, is defined to be $\{t \in T_\Sigma \mid \hat\delta(t) \in F\}$.

Intuitively, $\hat\delta(t)$ is the state reached by $M$ after bottom-up processing of $t$. For convenience, when $k$ is understood, we shall write $\delta_a$ rather than $\delta_a^k$. Note therefore that each symbol $a \in \Sigma$ may have several transition functions $\delta_a$ (one for each of its ranks). We shall abbreviate "finite tree automaton" by "fta", and "deterministic" by "det.".

(3.2) Definition. A tree language $L$ is called recognizable (or regular) if $L = L(M)$ for some det. bottom-up fta $M$. The class of recognizable tree languages will be denoted by RECOG.

(3.3) Example. Consider the det. bottom-up fta $M = (Q, \Sigma, \delta, s, F)$, where $Q = \{0, 1, 2, 3\}$, $\Sigma_0 = \{0, 1, 2, \dots, 9\}$, $\Sigma_2 = \{+, *\}$, $s_a = a \pmod 4$, $F = \{1\}$, and $\delta_+$ and $\delta_*$ (both mappings $Q^2 \to Q$) are addition modulo 4 and multiplication modulo 4 respectively. Then $M$ recognizes the set of all "expressions" whose value modulo 4 is 1. Consider for instance the expression $+[+[07]*[2*[73]]]$, the prefix form of $(0+7) + (2*(7*3))$. The state of $M$ at each node of the tree is indicated between parentheses in the following picture:

[Figure: the tree of the expression with the state at each node; in bracket form, +(1)[ +(3)[ 0(0) 7(3) ] *(2)[ 2(2) *(1)[ 7(3) 3(3) ] ] ].]

(3.4) Example. Let $\Sigma_0 = \{a\}$ and $\Sigma_2 = \{b\}$. Consider the language of all trees in $T_\Sigma$ which have a "right comb-like" structure, like for instance the tree $b[ab[ab[ab[aa]]]]$. This tree language is recognized by the det. bottom-up fta $M = (Q, \Sigma, \delta, s, F)$, where $Q = \{A, C, W\}$, $s_a = A$, $F = \{C\}$ and $\delta_b$ is defined by $\delta_b(A, A) = \delta_b(A, C) = C$ and $\delta_b(q_1, q_2) = W$ for all other pairs of states $(q_1, q_2)$.

(3.5) Exercise. Let $\Sigma_0 = \{a, b\}$, $\Sigma_1 = \{p\}$ and $\Sigma_2 = \{p, q\}$. Construct det.
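bottom-up finite tree automata recognizing the following tree languages:
(i) the language of all trees $t$ such that, if a node of $t$ is labeled $q$, then its descendants are labeled $q$ or $a$;
(ii) the set of all trees $t$ such that $\mathrm{yield}(t) \in a^+b^+$;
(iii) the set of all trees $t$ such that the total number of $p$'s occurring in $t$ is odd.

A (theoretically) convenient extension of the deterministic finite automaton is to make it nondeterministic. A nondeterministic finite automaton (on strings) is a structure $M = (Q, \Sigma, \delta, S, F)$, where $Q$, $\Sigma$ and $F$ are as in the deterministic case, $S$ is a set of initial states, and, for each $a \in \Sigma$, $\delta_a$ is a mapping $Q \to \mathcal{P}(Q)$ (intuitively, $\delta_a(q)$ is the set of states which $M$ can possibly, nondeterministically, enter when reading $a$ in state $q$). Again a mapping $\hat\delta$, now from $\Sigma^*$ into $\mathcal{P}(Q)$, can be defined such that, for $w \in \Sigma^*$, $\hat\delta(w)$ is the set of states $M$ can possibly reach after processing $w$, having started from one of the initial states in $S$:
(i) $\hat\delta(\lambda) = S$,
(ii) for $w \in \Sigma^*$ and $a \in \Sigma$, $\hat\delta(wa) = \bigcup \{\delta_a(q) \mid q \in \hat\delta(w)\}$.
The language recognized by $M$ is $L(M) = \{w \in \Sigma^* \mid \hat\delta(w) \cap F \ne \emptyset\}$. Generalizing to trees we obtain the following definition.

(3.6) Definition. A nondeterministic bottom-up finite tree automaton is a 5-tuple $M = (Q, \Sigma, \delta, S, F)$, where $Q$, $\Sigma$ and $F$ are as in the deterministic case, $S$ is a family $\{S_a\}_{a \in \Sigma_0}$ such that $S_a \subseteq Q$ for each $a \in \Sigma_0$, and $\delta$ is a family $\{\delta_a^k\}_{k \ge 1,\, a \in \Sigma_k}$ of mappings $\delta_a^k : Q^k \to \mathcal{P}(Q)$. The mapping $\hat\delta : T_\Sigma \to \mathcal{P}(Q)$ is defined recursively by
(i) for $a \in \Sigma_0$, $\hat\delta(a) = S_a$,
(ii) for $k \ge 1$, $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $\hat\delta(a[t_1 \cdots t_k]) = \bigcup \{\delta_a^k(q_1, \dots, q_k) \mid q_i \in \hat\delta(t_i) \text{ for } 1 \le i \le k\}$.
The tree language recognized by $M$, denoted by $L(M)$, is $L(M) = \{t \in T_\Sigma \mid \hat\delta(t) \cap F \ne \emptyset\}$.

Note that, for $q \in Q^k$, $\delta_a^k(q)$ may be empty.

(3.7) Example. Let $\Sigma_0 = \{p\}$ and $\Sigma_2 = \{a, b\}$. Consider the following tree language over $\Sigma$:

$L = \{u_1\, a[a[s_1 s_2] a[t_1 t_2]]\, u_2 \in T_\Sigma \mid u_1, u_2 \in (\Sigma \cup \{[, ]\})^*,\ s_1, s_2, t_1, t_2 \in T_\Sigma\} \cup \{u_1\, b[b[s_1 s_2] b[t_1 t_2]]\, u_2 \in T_\Sigma \mid u_1, u_2 \in (\Sigma \cup \{[, ]\})^*,\ s_1, s_2, t_1, t_2 \in T_\Sigma\}$.

In other words, $L$ is the set of all trees containing a configuration in which a node labeled $a$ has two direct descendants labeled $a$, or a configuration in which a node labeled $b$ has two direct descendants labeled $b$. $L$ is recognized by the nondet.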
bottom-up fta $M = (Q, \Sigma, \delta, S, F)$, where $Q = \{q_s, q_a, q_b, r\}$, $S_p = \{q_s\}$, $F = \{r\}$ and
$\delta_a(q_s, q_s) = \{q_s, q_a\}$,
$\delta_b(q_s, q_s) = \{q_s, q_b\}$,
$\delta_a(q_a, q_a) = \delta_b(q_b, q_b) = \{r\}$,
for all $q \in Q$: $\delta_a(q, r) = \delta_a(r, q) = \delta_b(q, r) = \delta_b(r, q) = \{r\}$,
and $\delta_x(q_1, q_2) = \emptyset$ for all other possibilities.

It is rather obvious in the last example that we can find a deterministic bottom-up fta recognizing the same language (find it!). We now show that this is possible in general (as in the case of strings).

(3.8) Theorem. For each nondeterministic bottom-up fta we can find a deterministic one recognizing the same language.

Proof. The proof uses the "subset construction", well known from the string case. Let $M = (Q, \Sigma, \delta, S, F)$ be a nondet. bottom-up fta. Construct the det. bottom-up fta $M_1 = (\mathcal{P}(Q), \Sigma, \delta_1, s_1, F_1)$ such that, for all $a \in \Sigma_0$, $(s_1)_a = S_a$, $F_1 = \{Q_1 \in \mathcal{P}(Q) \mid Q_1 \cap F \ne \emptyset\}$, and, for $a \in \Sigma_k$ and $Q_1, \dots, Q_k \subseteq Q$,
$(\delta_1)_a^k(Q_1, \dots, Q_k) = \bigcup \{\delta_a^k(q_1, \dots, q_k) \mid q_i \in Q_i \text{ for all } 1 \le i \le k\}$.
It is straightforward to show, using Definitions 3.1 and 3.6, that, for all $t \in T_\Sigma$, $\hat\delta_1(t) = \hat\delta(t)$ (proof by induction on $t$). From this it follows that $L(M_1) = \{t \mid \hat\delta_1(t) \in F_1\} = \{t \mid \hat\delta(t) \cap F \ne \emptyset\} = L(M)$.

(3.9) Exercise. Check the proof of Theorem 3.8. Construct the det. bottom-up fta corresponding to the fta $M$ of Example 3.7 according to that proof, and compare this det. fta with the one you found before.

Let us now consider the top-down generalization of the finite automaton. Let $M = (Q, \Sigma, \delta, q_0, F)$ be a det. finite automaton. Another way to define $L(M)$ is by giving a recursive definition of a mapping $\tilde\delta : \Sigma^* \to \mathcal{P}(Q)$ such that, intuitively, for each $w \in \Sigma^*$, $\tilde\delta(w)$ is the set of states $q$ such that the machine $M$, when started in state $q$, enters a final state after processing $w$. The definition of $\tilde\delta$ is as follows:
(i) $\tilde\delta(\lambda) = F$,
(ii) for $w \in \Sigma^*$ and $a \in \Sigma$, $\tilde\delta(aw) = \{q \mid \delta_a(q) \in \tilde\delta(w)\}$
(the last line may be read as: to check whether, starting in $q$, $M$ recognizes $aw$, compute $q_1 = \delta_a(q)$ and check whether $M$ recognizes $w$ when starting in $q_1$). The language recognized by $M$ is $L(M) = \{w \in \Sigma^* \mid q_0 \in \tilde\delta(w)\}$.
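The two recursive descriptions of a deterministic finite automaton, the forward state reached from $q_0$ and the backward set of states from which a final state is reached, can be compared on a small machine. The sketch below is only an illustration: the two-state automaton (accepting strings with an even number of a's) and all Python names are assumptions made here, not taken from the notes.

```python
from itertools import product


def forward_state(delta, q0, w):
    """Forward reading: the state reached from q0 after processing w."""
    q = q0
    for a in w:
        q = delta[a][q]
    return q


def backward_states(states, delta, final, w):
    """Backward reading: all states q from which M ends in a final state on w."""
    if w == "":
        return set(final)               # clause (i): the empty string gives F
    a, rest = w[0], w[1:]
    good = backward_states(states, delta, final, rest)
    # clause (ii): q qualifies iff its a-successor qualifies for the rest
    return {q for q in states if delta[a][q] in good}


# Hypothetical example automaton: accepts strings with an even number of a's.
STATES = {0, 1}
DELTA = {"a": {0: 1, 1: 0}, "b": {0: 0, 1: 1}}
Q0, FINAL = 0, {0}


def definitions_agree(max_len):
    """Check that both definitions accept the same strings up to max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            fwd = forward_state(DELTA, Q0, w) in FINAL
            bwd = Q0 in backward_states(STATES, DELTA, FINAL, w)
            if fwd != bwd:
                return False
    return True


print(definitions_agree(5))
```

The backward function consumes the string from the left, which is why it is the version that matches reading a monadic tree from the root downwards.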
This definition, applied to "top-down" monadic trees, leads to the following generalization to arbitrary trees. The finite tree automaton starts at the root of the tree in the initial state, and processes the tree in a top-down fashion. The automaton arrives at each node in one state, and the transition function $\delta_a^k$ of the label $a$ of that node is a mapping $\delta_a^k : Q \to Q^k$ (where $k$ is the rank of the node) which, from that state, determines the state in which to continue for each direct descendant of the node (the automaton "splits up" into $k$ independent copies, one copy for each direct subtree of the node). Finally the automaton arrives at all leaves of the tree. There should be a set of final states for each element of $\Sigma_0$. The tree is recognized if the fta arrives at each leaf in a state which is final for the label of that leaf. Formally:

(3.10) Definition. A deterministic top-down finite tree automaton is a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$, where $Q$ is a finite set (of states), $\Sigma$ is a ranked alphabet (of input symbols), $\delta$ is a family $\{\delta_a^k\}_{k \ge 1,\, a \in \Sigma_k}$ of mappings $\delta_a^k : Q \to Q^k$ (the transition function for $a \in \Sigma_k$), $q_0$ is in $Q$ (the initial state), and $F$ is a family $\{F_a\}_{a \in \Sigma_0}$ of sets $F_a \subseteq Q$ (the set of final states for $a \in \Sigma_0$). The mapping $\tilde\delta : T_\Sigma \to \mathcal{P}(Q)$ is defined recursively by
(i) for $a \in \Sigma_0$, $\tilde\delta(a) = F_a$,
(ii) for $k \ge 1$, $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $\tilde\delta(a[t_1 \cdots t_k]) = \{q \mid \delta_a^k(q) \in \tilde\delta(t_1) \times \dots \times \tilde\delta(t_k)\}$.
The tree language recognized by $M$, denoted by $L(M)$, is defined to be $\{t \in T_\Sigma \mid q_0 \in \tilde\delta(t)\}$.

Intuitively, $\tilde\delta(t)$ is the set of states $q$ such that $M$, when starting at the root of $t$ in state $q$, arrives

[...]

$\to \mathcal{P}(Q)$ is defined recursively as follows.
(i) For $a \in \Sigma_0$, $\tilde\delta(a) = F_a$.
(ii) For $k \ge 1$, $a \in \Sigma_k$ and $t_1, \dots, t_k \in T_\Sigma$, $\tilde\delta(a[t_1 \cdots t_k]) = \{q \mid \exists (q_1, \dots, q_k) \in \delta_a^k(q): q_i \in \tilde\delta(t_i) \text{ for all } 1 \le i \le k\}$.
The tree language recognized by $M$, denoted by $L(M)$, is $L(M) = \{t \in T_\Sigma \mid \tilde\delta(t) \cap S \ne \emptyset\}$.

We now show that, nondeterministically, there is no difference between bottom-up and top-down recognition.

(3.17) Theorem. A tree language is recognizable by a nondet. bottom-up fta iff it is recognizable by a nondet. top-down fta.
Proof. Let us say that a nondet. bottom-up fta $M = (Q, \Sigma, \delta, S, F)$ and a nondet. top-down fta $N = (P, \Delta, \mu, R, G)$ are "associated" if the following requirements are satisfied:
(i) $Q = P$, $\Sigma = \Delta$, $F = R$ and, for all $a \in \Sigma_0$, $S_a = G_a$;
(ii) for all $k \ge 1$, $a \in \Sigma_k$ and $q_1, \dots, q_k, q \in Q$, $q \in \delta_a^k(q_1, \dots, q_k)$ iff $(q_1, \dots, q_k) \in \mu_a^k(q)$.
In that case, one can easily prove by induction that $\hat\delta = \tilde\mu$, and so $L(M) = L(N)$. Since obviously for each nondet. bottom-up fta there is an associated nondet. top-down fta, and vice versa, the theorem holds.

Thus the classes of tree languages recognized by the nondet. bottom-up, det. bottom-up and nondet. top-down fta are all equal (and are called RECOG), whereas the class of tree languages recognized by the det. top-down fta is a proper subclass of RECOG.

The next victim of generalization is the regular grammar (right-linear, type-3 grammar). In this case it seems appropriate to take the top-down point of view only. Consider an ordinary regular grammar $G = (N, \Sigma, R, S)$. All rules have either the form $A \to wB$ or the form $A \to w$, where $A, B \in N$ and $w \in \Sigma^*$. Monadically, the string $wB$ may be considered as the result of tree concatenating the tree $we$ with $B$ at $e$, where $B$ is of rank 0. Thus we can take the generalization of strings of the form $wB$ or $w$ to be trees in $T_\Sigma(N)$, where $\Sigma$ is a ranked alphabet (for the definition of $T_\Sigma(N)$, see Definition 2.33). Thus, let us consider a "tree grammar" with rules of the form $A \to t$, where $A \in N$ and $t \in T_\Sigma(N)$. Obviously, the application of a rule $A \to t$ to a tree $s \in T_\Sigma(N)$ should intuitively consist of replacing one occurrence of $A$ in $s$ by the tree $t$. Starting with the initial nonterminal, nonterminals at the frontier of the tree are then repeatedly replaced by right hand sides of rules, until the tree does not contain nonterminals any more. Now, since trees are defined as strings, it turns out that this process is precisely the way a context-free grammar works. Thus we arrive at the following formal definition.

(3.18) Definition.
A regular tree grammar is a tuple $G = (N, \Sigma, R, S)$, where $N$ is a finite set (of nonterminals), $\Sigma$ is a ranked alphabet (of terminals) such that $\Sigma \cap N = \emptyset$, $S \in N$ is the initial nonterminal, and $R$ is a finite set of rules of the form $A \to t$ with $A \in N$ and $t \in T_\Sigma(N)$. The tree language generated by $G$, denoted by $L(G)$, is defined to be $L(H)$, where $H$ is the context-free grammar $(N, \Sigma \cup \{[, ]\}, R, S)$. We shall use $\Rightarrow_G$ and $\Rightarrow_G^*$ (or $\Rightarrow$ and $\Rightarrow^*$ when $G$ is understood) to denote the restrictions of $\Rightarrow_H$ and $\Rightarrow_H^*$ to $T_\Sigma(N)$.

(3.19) Example. Let $\Sigma_0 = \{a, b, c, d, e\}$, $\Sigma_2 = \{p\}$ and $\Sigma_3 = \{p, q\}$. Consider the regular tree grammar $G = (N, \Sigma, R, S)$, where $N = \{S, T\}$ and $R$ consists of the rules $S \to p[aTa]$, $T \to q[cp[dT]b]$ and $T \to e$. Then $G$ generates the tree $p[aq[cp[de]b]a]$ as follows:

$S \Rightarrow p[aTa] \Rightarrow p[aq[cp[dT]b]a] \Rightarrow p[aq[cp[de]b]a]$

or, pictorially,

[Figure: the same derivation drawn as a sequence of trees.]

The tree language generated by $G$ is $\{\, p[a\,(q[cp[d)^n\, e\, (]b])^n\, a] \mid n \ge 0 \,\}$.

(3.20) Exercise. Write regular tree grammars generating the tree languages of Exercise 3.5.

As in the case of strings, each regular tree grammar is equivalent to one that has the property that at each step in the derivation exactly one terminal symbol is produced.

(3.21) Definition. A regular tree grammar $G = (N, \Sigma, R, S)$ is in normal form if each of its rules is either of the form $A \to a[B_1 \cdots B_k]$ or of the form $A \to b$, where $k \ge 1$, $a \in \Sigma_k$, $A, B_1, \dots, B_k \in N$ and $b \in \Sigma_0$.

(3.22) Theorem. Each regular tree grammar has an equivalent regular tree grammar in normal form.

Proof. Consider an arbitrary regular tree grammar $G = (N, \Sigma, R, S)$. Let $G_1 = (N, \Sigma, R_1, S)$ be the regular tree grammar such that $(A \to t) \in R_1$ if and only if $t \notin N$ and there is a $B$ in $N$ such that $A \Rightarrow_G^* B$ and $(B \to t) \in R$. Then $L(G_1) = L(G)$, and $R_1$ does not contain rules of the form $A \to B$ with $A, B \in N$. (This is the well known procedure of removing rules $A \to B$ from a context-free grammar.) Suppose that $G_1$ is not yet in normal form. Then there is a rule of the form $A \to a[t_1 \cdots t_i \cdots t_k]$ such that $t_i \notin N$.
Construct a new regular tree grammar $G_2$ by adding a new nonterminal $B$ to $N$ and replacing the rule $A \to a[t_1 \ldots t_i \ldots t_k]$ by the two rules $A \to a[t_1 \ldots B \ldots t_k]$ and $B \to t_i$ in $R_1$. It should be clear that $L(G_2) = L(G_1)$, and that, by repeating the latter process a finite number of times, one ends up with an equivalent grammar in normal form. ///

(3.23) Exercise. Put the regular tree grammar of Example 3.19 into normal form. ///

(3.24) Exercise. What does Theorem 3.22 actually say in the case of strings (the monadic case)? ///

In the next theorem we show that the regular tree grammars generate exactly the class of recognizable tree languages.

(3.25) Theorem. A tree language can be generated by a regular tree grammar iff it is an element of RECOG.
Proof. Exercise. ///

Note therefore that each recognizable tree language is a special kind of context-free language.

(3.26) Exercise. Show that all finite tree languages are in RECOG. ///

(3.27) Exercise. Show that each recognizable tree language can be generated by a "backwards deterministic" regular tree grammar. A regular tree grammar is called "backwards deterministic" if (1) it may have more than one initial nonterminal, (2) it is in normal form, and (3) rules with the same right hand side are equal. ///

It is now easy to show the connection between recognizable tree languages and context-free languages.

(3.28) Theorem.
yield(RECOG) = CFL (in words, the yield of each recognizable tree language is context-free, and each context-free language is the yield of some recognizable tree language).
Proof. Let $G = (N, \Sigma, R, S)$ be a regular tree grammar. Consider the context-free grammar $\bar{G} = (N, \Sigma_0, \bar{R}, S)$ where $\bar{R} = \{A \to \mathrm{yield}(t) \mid A \to t \text{ is in } R\}$. Then $L(\bar{G}) = \mathrm{yield}(L(G))$.
Now let $G = (N, \Sigma, R, S)$ be a context-free grammar. Let $*$ be a new symbol and let $\Delta = \Sigma \cup \{e, *\}$ be the ranked alphabet such that $\Delta_0 = \Sigma \cup \{e\}$, and, for $k \ge 1$, $\Delta_k = \{*\}$ if and only if there is a rule in $R$ with a right hand side of length $k$. Consider the regular tree grammar $\bar{G} = (N, \Delta, \bar{R}, S)$ such that
(i) if $A \to w$ is in $R$, $w \ne \lambda$, then $A \to *[w]$ is in $\bar{R}$;
(ii) if $A \to \lambda$ is in $R$, then $A \to e$ is in $\bar{R}$.
Then $\mathrm{yield}(L(\bar{G})) = L(G)$. ///

In the next section we shall give the connection between regular tree languages and derivation trees of context-free languages.

(3.29) Exercise. A context-free grammar is "invertible" if rules with the same right hand side are equal. Show that each context-free language can be generated by an invertible context-free grammar. ///

For regular string languages a useful stronger version of Theorem 3.28 can be proved.

(3.30) Theorem. Let $\Sigma$ be a ranked alphabet. If $R$ is a regular string language over $\Sigma_0$, then the tree language $\{t \in T_\Sigma \mid \mathrm{yield}(t) \in R\}$ is recognizable.
Proof. Let $M = (Q, \Sigma_0, \delta, q_0, F)$ be a deterministic finite automaton recognizing $R$. We construct a nondeterministic bottom-up fta $N = (Q \times Q, \Sigma, \mu, S, G)$, which, for each tree $t$, checks whether a successful computation of $M$ on $\mathrm{yield}(t)$ is possible. The states of $N$ are pairs of states of $M$. Intuitively we want that $(q_1, q_2) \in \hat{\mu}(t)$ if and only if $M$ arrives in state $q_2$ after processing $\mathrm{yield}(t)$, starting from state $q_1$. Thus we define
(i) for all $a \in \Sigma_0$, $S_a = \{(q_1, q_2) \mid \delta(q_1, a) = q_2\}$;
(ii) for all $k \ge 1$, $a \in \Sigma_k$ and states $q_1, q_2, \ldots, q_{2k} \in Q$, $\mu_a^k((q_1, q_2), (q_3, q_4), \ldots, (q_{2k-1}, q_{2k})) = \{(q_1, q_{2k})\}$ if $q_{2i} = q_{2i+1}$ for all $1 \le i \le k-1$, and $= \emptyset$ otherwise;
(iii) $G = \{(q_0, q) \mid q \in F\}$.
Then $L(N) = \{t \in T_\Sigma \mid \mathrm{yield}(t) \in R\}$. ///

(3.31) Exercise.
Show that, if $\Sigma_2 \ne \emptyset$, then Theorem 3.30 holds conversely: if $L$ is a string language such that $\{t \in T_\Sigma \mid \mathrm{yield}(t) \in L\}$ is recognizable, then $L$ is regular. What can you say in case $\Sigma_2 = \emptyset$? ///

(3.two) Closure properties of recognizable tree languages

We first consider set-theoretic operations.

(3.32) Theorem. RECOG is closed under union, intersection and complementation.
Proof. To show closure under complementation, consider a deterministic bottom-up fta $M = (Q, \Sigma, \delta, s, F)$. Let $N$ be the det. bottom-up fta $(Q, \Sigma, \delta, s, Q - F)$. Then, obviously, $L(N) = T_\Sigma - L(M)$. To show closure under union, consider two regular tree grammars $G_i = (N_i, \Sigma_i, R_i, S_i)$, $i = 1, 2$ (with $N_1 \cap N_2 = \emptyset$). Then $G = (N_1 \cup N_2 \cup \{S\}, \Sigma_1 \cup \Sigma_2, R_1 \cup R_2 \cup \{S \to S_1, S \to S_2\}, S)$ is a regular tree grammar such that $L(G) = L(G_1) \cup L(G_2)$. ///

As a corollary we obtain the following closure property of context-free languages.

(3.33) Corollary. CFL is closed under intersection with regular languages.
Proof. Let $L$ and $R$ be a context-free and regular language respectively. According to Theorem 3.28, there is a recognizable tree language $U$ such that $\mathrm{yield}(U) = L$. Consequently, by Theorems 3.30 and 3.32, the tree language $V = U \cap \{t \mid \mathrm{yield}(t) \in R\}$ is recognizable. Obviously $L \cap R = \mathrm{yield}(V)$ and so, again by Theorem 3.28, $L \cap R$ is context-free. ///

We now turn to the closure of RECOG under concatenation operations (see Definitions 2.23 and 2.27).

(3.34) Theorem. For every $k \ge 1$ and $a \in \Sigma_k$, RECOG is closed under $tc_a^k$.
Proof. Exercise. ///

(3.35) Theorem. RECOG is closed under tree concatenation.
Proof. The proof is obtained by generalizing that for regular string languages. Let $n \ge 1$, $a_1, \ldots, a_n \in \Sigma_0$ all different and $L_0, L_1, \ldots, L_n$ recognizable tree languages (we may assume that [...] $G = (\bigcup_i N_i, \Sigma, R, S_0)$, where $R = \bar{R}_0 \cup \bigcup_{i=1}^{n} R_i$ and $\bar{R}_0$ is $R_0$ with each rule of the form $A \to a_i$ replaced by the rule $A \to S_i$ ($1 \le i \le n$). ///

(3.36) Corollary. CFL is closed under substitution.
Proof. Use Theorem 3.28 and Exercise 2.30. ///

Note also that Theorem 3.35 is essentially a special case of Corollary 3.36.
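The automaton side of Theorem 3.32 can be sketched in a few lines. This is only an illustration under stated assumptions: trees are encoded as nested Python tuples `(symbol, child, ...)`, the automata are total, and the class name `DetBottomUpFTA` with its fields is a hypothetical encoding, not notation from these notes. Complementation swaps the final states exactly as in the proof above; the product construction for intersection is a standard alternative to deriving intersection from union and complementation.

```python
from itertools import product

class DetBottomUpFTA:
    """Total deterministic bottom-up fta over trees (symbol, child, ...)."""
    def __init__(self, states, leaf, delta, final):
        self.states = set(states)  # Q
        self.leaf = leaf           # rank-0 symbol -> state
        self.delta = delta         # (symbol, (q1, ..., qk)) -> state
        self.final = set(final)    # F

    def run(self, tree):
        sym, *kids = tree
        if not kids:
            return self.leaf[sym]
        return self.delta[(sym, tuple(self.run(k) for k in kids))]

    def accepts(self, tree):
        return self.run(tree) in self.final

def complement(m):
    # Same automaton, final states Q - F.
    return DetBottomUpFTA(m.states, m.leaf, m.delta, m.states - m.final)

def intersection(m1, m2):
    # Product automaton: run both automata in parallel on the same tree.
    leaf = {a: (m1.leaf[a], m2.leaf[a]) for a in m1.leaf}
    delta = {}
    for (sym1, qs1), r1 in m1.delta.items():
        for (sym2, qs2), r2 in m2.delta.items():
            if sym1 == sym2 and len(qs1) == len(qs2):
                delta[(sym1, tuple(zip(qs1, qs2)))] = (r1, r2)
    return DetBottomUpFTA(product(m1.states, m2.states), leaf, delta,
                          product(m1.final, m2.final))

# Example over rank-0 symbols a, b and a binary symbol f (illustrative only).
pairs = list(product((0, 1), repeat=2))
even_a = DetBottomUpFTA({0, 1}, {"a": 1, "b": 0},
                        {("f", (p, q)): (p + q) % 2 for p, q in pairs}, {0})
has_b = DetBottomUpFTA({0, 1}, {"a": 0, "b": 1},
                       {("f", (p, q)): p | q for p, q in pairs}, {1})

t = ("f", ("a",), ("f", ("a",), ("b",)))   # the tree f[a f[a b]]
print(even_a.accepts(t), complement(even_a).accepts(t),
      intersection(even_a, has_b).accepts(t))
```

Here `even_a` accepts trees whose yield contains an even number of a's and `has_b` accepts trees containing at least one b; both are made-up examples chosen only to exercise the constructions.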
Next we generalize the notion of (concatenation) closure of string languages to trees, and show that RECOG is closed under this closure operation. We shall, for convenience, restrict ourselves to the case that tree concatenation happens at one element of $\Sigma_0$.

(3.37) Definition. Let $a \in \Sigma_0$ and let $L$ be a tree language over $\Sigma$. Then the tree concatenation closure of $L$ at $a$, denoted by $L^{*a}$, is defined to be $\bigcup_{n \ge 0} X_n$, where $X_0 = \{a\}$ and, for $n \ge 0$, $X_{n+1} = X_n \cdot_a (L \cup \{a\})$. (Recall the notation $L_1 \cdot_a L_2$ from Definition 2.27.) ///

(3.38) Example. Let $G = (N, \Sigma, R, S)$ be the regular tree grammar with $N = \{S\}$, $\Sigma_0 = \{a\}$, $\Sigma_2 = \{b\}$ and $R = \{S \to b[aS], S \to a\}$. Then $L(G) = \{b[aS]\}^{*S} \cdot_S a$. ///

The "corresponding" operation on strings has several names in the literature. Let us call it "substitution closure".

(3.39) Definition. Let $\Delta$ be an alphabet and $a \in \Delta$. For a language $L$ over $\Delta$ the substitution closure of $L$ at $a$, denoted by $L^{*a}$, is defined to be $\bigcup_{n \ge 0} X_n$, where $X_0 = \{a\}$ and, for $n \ge 0$, $X_{n+1} = X_n \cdot_a (L \cup \{a\})$. ///

(3.40) Exercise. Let $a \in \Sigma_0$, and let $L \subseteq T_\Sigma$. Prove that $\mathrm{yield}(L^{*a}) = (\mathrm{yield}(L))^{*a}$. ///

(3.41) Theorem. RECOG is closed under tree concatenation closure.
Proof. Again the proof is a straightforward generalization of the string case. Let $G = (N, \Sigma, R, S)$ be a regular tree grammar in normal form, and let $a \in \Sigma_0$. Construct the regular tree grammar $\bar{G} = (N \cup \{S_0\}, \Sigma, \bar{R}, S_0)$, where $\bar{R} = R \cup \{A \to S \mid A \to a \text{ is in } R\} \cup \{S_0 \to S, S_0 \to a\}$. Then $L(\bar{G}) = (L(G))^{*a}$. ///

(3.42) Corollary. CFL is closed under substitution closure.
Proof. Use Theorem 3.28 and Exercise 3.40. ///

It is well known that the class of regular string languages is the smallest class containing the finite languages and closed under union, concatenation and closure. A similar result holds for recognizable tree languages.

(3.43) Theorem. RECOG is the smallest class of tree languages containing the finite tree languages and closed under union, tree concatenation and tree concatenation closure.
Proof. We have shown that RECOG satisfies the above conditions in Exercise 3.26 and Theorems 3.32, 3.35 and 3.41. It remains to show that every recognizable tree language can be built up from the finite tree languages using the operations $\cup$, $\cdot_a$ and $^{*a}$. Let $G = (N, \Sigma, R, S)$ be a regular tree grammar (it is easy to think of it as being in normal form). We shall use the elements of $N$ to do tree concatenation at. For $P, Q \subseteq N$ and $A \in N$, let us denote by $L_{A,P}^{Q}$ the set of all trees $t \in T_\Sigma(P)$ for which there is a derivation $A \Rightarrow t_1 \Rightarrow t_2 \Rightarrow \cdots \Rightarrow t_n \Rightarrow t_{n+1} = t$ ($n \ge 0$) such that, for $1 \le i \le n$, $t_i \in T_\Sigma(Q \cup P)$ and a rule with left hand side in $Q$ is applied to $t_i$ to obtain $t_{i+1}$. We shall show, by induction on the cardinality of $Q$, that all sets $L_{A,P}^{Q}$ can be built up from the finite tree languages by the operations $\cup$, $\cdot_B$ and $^{*B}$ (for all $B \in N$). For $Q = \emptyset$, $L_{A,P}^{\emptyset}$ is the set of all those right hand sides of rules with left hand side $A$ that are in $T_\Sigma(P)$. Thus $L_{A,P}^{\emptyset}$ is a finite tree language for all $A$ and $P$. Assuming now that, for $Q \subseteq N$, all sets $L_{A,P}^{Q}$ can be built up from the finite tree languages, the same holds for all sets $L_{A,P}^{Q \cup \{B\}}$, where $B \in N - Q$, since
$L_{A,P}^{Q \cup \{B\}} = L_{A,P \cup \{B\}}^{Q} \cdot_B \left(L_{B,P \cup \{B\}}^{Q}\right)^{*B} \cdot_B L_{B,P}^{Q}$
(a formal proof of this equation is left to the reader). Thus, since $L(G) = L_{S,\emptyset}^{N}$, the theorem is proved. ///

In other words, each recognizable tree language can be denoted by a "regular expression" with trees as constants and $\cup$, $\cdot_A$ and $^{*A}$ as operators.

(3.44) Exercise. Try to find a regular expression for the language generated by the regular tree grammar $G = (N, \Sigma, R, S)$ with $N = \{S, T\}$, $\Sigma_0 = \{a\}$, $\Sigma_2 = \{p\}$ and $R = \{S \to p[TS], S \to a, T \to p[TT], T \to a\}$. Use the algorithm in the proof of Theorem 3.43. ///

As a corollary we obtain the result that all context-free languages can be denoted by "context-free expressions".

(3.45) Corollary. CFL is the smallest class of languages containing the finite languages and closed under union, substitution and substitution closure.
Proof. Exercise. ///

(3.46) Exercise. Define the operation of "iterated concatenation at $a$" (for tree languages) and "iterated substitution at $a$" (for string languages) by $\mathrm{ite}_a(L) = L^{*a} \cdot_a \emptyset$.
Prove (using Theorem 3.43) that RECOG is the smallest class of tree languages containing the finite tree languages and closed under the operations of union, top concatenation and iterated concatenation. Show that this implies that CFL is the smallest class of languages containing the finite languages and closed under the operations of union, concatenation and iterated substitution (cf. [Sal, VI.11]). ///

Let us now turn to another operation on trees: that of relabeling the nodes of a tree.

(3.47) Definition. Let $\Sigma$ and $\Delta$ be ranked alphabets. A relabeling $r$ is a family $\{r_k\}_{k \ge 0}$ of mappings $r_k: \Sigma_k \to \mathcal{P}(\Delta_k)$. A relabeling determines a mapping $r: T_\Sigma \to \mathcal{P}(T_\Delta)$ by the requirements
(i) for $a \in \Sigma_0$, $r(a) = r_0(a)$;
(ii) for $k \ge 1$, $a \in \Sigma_k$ and $t_1, \ldots, t_k \in T_\Sigma$, $r(a[t_1 \ldots t_k]) = \{b[s_1 \ldots s_k] \mid b \in r_k(a) \text{ and } s_i \in r(t_i)\}$.
If, for each $k \ge 0$ and each $a \in \Sigma_k$, $r_k(a)$ consists of one element only, then $r$ is called a projection. ///

Obviously, RECOG is closed under relabelings.

(3.48) Theorem. RECOG is closed under relabelings.
Proof. Let $r$ be a relabeling, and consider some regular tree grammar $G$. By replacing each rule $A \to t$ of $G$ by all rules $A \to s$, $s \in r(t)$, one obtains a regular tree grammar for $r(L(G))$. (In order that "$r(t)$" makes sense, we define $r(B) = \{B\}$ for each nonterminal $B$ of $G$.) ///

We are now in a position to study the connection between recognizable tree languages and sets of derivation trees of context-free grammars. We shall consider two kinds of derivation trees. First we define the "ordinary" kind of derivation tree (cf. Example 1.1).

(3.49) Definition. Let $G = (N, \Sigma, R, S)$ be a context-free grammar. Let $\Delta$ be the ranked alphabet such that $\Delta_0 = \Sigma \cup \{e\}$ and, for $k \ge 1$, $\Delta_k$ is the set of nonterminals $A \in N$ for which there is a rule $A \to w$ with $|w| = k$ (in case $k = 1$: $|w| = 1$ or $|w| = 0$).
For each $\alpha \in N \cup \Sigma$, the set of derivation trees with top $\alpha$, denoted by $D_G^\alpha$, is the tree language over $\Delta$ defined recursively as follows:
(i) for each $a$ in $\Sigma$, $a \in D_G^a$;
(ii) for each rule $A \to \alpha_1 \ldots \alpha_n$ in $R$ ($n \ge 1$, $A \in N$, $\alpha_i \in \Sigma \cup N$), if $t_i \in D_G^{\alpha_i}$ for $1 \le i \le n$, then $A[t_1 \ldots t_n] \in D_G^A$;
(iii) for each rule $A \to \lambda$ in $R$, $A[e] \in D_G^A$. ///

(3.50) Definition. A tree language $L$ is said to be local if, for some context-free grammar $G = (N, \Sigma, R, S)$ and some set of symbols $V \subseteq N \cup \Sigma$, $L = \bigcup_{\alpha \in V} D_G^\alpha$. ///

(3.51) Exercise. Show that each local tree language is recognizable. ///

Note that a local tree language is the set of all derivation trees of a context-free grammar which has a set of initial symbols (instead of one initial nonterminal). The reason for the name "local" is that such a tree language $L$ is determined by
(1) a finite set of trees of height one,
(2) a finite set of "initial symbols",
(3) a finite set of "final symbols",
and the requirement that $L$ consists of all trees $t$ such that each node of $t$ together with its direct descendants belongs to (1), the top label of $t$ belongs to (2), and the leaf labels of $t$ to (3).

We now show that the class of local tree languages is properly included in RECOG.

(3.52) Theorem. There are recognizable tree languages which are not local.
Proof. Let $\Sigma_0 = \{a, b\}$ and $\Sigma_2 = \{S\}$. Consider the tree language $L = \{S[S[ba]S[ab]]\}$. Obviously $L$ is recognizable. Suppose that $L$ is local. Then there is a context-free grammar $G$ such that $D_G^S = L$. Thus $S \to SS$, $S \to ba$ and $S \to ab$ are rules of $G$. But then $S[S[ab]S[ba]] \in L$. Contradiction. (Other examples are, for instance, $\{S[T[a]T[b]]\}$ and $\{S[S[e]]\}$.) ///

Note that the recognizable tree language $L$ in the above proof can be recognized by a deterministic top-down fta. Note also that the tree language given in the proof of Theorem 3.14 is local. Hence the local tree languages and the tree languages recognized by det. top-down fta are incomparable.

(3.53) Exercise. Find a recognizable tree language which is neither local nor recognizable by a det. top-down fta. ///
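The counterexample of Theorem 3.52 can be checked mechanically. The sketch below assumes trees are nested Python tuples `(symbol, child, ...)`; the helper names (`blocks`, `leaves`, `local_description`, `in_local`) are hypothetical. It collects the three finite sets (1)-(3) that any local language containing a given sample must allow, and then tests whether another tree is forced into that language.

```python
def blocks(tree):
    """Yield (label, child labels) for every internal node: the height-one trees."""
    sym, *kids = tree
    if kids:
        yield (sym, tuple(k[0] for k in kids))
        for k in kids:
            yield from blocks(k)

def leaves(tree):
    """Yield the leaf labels of the tree from left to right."""
    sym, *kids = tree
    if not kids:
        yield sym
    for k in kids:
        yield from leaves(k)

def local_description(trees):
    """Height-one trees, top labels and leaf labels occurring in the sample."""
    rules = {b for t in trees for b in blocks(t)}
    tops = {t[0] for t in trees}
    finals = {a for t in trees for a in leaves(t)}
    return rules, tops, finals

def in_local(tree, description):
    """Membership in the local language determined by the three finite sets."""
    rules, tops, finals = description
    return (tree[0] in tops
            and all(b in rules for b in blocks(tree))
            and all(a in finals for a in leaves(tree)))

a, b = ("a",), ("b",)
t1 = ("S", ("S", b, a), ("S", a, b))   # S[S[ba]S[ab]], the only tree in L
t2 = ("S", ("S", a, b), ("S", b, a))   # S[S[ab]S[ba]], not in L

desc = local_description([t1])
print(in_local(t1, desc), in_local(t2, desc))
```

Both membership tests succeed, so every local language containing `t1` also contains `t2`, which is the contradiction used in the proof.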
It is clear that, if $\Sigma_0 = \{a, b\}$ and $\Sigma_2 = \{S_1, S_2, S_3\}$, then $L' = \{S_1[S_2[ba]S_3[ab]]\}$ is a local language. Hence the language $L$ in Theorem 3.52 is the projection of the local language $L'$ (project $S_1$, $S_2$ and $S_3$ on $S$). We will show that this is true in general: each recognizable tree language is the projection of a local tree language. In fact we shall show a slightly stronger fact. To do this we define the second type of derivation tree of a context-free grammar, called "rule tree".

(3.54) Definition. Let $G = (N, \Sigma, R, S)$ be a context-free grammar. Let $\bar{R}$ be any set of symbols in one-to-one correspondence with $R$, $\bar{R} = \{\bar{r} \mid r \in R\}$. Each element of $\bar{R}$ is given a rank such that, if $r$ in $R$ is of the form $A \to w_0 A_1 w_1 A_2 w_2 \ldots A_k w_k$ (for some $k \ge 0$, $A_1, \ldots, A_k \in N$ and $w_0, w_1, \ldots, w_k \in \Sigma^*$), then $\bar{r} \in \bar{R}_k$. The set of rule trees of $G$, denoted by $RT(G)$, is defined to be the tree language generated by the regular tree grammar $\bar{G} = (N, \bar{R}, P, S)$, where $P$ is defined by
(i) if $r = (A \to w_0 A_1 w_1 \ldots A_k w_k)$, $k \ge 1$, is in $R$, then $A \to \bar{r}[A_1 \ldots A_k]$ is in $P$;
(ii) if $r = (A \to w_0)$ is in $R$, then $A \to \bar{r}$ is in $P$. ///

(3.55) Definition. We shall say that a tree language $L$ is a rule tree language if $L = RT(G)$ for some context-free grammar $G$. ///

Thus a rule tree is a derivation tree in which the nodes are labeled by the rules applied during the derivation. It should be obvious that for each context-free grammar $G = (N, \Sigma, R, S)$ there is a one-to-one correspondence between the tree languages $RT(G)$ and $D_G^S$.

(3.56) Example. Consider Example 1.1. For each rule $r$ in that example, let $(r)$ stand for a new symbol. The rule tree "corresponding" to the derivation tree displayed in Example 1.1 is
$(S \to AD)[\,(A \to AA)[\,(A \to bAa)[(A \to \lambda)]\,(A \to aAb)[(A \to \lambda)]\,]\,(D \to Ddd)[(D \to d)]\,]$.
Note that this tree is obtained from the other one by viewing the building blocks (trees of height one) of the local tree as the nodes of the rule tree. ///

The following theorem shows the relationship of the rule tree languages to those defined before.

(3.57) Theorem.
The class of rule tree languages is properly included in the intersection of the class of local tree languages and the class of tree languages recognizable by a det. top-down fta.
Proof. We first show inclusion in the class of local tree languages. Let $G = (N, \Sigma, R, S)$ be a context-free grammar and $\bar{R} = \{\bar{r} \mid r \in R\}$. Consider the context-free grammar $G_1 = (\bar{R} - \bar{R}_0, \bar{R}_0, P, -)$ where $P$ is defined as follows: if $r = (A \to w_0 A_1 w_1 \ldots A_k w_k)$, $k \ge 1$, is in $R$, then $\bar{r} \to \bar{r}_1 \ldots \bar{r}_k$ is in $P$ for all rules $r_1, \ldots, r_k \in R$ such that the left hand side of $r_i$ is $A_i$ ($1 \le i \le k$). Let $V = \{\bar{r} \mid r \in R \text{ has left hand side } S\}$. Then $RT(G) = \bigcup_{\alpha \in V} D_{G_1}^{\alpha}$, and hence $RT(G)$ is local.
To show that $RT(G)$ can be recognized by a det. top-down fta, consider $M = (Q, \bar{R}, \delta, q_0, F)$, where $Q = N \cup \{W\}$, $q_0 = S$, for $\bar{r} \in \bar{R}_0$, $F_{\bar{r}}$ consists of the left hand side of $r$ only, and for $\bar{r} \in \bar{R}_k$, $r$ of the form $A \to w_0 A_1 w_1 \ldots A_k w_k$, $\delta_{\bar{r}}(A) = (A_1, \ldots, A_k)$ and $\delta_{\bar{r}}(B) = (W, \ldots, W)$ for all other $B \in Q$. Then $L(M) = RT(G)$.
To show proper inclusion, let $H$ be the context-free grammar with rules $S \to SS$, $S \to aS$, $S \to Sb$ and $S \to ab$. Then $D_H^S$ is a local tree language. It is easy to see that $D_H^S$ can be recognized by a det. top-down fta. Now suppose that $D_H^S = RT(G)$ for some context-free grammar $G$. Since $S$ has rank 2 and since the configuration $S[SS]$ occurs in $D_H^S$, $S$ is the name of a rule of $G$ of the form $A \to w_0 A w_1 A w_2$. Now, since $a$ and $b$ are of rank 0 and since $S[ab]$ is in $D_H^S$, $a$ and $b$ are names of rules $A \to w_3$ and $A \to w_4$. Hence $S[ba]$ is a rule tree of $G$. Contradiction. ///

We now characterize the recognizable tree languages in terms of rule tree languages.

(3.58) Theorem. Every recognizable tree language is the projection of a rule tree language.
Proof. Let $G = (N, \Sigma, R, S)$ be a regular tree grammar in normal form. We shall define a regular tree grammar $\bar{G}$ and a projection $p$ such that $L(G) = p(L(\bar{G}))$ and $L(\bar{G})$ is a rule tree language. $\bar{G}$ will simulate $G$, but $\bar{G}$ will put all information about the rules applied during the derivation of a tree $t$ into the tree itself. This is a useful technique.
Let $\bar{R}$ be a set of symbols in one-to-one correspondence with $R$, and let $\bar{G} = (N, \bar{R}, P, S)$. The ranking of $\bar{R}$, the set $P$ of rules and the projection $p$ are defined simultaneously as follows:
(i) if $r \in R$ is the rule $A \to a[B_1 \ldots B_k]$, then $\bar{r}$ has rank $k$, $A \to \bar{r}[B_1 \ldots B_k]$ is in $P$ and $p_k(\bar{r}) = a$;
(ii) if $r \in R$ is the rule $A \to a$, then $\bar{r}$ has rank 0, $A \to \bar{r}$ is in $P$ and $p_0(\bar{r}) = a$.
It is obvious that $p(L(\bar{G})) = L(G)$. Now note that $G$ may be viewed as a context-free grammar (over $\Sigma \cup \{[, ]\}$). In fact, $\bar{G}$ is the same as the one constructed in Definition 3.54! Thus $L(\bar{G})$ is a rule tree language. ///

Since RECOG is closed under projections (Theorem 3.48), we now easily obtain the following corollary.

(3.59) Corollary. For each tree language $L$ the following four statements are equivalent:
(i) $L$ is recognizable;
(ii) $L$ is the projection of a rule tree language;
(iii) $L$ is the projection of a local tree language;
(iv) $L$ is the projection of a tree language recognizable by a deterministic top-down fta. ///

(3.60) Exercise. Show that, in the case of local tree languages, the projection involved in the above corollary (iii) can be taken as the identity on symbols of rank 0 (thus the yields are preserved). ///

As a final operation on trees we consider the notion of tree homomorphism. For strings, a homomorphism $h$ associates a string $h(a)$ with each symbol $a$ of the alphabet, and transforms a string $a_1 a_2 \ldots a_n$ into the string $h(a_1) h(a_2) \ldots h(a_n)$. Generalizing this to trees, a tree homomorphism $h$ associates a tree $h(a)$ with each symbol $a$ of the ranked alphabet (actually, one tree for each rank). The application of $h$ to a tree $t$ consists in replacing each symbol $a$ of $t$ by the tree $h(a)$, and tree concatenating all the resulting trees. Note that, if $a$ is of rank $k$, then $h(a)$ should be tree concatenated with $k$ other trees; therefore, since tree concatenation happens at symbols of rank 0, the tree $h(a)$ should contain at least $k$ different symbols of rank 0.
Since, in general, the number of symbols of rank 0 in some ranked alphabet may be less than the rank of some other symbol, we allow for the use of an arbitrary number of auxiliary symbols of rank 0, called "variables" (recall the use of nonterminals as auxiliary symbols of rank 0 in Theorem 3.43).

(3.61) Definition. Let $x_1, x_2, x_3, \ldots$ be an infinite sequence of different symbols, called variables. Let $X = \{x_1, x_2, x_3, \ldots\}$, for $k \ge 1$, $X_k = \{x_1, x_2, \ldots, x_k\}$, and $X_0 = \emptyset$. Elements of $X$ will also be denoted by $x$, $y$ and $z$. ///

(3.62) Definition. Let $\Sigma$ and $\Delta$ be ranked alphabets. A tree homomorphism $h$ is a family $\{h_k\}_{k \ge 0}$ of mappings $h_k: \Sigma_k \to T_\Delta(X_k)$. A tree homomorphism determines a mapping $h: T_\Sigma \to T_\Delta$ as follows:
(i) for $a \in \Sigma_0$, $h(a) = h_0(a)$;
(ii) for $k \ge 1$, $a \in \Sigma_k$ and $t_1, \ldots, t_k \in T_\Sigma$, $h(a[t_1 \ldots t_k]) = h_k(a)\langle x_1 \leftarrow h(t_1), \ldots, x_k \leftarrow h(t_k)\rangle$.
In the particular case that, for each $a \in \Sigma_k$, $h_k(a)$ does not contain two occurrences of the same $x_i$ ($i = 1, 2, 3, \ldots$), $h$ is called a linear tree homomorphism. ///

A general tree homomorphism $h$ has the abilities of deleting ($h_k(a)$ does not contain $x_i$), copying and permuting [...]

[...] if $A \Rightarrow t$, then $t \in \Sigma_0$ and $A \to t$ is in $R$. Hence $A \to h_0(t)$ is in $\bar{R}$, and so $A \Rightarrow_{\bar{G}} h(t)$. Now suppose that the first step in $A \Rightarrow^* t$ results from the application of a rule of the form $A \to a[B_1 \ldots B_k]$. Then $A \Rightarrow^* t$ is of the form $A \Rightarrow a[B_1 \ldots B_k] \Rightarrow^* t$. It follows that $t$ is of the form $a[t_1 \ldots t_k]$ such that $B_i \Rightarrow^* t_i$ for all $1 \le i \le k$. Hence, by induction, $B_i \Rightarrow_{\bar{G}}^* h(t_i)$. Now, since the rule $A \to h_k(a)\langle x_1 \leftarrow B_1, \ldots, x_k \leftarrow B_k\rangle$ is in $\bar{R}$ by definition, we have (prove this!)
$A \Rightarrow_{\bar{G}} h_k(a)\langle x_1 \leftarrow B_1, \ldots, x_k \leftarrow B_k\rangle \Rightarrow_{\bar{G}}^* h_k(a)\langle x_1 \leftarrow h(t_1), \ldots, x_k \leftarrow h(t_k)\rangle = h(a[t_1 \ldots t_k]) = h(t)$.
(2) The proof is by induction on the number of steps in $A \Rightarrow_{\bar{G}}^* s$. For zero steps the statement is trivially true. Suppose that the first step in $A \Rightarrow_{\bar{G}}^* s$ results from the application of a rule $A \to h_0(a)$ for some $a$ in $\Sigma_0$. Then $h(a) = s$ and $A \Rightarrow_G a$. Now suppose that the first step results from the application of a rule $A \to h_k(a)\langle x_1 \leftarrow B_1, \ldots, x_k \leftarrow B_k\rangle$, where $A \to a[B_1 \ldots B_k]$ is a rule of $G$.
Then the derivation is $A \Rightarrow h_k(a)\langle x_1 \leftarrow B_1, \ldots, x_k \leftarrow B_k\rangle \Rightarrow^* s$. At this point we need both linearity of $h$ (to be sure that each $B_i$ in $h_k(a)\langle x_1 \leftarrow B_1, \ldots, x_k \leftarrow B_k\rangle$ produces at most one subtree of $s$) and the condition on $G$ (to deal with deletion: since $h_k(a)\langle x_1 \leftarrow B_1, \ldots, x_k \leftarrow B_k\rangle$ need not contain an occurrence of $B_i$, we need an arbitrary tree generated, in $G$, by $B_i$ to be able to construct the tree $t$ such that $h(t) = s$). There exist trees $s_1, \ldots, s_k$ in $T_\Delta$ such that $s = h_k(a)\langle x_1 \leftarrow s_1, \ldots, x_k \leftarrow s_k\rangle$ and
(i) if $x_i$ occurs in $h_k(a)$, then $B_i \Rightarrow_{\bar{G}}^* s_i$;
(ii) if $x_i$ does not occur in $h_k(a)$, then $s_i = h(t_i)$ for some arbitrary $t_i$ such that $B_i \Rightarrow_G^* t_i$.
Hence, by induction and (ii), there are trees $t_1, \ldots, t_k$ such that $h(t_i) = s_i$ and $B_i \Rightarrow_G^* t_i$ for all $1 \le i \le k$. Consequently, if $t = a[t_1 \ldots t_k]$, then $A \Rightarrow_G a[B_1 \ldots B_k] \Rightarrow_G^* a[t_1 \ldots t_k] = t$ and $h(t) = h_k(a)\langle x_1 \leftarrow h(t_1), \ldots, x_k \leftarrow h(t_k)\rangle = s$. This proves the theorem. ///

(3.66) Exercise. In the string case one can also prove that the regular languages are closed under homomorphisms by using the Kleene characterization theorem. Give an alternative proof of Theorem 3.65 by using Theorem 3.43 (use the fact, which is implicit in the proof of that theorem, that each regular tree language over the ranked alphabet $\Sigma$ can be built up from finite tree languages using the operations $\cup$, $\cdot_A$ and $^{*A}$). ///

As an indication how one could use theorems like Theorem 3.65, we prove the following theorem, which is (slightly!) stronger than Theorem 3.28.

(3.67) Theorem. Each context-free language over $\Delta$ is the yield of a recognizable tree language over $\Sigma$, where $\Sigma_0 = \Delta \cup \{e\}$ and $\Sigma_2 = \{*\}$.
Proof. Let $L$ be a context-free language over $\Delta$. By Theorem 3.28, there is a recognizable tree language $U$ over some ranked alphabet $\Omega$ with $\Omega_0 = \Delta \cup \{e\}$, such that $\mathrm{yield}(U) = L$. Let $h$ be the linear tree homomorphism from $T_\Omega$ into $T_\Sigma$ such that $h_0(a) = a$ for all $a$ in $\Delta \cup \{e\}$, $h_1(a) = x_1$ for all $a$ in $\Omega_1$, and $h_k(a) = *[x_1 *[x_2 *[\ldots *[x_{k-1} x_k] \ldots ]]]$ for all $a \in \Omega_k$, $k \ge 2$. By Theorem 3.65, $h(U)$ is a recognizable tree language over $\Sigma$. It is easy to show that, for each $t$ in $T_\Omega$, $\mathrm{yield}(h(t)) = \mathrm{yield}(t)$. Hence
$\mathrm{yield}(h(U)) = \mathrm{yield}(U) = L$. ///

Note that Theorem 3.67 is "equivalent" to the fact that each context-free language can be generated by a context-free grammar in Chomsky normal form.

(3.68) Exercise. Try to show that RECOG is closed under inverse (not necessarily linear) homomorphisms; that is, if $L \in$ RECOG and $h$ is a tree homomorphism, then $h^{-1}(L) = \{t \mid h(t) \in L\}$ is recognizable. (Represent $L$ by a deterministic bottom-up fta.) ///

We have now discussed all AFL operations (see [Sal, IV]) generalized to trees: union, tree concatenation, tree concatenation closure, tree homomorphism, inverse tree homomorphism and intersection with a recognizable tree language. According to previous results, RECOG is closed under these operations (for tree homomorphisms this was shown in the linear case, Theorem 3.65). Thus, RECOG is a "tree AFL".

(3.69) Exercise. Generalize the operation of string substitution (see Definition 2.29) to trees, and show that RECOG is closed under "linear tree substitution". ///

(3.70) Exercise. Suppose you don't know about context-free grammars. Consider the notion of regular tree grammar. Give a recursive definition of the relation $\Rightarrow^*$ for such a grammar. Show that, if $a[t_1 \ldots t_k] \Rightarrow^* s$, then there are $s_1, \ldots, s_k$ such that $s = a[s_1 \ldots s_k]$ and $t_i \Rightarrow^* s_i$ for all $1 \le i \le k$. Which of the two definitions of regular tree grammar do you prefer? ///

(3.three) Decidability

Obviously the membership problem for recognizable tree languages is solvable: given a tree $t$ and an fta $M$, just feed $t$ into $M$ and see whether $t$ is recognized or not. We now want to prove that the emptiness and finiteness problems for recognizable tree languages are solvable.
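The membership test mentioned above ("just feed $t$ into $M$") can be sketched for a nondeterministic bottom-up fta. The encoding is an assumption made for illustration: trees are nested Python tuples `(symbol, child, ...)`, `leaf` maps rank-0 symbols to sets of states, and `delta` maps `(symbol, tuple of states)` to sets of states. The function names are hypothetical. The example automaton mirrors the grammar of Example 3.38 ($S \to b[aS]$, $S \to a$), with one state per nonterminal plus a state for the terminal $a$.

```python
from itertools import product

def reachable_states(tree, leaf, delta):
    """Set of states in which the automaton can arrive at the top of the tree."""
    sym, *kids = tree
    if not kids:
        return set(leaf.get(sym, ()))
    kid_states = [reachable_states(k, leaf, delta) for k in kids]
    result = set()
    # Try every combination of states reachable at the direct subtrees.
    for combo in product(*kid_states):
        result |= set(delta.get((sym, combo), ()))
    return result

def accepts(tree, leaf, delta, final):
    """The tree is recognized iff some reachable state is final."""
    return bool(reachable_states(tree, leaf, delta) & set(final))

# States: "qa" (the leaf is the terminal a) and "S" (the subtree is derivable from S).
leaf = {"a": {"qa", "S"}}
delta = {("b", ("qa", "S")): {"S"}}
final = {"S"}

a = ("a",)
good = ("b", a, ("b", a, a))    # b[a b[a a]]
bad = ("b", ("b", a, a), a)     # b[b[a a] a]
print(accepts(good, leaf, delta, final), accepts(bad, leaf, delta, final))
```

The run visits every node once, but the number of state combinations tried per node can grow with the rank; determinizing the automaton first would avoid that at the cost of a larger state set.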
To do this we generalize the pumping lemma for regular string languages to recognizable tree languages: for each regular string language $L$ there is an integer $p$ such that for all strings $z$ in $L$, if $|z| \ge p$, then there are strings $u$, $v$ and $w$ such that $z = uvw$, $|vw| \le p$, [...]

[...] $p$. Considering some path of maximal length through $t$, it is clear that there are trees $t_1, t_2, \ldots, t_n \in T_\Sigma(\{x\})$ such that $n \ge p+1$, $t = t_n \cdot_x t_{n-1} \cdot_x \cdots \cdot_x t_2 \cdot_x t_1$, the trees $t_2, \ldots, t_n$ contain exactly one occurrence of $x$ and have height $\ge 1$, and $t_1 \in T_\Sigma$ (this is a "linearization" of $t$ according to some path). Now consider the $n$ states $q_i = \hat{\delta}(t_i \cdot_x \cdots \cdot_x t_1)$ for $1 \le i \le n$. Then, since $n \ge p+1$, there are two equal states: there are $i, j$ such that $q_i = q_j$ and $1 \le i < j \le p+1$. Let $u = t_n \cdot_x \cdots \cdot_x t_{j+1}$, $v = t_j \cdot_x \cdots \cdot_x t_{i+1}$ and $w = t_i \cdot_x \cdots \cdot_x t_1$. Then requirements (i)-(iv) in the statement of the theorem are obviously satisfied. Furthermore, in general, if $\hat{\delta}(s_1) = \hat{\delta}(s_2)$, then $\hat{\delta}(s \cdot_x s_1) = \hat{\delta}(s \cdot_x s_2)$. Hence, since $\hat{\delta}(v \cdot_x w) = \hat{\delta}(w)$, requirement (v) is also satisfied. ///

As a corollary to Theorem 3.71 we obtain the pumping lemma for context-free languages.

(3.72) Corollary. For each context-free language $L$ over $\Delta$ we can find an integer $q$ such that for all words $z \in L$, if $|z| \ge q$, then there are strings $u_1, v_1, w_0, v_2$ and $u_2$ in $\Delta^*$ such that $z = u_1 v_1 w_0 v_2 u_2$, $|v_1 w_0 v_2| \le q$, $|v_1 v_2| > 0$, and, for all $n \ge 0$, $u_1 v_1^n w_0 v_2^n u_2 \in L$.
Proof. By Theorem 3.67, and the fact that each context-free language can be generated by a $\lambda$-free context-free grammar, there is a recognizable tree language $U$ over $\Sigma$ such that $\mathrm{yield}(U) = L$, where $\Sigma_0 = \Delta$ and $\Sigma_2 = \{*\}$. Let $p$ be the integer corresponding to $U$ according to Theorem 3.71, and put $q = 2^p$. Obviously, if $z \in L$ and $|z| \ge q$, then there is a $t$ in $U$ such that $\mathrm{yield}(t) = z$ and $\mathrm{height}(t) \ge p$. Then, by Theorem 3.71 there are trees $u$, $v$ and $w$ such that (i)-(v) in that theorem hold. Thus $t = u \cdot_x v \cdot_x w$. Let $\mathrm{yield}(u) = u_1 x u_2$, $\mathrm{yield}(v) = v_1 x v_2$ and $\mathrm{yield}(w) = w_0$ (see (i)). Then
$z = \mathrm{yield}(t) = \mathrm{yield}(u \cdot_x v \cdot_x w) = \mathrm{yield}(u) \cdot_x \mathrm{yield}(v) \cdot_x \mathrm{yield}(w) = u_1 v_1 w_0 v_2 u_2$.
It is easy to see that all other requirements stated in the corollary are also satisfied. ///

(3.73) Exercise. Let $\Sigma$ be a ranked alphabet such that $\Sigma_0 = \{a, b\}$. Show that the tree language $\{t \in T_\Sigma \mid \mathrm{yield}(t) \text{ has an equal number of } a\text{'s and } b\text{'s}\}$ is not recognizable. ///

From the pumping lemma the decidability of both the emptiness and the finiteness problem for RECOG follows.

(3.74) Theorem. The emptiness problem for recognizable tree languages is decidable.
Proof. Let $L$ be a recognizable tree language, and let $p$ be the integer of Theorem 3.71. Obviously (using $n = 0$ in point (v)), $L$ is nonempty if and only if $L$ contains a tree of height less than $p$. [...]

[...] $\{t \mid \mathrm{height}(t) \ge p\}$ is recognizable (Exercise 3.26 and Theorem 3.32), this is decidable by the previous theorem. ///

Note that the decidability of the emptiness and finiteness problems for context-free languages follows from these two theorems together with the "yield theorem" (with $e \notin \Sigma_0$, $\Sigma_0 = \Delta$).

As in the string case we now obtain the decidability of inclusion of recognizable tree languages (and hence of equality).

(3.76) Theorem. It is decidable, for arbitrary recognizable tree languages $U$ and $V$, whether $U \subseteq V$ (and also, whether $U = V$).
Proof. Since $U$ is included in $V$ iff the intersection of $U$ with the complement of $V$ is empty, the theorem follows from Theorems 3.32 and 3.74. ///

Note again that each regular tree language is a special kind of context-free language. Note also that inclusion of context-free languages is not decidable (neither is equality). Therefore it is nice that we have found a subclass of CFL for which inclusion and equality are decidable. Note also that CFL is not closed under intersection, but RECOG is.

We shall now relate these facts to some results in the literature concerning "parenthesis languages" and "structural equivalence" of context-free grammars (see [Sal, VIII.3]).

(3.77) Definition. A parenthesis grammar is a context-free grammar $G = (N, \Sigma \cup \{[, ]\}, R, S)$ such that each rule in $R$ is of the form $A \to [w]$ with $A \in N$ and $w \in (\Sigma \cup N)^*$.
The language generated by $G$ is called a parenthesis language. ///

To relate parenthesis languages to recognizable tree languages, let us restrict attention to ranked alphabets $\Delta$ such that, for $k \ge 1$, if $\Delta_k \ne \emptyset$, then $\Delta_k = \{*\}$, where $*$ is a fixed symbol. Suppose that in our recursive definition of "tree" we change $a[t_1 \ldots t_k]$ into $[a\, t_1 \ldots t_k]$ (see Definition 2.5 and Remark 2.35). Then, obviously, all our results about RECOG are still valid. Furthermore, since $*$ is the only symbol of rank $\ge 1$, we may as well replace "$[*$" by "$[$". In this way, each parenthesis language is in RECOG (in fact, each parenthesis grammar is a regular tree grammar). It is also easy to see that, if $L$ is a recognizable tree language (over a restricted ranked alphabet $\Delta$), then $L - \Delta_0$ is a parenthesis language. From these considerations we obtain the following theorem.

(3.78) Theorem. The class of parenthesis languages is closed under union, intersection and subtraction. The inclusion problem for parenthesis languages is decidable.
Proof. The first statement follows directly from Theorem 3.32 and the last remark above. The second statement follows directly from Theorem 3.76. ///

A paraphrase of this theorem is obtained as follows.

(3.79) Definition. For any ranked alphabet $\Sigma$, let $p$ be the projection such that $p(a) = a$ for all symbols of rank 0, and $p(a) = *$ for all symbols of rank $\ge 1$. Let $G$ be a context-free grammar. The bare tree language of $G$, denoted by $BT(G)$, is $p(D_G^S)$, where $S$ is the initial nonterminal of $G$. We say that two context-free grammars $G_1$ and $G_2$ are structurally equivalent iff they generate the same bare tree language (i.e. $BT(G_1) = BT(G_2)$). ///

Thus, $G_1$ and $G_2$ are structurally equivalent if their sets of derivation trees are the same after "erasing" all nonterminals.

(3.80) Theorem. It is decidable for arbitrary context-free grammars whether they are structurally equivalent.
Proof.
For any context-free grammar $G = (N, \Sigma, R, S)$ let $[G]$ be the parenthesis grammar $(N, \Sigma \cup \{[, ]\}, \bar{R}, S)$, where $\bar{R} = \{A \to [w] \mid A \to w \text{ is in } R\}$. Obviously, $L([G]) = BT(G)$. Hence, by Theorem 3.78, the theorem holds. ///

(3.81) Exercise. Show that, for any two context-free grammars $G_1$ and $G_2$, there exists a context-free grammar $G_3$ such that $BT(G_3) = BT(G_1) \cap BT(G_2)$. ///

(3.82) Exercise. Show that each context-free grammar has a structurally equivalent context-free grammar that is invertible (cf. Exercise 3.29). ///

(3.83) Exercise. Consider the "bracketed context-free languages" of Ginsburg and Harrison, and show that some of their results follow easily from results about RECOG (show first that each recognizable tree language is a deterministic context-free language). ///

(3.84) Exercise. Investigate whether it is decidable for an arbitrary recognizable tree language $R$
(i) whether $R$ is local;
(ii) whether $R$ is a rule tree language;
(iii) whether $R$ is recognizable by a det. top-down fta. ///

The results of the next chapter are published in:
J. Engelfriet; Bottom-up and top-down tree transformations - a comparison; Math. Syst. Theory 9 (1975), 198-231.
J. Engelfriet; Top-down tree transducers with regular look-ahead; Math. Syst. Theory 10 (1977), 289-303.

(4.one) Introduction: Tree transducers and semantics

In this part we will be concerned with the notion of a tree transducer: a machine that takes a tree as input and produces another tree as output. In all generality we may view a tree transducer as a device that gives meaning to structured objects (i.e. a semantics defining device). Let us try to indicate this aspect of tree transducers.

Consider a ranked alphabet $\Sigma$. The elements of $\Sigma$ may be viewed as "operators", i.e.
symbols denoting operations (functions of several arguments). The rank of an operator stands for the number of arguments of the operation (note therefore that one operator may denote several operations). The operators of rank 0 have no arguments: they are (denote) constants. As an example, consider the ranked alphabet $\Sigma$ with $\Sigma_0 = \{e, a, b\}$ and $\Sigma_2 = \{f\}$, consisting of three constants $e$, $a$ and $b$ and one binary operator $f$. From operators we may form "terms" or "expressions", like for instance $f(a, f(e, b))$, or perhaps, denoting $f$ by $*$, $(a * (e * b))$. Obviously the terms are in one-to-one correspondence with the set $T_\Sigma$ of trees over $\Sigma$. Thus the notions tree and term may be identified. Intuitively, terms denote structured objects, obtained by applying the operations to the constants.

Formally, meaning is given to operators and terms by way of an "interpretation". An interpretation of $\Sigma$ consists of a "domain" $B$, for each element $a \in \Sigma_0$ an element $h_0(a)$ of $B$, and for each $k \ge 1$ and operator $a \in \Sigma_k$ an operation $h_k(a): B^k \to B$. An interpretation of $\Sigma$ is also called a "$\Sigma$-algebra" or "algebra of type $\Sigma$". An interpretation $(B, \{h_k\}_{k \ge 0})$ determines a mapping $h: T_\Sigma \to B$ (giving an interpretation to each term as an element of $B$) as follows:
(i) for $a \in \Sigma_0$, $h(a) = h_0(a)$;
(ii) for $k \ge 1$ and $a \in \Sigma_k$, $h(a[t_1 \ldots t_k]) = h_k(a)(h(t_1), \ldots, h(t_k))$.
(Such a mapping is also called a "homomorphism" from $T_\Sigma$ into $B$.) Thus the meaning of a tree is uniquely determined by the meaning of its subtrees and the interpretation of the operator applied to these subtrees. In general we can say that the meaning of a structured object is a function of the meanings of its substructures, the function being determined by the way the object is constructed from its substructures.

As an example, an interpretation of the above-mentioned ranked alphabet $\Sigma = \{e, a, b, f\}$ might for instance consist of a group $B$ with unity $h_0(e)$, multiplication $h_2(f)$ and two specific elements $h_0(a)$ and $h_0(b)$.
Or it might consist of B = {a, b}*, h_0(e) = λ, h_0(a) = a, h_0(b) = b and h_2(f) is concatenation. Note that in this case the mapping h: T_Σ → B is the yield!

It is now easy to see that a deterministic bottom-up fta with input alphabet Σ is nothing else but a Σ-algebra with a finite domain (its set of states). Such an automaton may therefore be used as a semantics defining device in case there are only a finite number of possible semantical values. Obviously, in general, one needs an infinite number of semantical values. However, it is not attractive to consider arbitrary infinite domains B, since this provides us with no knowledge about the structure of the elements of B. We therefore assume that the elements of B are structured objects: trees (or interpretations of them). Thus we consider Σ-algebras with domain T_Δ for some ranked alphabet Δ. Our complete semantics of T_Σ may then consist of two parts: an interpretation of T_Σ into T_Δ and an interpretation of T_Δ in some Δ-algebra. The interpretation of T_Σ into T_Δ may be realized by a tree transducer.

An example of an interpretation of T_Σ into T_Δ is the tree homomorphism of Definition 3.62. In fact each tree s ∈ T_Δ(X_k) may be viewed as an operation ŝ: T_Δ^k → T_Δ, defined by ŝ(s_1, …, s_k) = s⟨x_1 ← s_1, …, x_k ← s_k⟩. A tree homomorphism is then the same thing as an interpretation of Σ with domain T_Δ, where the allowable interpretations of the elements of Σ are the mappings ŝ above. Note that these interpretations are very natural, since the interpretation of a tree is obtained by "applying a finite number of Δ-operators to the interpretations of its subtrees".

To show the relevance of tree homomorphisms (and therefore tree transducers in general) to the semantics of context-free languages we consider the following very simple example.

(4.1) Example. Consider a context-free grammar generating expressions by the rules E → E+T, T → T∗F, E → a, T → a, F → a and F → (E).
Suppose we want to translate each expression into the equivalent post-fix expression. To do this we consider the rule tree language corresponding to this grammar and apply to the rule trees in this language the tree homomorphism h defined by
h_2(E → E+T) = E[x_1 x_2 +],
h_2(T → T∗F) = T[x_1 x_2 ∗],
h_0(E → a) = E[a], h_0(T → a) = T[a], h_0(F → a) = F[a],
h_1(F → (E)) = F[x_1].
A rule tree corresponding to an expression is translated into a tree whose yield is the corresponding post-fix expression. For instance, the rule tree
(E → E+T)[ (E → a) (T → T∗F)[ (T → a) (F → a) ] ]
is translated into
E[ E[a] T[ T[a] F[a] ∗ ] + ],
so that a+a∗a is translated into aaa∗+. Note that, moreover, the transformed tree is the derivation tree of the post-fix expression in the context-free grammar with rules E → ET+, etc. Instead of interpreting this derivation tree as its yield (the post-fix expression), one might also interpret it as, for instance, a sequence of machine code instructions, like "load a; load a; load a; multiply; add". ///

It is not difficult to see that the syntax-directed translation schemes of [A&U, I.3] correspond in some way to linear, nondeleting homomorphisms working on rule tree languages.

To arrive at our general notion of tree transducer, we combine the finite tree automaton and the tree homomorphism into a "tree homomorphism with states" or a "finite tree automaton with output". This tree transducer will not any more be an interpretation of Σ into T_Δ, but involves a generalization of this concept (although, by replacing T_Δ by some other set, it can again be formulated as an interpretation of Σ).
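The translation of Example 4.1 can be sketched in a few lines of Python. This is only an illustration and not part of the original notes: trees are encoded as (label, children) pairs, a variable x_i in a right-hand side is encoded as the integer i, and the names `substitute`, `hom`, `tree_yield` and `H` are hypothetical. The rule set is assumed to read E → E+T, T → T∗F, E → a, T → a, F → a, F → (E).

```python
# A tree is a pair (label, children); a leaf has an empty child list.
# In a right-hand side, the integer i stands for the variable x_i.

def substitute(template, args):
    """Return template with every variable i replaced by args[i - 1]."""
    if isinstance(template, int):
        return args[template - 1]
    label, children = template
    return (label, [substitute(c, args) for c in children])

def hom(h, tree):
    """Apply the tree homomorphism given by table h, bottom-up."""
    label, children = tree
    images = [hom(h, c) for c in children]
    return substitute(h[label], images)

def tree_yield(tree):
    """Concatenate the leaf labels from left to right."""
    label, children = tree
    if not children:
        return label
    return "".join(tree_yield(c) for c in children)

# Assumed reading of the homomorphism h of Example 4.1 (one entry per rule).
H = {
    "E->E+T": ("E", [1, 2, ("+", [])]),
    "T->T*F": ("T", [1, 2, ("*", [])]),
    "E->a":   ("E", [("a", [])]),
    "T->a":   ("T", [("a", [])]),
    "F->a":   ("F", [("a", [])]),
    "F->(E)": ("F", [1]),
}

# Rule tree of the expression a+a*a.
rule_tree = ("E->E+T", [("E->a", []),
                        ("T->T*F", [("T->a", []), ("F->a", [])])])
postfix = tree_yield(hom(H, rule_tree))
```

Under these assumptions `postfix` is the yield of the transformed tree for a+a∗a. The function `hom` does not depend on this particular table, so any family of mappings h_k written in the same encoding can be plugged in.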
Two ideas occur in this generalization.

(4.2) The translation (meaning) of a tree may depend not only on the translation of its subtrees but also on certain properties of these subtrees. Assuming that these properties are recognizable (that is, the set of all trees having the property is in RECOG), they may be represented as states of a (deterministic) bottom-up fta. Thus we can combine the deterministic bottom-up fta and the tree homomorphism by associating to each symbol a ∈ Σ_k a mapping f_a: Q^k → Q × T_Δ(X_k). f_a can be split up into two mappings δ_a: Q^k → Q and h_a: Q^k → T_Δ(X_k). The δ-functions determine a mapping δ̂: T_Σ → Q, as for the bottom-up fta, and the h-functions determine an output mapping ĥ: T_Σ → T_Δ by the formula (cf. the corresponding tree homomorphism formula):
ĥ(a[t_1 ⋯ t_k]) = h_a(δ̂(t_1), …, δ̂(t_k))⟨x_1 ← ĥ(t_1), …, x_k ← ĥ(t_k)⟩.
Thus our tree transducer works through the tree in a bottom-up fashion just like the bottom-up fta, but at each step it produces output by combining the output trees, already obtained from the subtrees, into one new output tree. Note that, if we allow our bottom-up tree transducer to be nondeterministic, then the above formula for ĥ is intuitively wrong (we need "deterministic substitution"). ///

(4.3) To obtain the translation of the input tree one may need several different translations of each subtree. Suppose that one needs m different kinds of translation of each tree (where one of them is the "main meaning" and the others are "auxiliary meanings"), then these may be realized by m states of the transducer, say q_1, …, q_m. The i-th translation may then be specified by associating to each a ∈ Σ_k a tree h_{q_i}(a) ∈ T_Δ(Y_{m,k}), where Y_{m,k} = {y_{r,j} | 1 ≤ r ≤ m, 1 ≤ j ≤ k}. The i-th translation of a tree a[t_1
⋯ t_k] may then be defined by the formula
ĥ_{q_i}(a[t_1 ⋯ t_k]) = h_{q_i}(a)⟨y_{r,j} ← ĥ_{q_r}(t_j)⟩ (1 ≤ r ≤ m, 1 ≤ j ≤ k).
Thus the translation of a tree is expressed in terms of all possible translations of its subtrees. Realizing such a translation in a bottom-up fashion would mean that we should compute all m possible translations of each tree in parallel, whereas working in a top-down way we know exactly from h_{q_i}(a) which translations of which subtrees are needed (note that, in general, not all elements of Y_{m,k} appear in h_{q_i}(a)). Therefore, such a translation seems to be realized best by a top-down tree transducer. We note that the generalized syntax-directed translation scheme of [A&U, II.9.3] corresponds to such a top-down tree transducer working on a rule tree language. ///

As already indicated in Example 4.1, tree transducers are of interest to the translation of context-free languages (in particular the context-free part of a programming language). For this reason we often restrict the tree transducer to a rule tree language, a local tree language or a recognizable tree language (the difference being slight: a projection). This restriction is also of interest from a linguistical point of view: a natural language may be described by a context-free set of kernel sentences to which transformations may be applied, working on the derivation trees (as for instance the transformation active - passive). The language then consists of all transformations of kernel sentences. We note that if derivation tree d_1 of sentence s_1 is transformed into tree d_2 with yield s_2, then the sentence s_2 is said to have "deep structure" d_1 and "surface structure" d_2.

(4.two) Top-down and bottom-up tree transducers.

Since tree transducers define tree transformations (recall the definition of a tree transformation), we start by recalling some terminology concerning relations.
We note first that, for ranked alphabets Σ and Δ, we shall identify any mapping f: T_Σ → T_Δ with the tree transformation {(s, t) | f(s) = t}, and we shall identify any mapping f: T_Σ → P(T_Δ) with the tree transformation {(s, t) | t ∈ f(s)}.

(4.4) Definition. Let Σ, Δ and Ω be ranked alphabets. If M_1 ⊆ T_Σ × T_Δ and M_2 ⊆ T_Δ × T_Ω, then the composition of M_1 and M_2, denoted by M_1 ∘ M_2, is the tree transformation {(s, t) ∈ T_Σ × T_Ω | (s, u) ∈ M_1 and (u, t) ∈ M_2 for some u ∈ T_Δ}. If F and G are […]

[…] the relation ⇒ on Δ* is defined as follows. For strings s, t ∈ Δ*, s ⇒ t if and only if there exist a rule scheme (v, w, D) in R, strings φ_1, …, φ_k in D(x_1), …, D(x_k) respectively (where X_k is the domain of D), and strings α and β in Δ* such that
s = α · v⟨x_1 ← φ_1, …, x_k ← φ_k⟩ · β and t = α · w⟨x_1 ← φ_1, …, x_k ← φ_k⟩ · β.
As usual ⇒* denotes the transitive-reflexive closure of ⇒. ///

For convenience we shall, in what follows, use the word "rule" rather than "rule scheme". Of course, in a rewriting system with variables, the ranges of the variables should be specified in some effective way (note that we would like the relation ⇒ to be decidable). In what follows we shall only use the case that the variables range over recognizable tree languages.

(4.11) Examples.
(1) Consider the rewriting system with variables G = (Δ, R), where Δ = {a, b, c} and R consists of the one rule a x_1 c → aa x_1 bcc, where D(x_1) = b*. Then, for instance, aabbcc ⇒ aaabbbccc (by application of the ordinary rewriting rule abbc → aabbbcc obtained by substituting bb for x_1 in the rule above). It is easy to see that {w ∈ Δ* | abc ⇒* w} = {a^n b^n c^n | n ≥ 1}.
(2) Consider the rewriting system with variables G = (Δ, R), where Δ = {[, ], ∗, 1} and R consists of the rules [x_1 ∗ x_2 1] → [x_1 ∗ x_2] x_1 and [x_1 ∗ 1] → x_1, where in both rules D(x_1) = D(x_2) = 1*. It is easy to see that, for arbitrary u, v, w ∈ 1⁺, [u ∗ v] ⇒* w iff w is the product of u and v (in unary notation).
(3) The two-level grammar used to describe ALGOL 68 may be viewed as a rewriting system with variables.
The variables (= metanotions) range over context-free languages, specified by the metagrammar. ///

By specializing to trees we obtain the notion of tree rewriting system.

(4.12) Definition. A rewriting system with variables G = (Δ, R) is called a tree rewriting system if
(i) Δ = Σ ∪ {[, ]} for some ranked alphabet Σ;
(ii) for each rule (v, w, D) in R, v and w are trees in T_Σ(X_k) and, for 1 ≤ i ≤ k, D(x_i) ⊆ T_Σ (where X_k is the domain of D). ///

It should be clear that, for a tree rewriting system G = (Σ ∪ {[, ]}, R), if s ∈ T_Σ and s ⇒ t, then t ∈ T_Σ. In fact, the application of a rule to a tree consists of replacing some piece in the middle of the tree by some other piece, where the variables indicate how the subtrees of the old piece should be connected to the new one. As an example, if we have a rule a[b[x_1 x_2] b[x_3]] → b[x_2 a[x_1 d x_3]], then the application of this rule to a tree t (if possible) consists of replacing a subtree of t of the form a[b[t_1 t_2] b[t_3]], where t_1, t_2 and t_3 are in the ranges of x_1, x_2 and x_3, by the tree b[t_2 a[t_1 d t_3]]. Thus t is of the form α a[b[t_1 t_2] b[t_3]] β and is transformed into α b[t_2 a[t_1 d t_3]] β.

(4.13) Example. Let Σ_0 = {a}, Σ_1 = {b}, Δ_0 = {a}, Δ_2 = {b}, Ω_0 = {a}, Ω_1 = {∗, b} and Ω_2 = {b}.
(i) Consider the tree rewriting system G = (Ω ∪ {[, ]}, R), where R consists of the rules a → ∗[a] and b[∗[x_1]] → ∗[b[x_1 x_1]], and D(x_1) = T_Δ. Then, for instance,
b[b[a]] ⇒ b[b[∗[a]]] ⇒ b[∗[b[aa]]] ⇒ ∗[b[b[aa] b[aa]]].
It is easy to see that, if h is the tree homomorphism from T_Σ to T_Δ defined by h_0(a) = a and h_1(b) = b[x_1 x_1], then, for s ∈ T_Σ and t ∈ T_Δ, h(s) = t iff s ⇒* ∗[t].
(ii) Consider the tree rewriting system G' = (Ω ∪ {[, ]}, R'), where R' consists of the rules ∗[b[x_1]] → b[∗[x_1] ∗[x_1]] and ∗[a] → a, and D(x_1) = T_Σ. Then, for instance,
∗[b[b[a]]] ⇒ b[∗[b[a]] ∗[b[a]]] ⇒* b[b[aa] b[aa]].
It is easy to see that, if h is the homomorphism defined above, then, for s ∈ T_Σ and t ∈ T_Δ, h(s) = t iff ∗[s] ⇒* t. ///

The tree transducers to be defined will be a generalization of the generalized sequential machine working on strings, which is essentially a finite automaton with output. A (nondeterministic) generalized sequential machine is a 6-tuple M = (Q, Σ, Δ, δ, S, F), where Q is the set of states, Σ is the input alphabet, Δ the output alphabet, δ is a mapping Q × Σ → P(Q × Δ*), S is a set of initial states and F a set of final states. Intuitively, if δ(q, a) contains (q', w) then, in state q and scanning input symbol a, the machine M may go into state q' and add w to the output.

Formally we may define the functioning of M in several ways. As said before, the recursive definition (as for the fta) is too cumbersome (although it is the most exact one and should be used in very formal proofs). The other way is to describe the sequence of configurations the machine goes through during the translation of the input string. A configuration is usually a triple (v, q, s), where v is the output generated so far, q is the state and s is the rest of the input. If s = a s_1, then the next configuration might be (vw, q', s_1). A useful variation of this is to replace (v, q, s) by the string vqs ∈ Δ*QΣ*. The next configuration can now be obtained by applying the string rewriting rule qa → wq', thus v q a s_1 ⇒ v w q' s_1. Replacing δ by a corresponding set of rewriting rules, the string translation realized by M can be defined as
{(s, v) | q_0 s ⇒* v q_f for some q_0 ∈ S and q_f ∈ F}.

Let us first consider the bottom-up generalization of this machine to trees, which is conceptually easier than the top-down version, although perhaps less interesting. The bottom-up finite tree transducer goes through the input tree in the same way as the bottom-up fta, at each step producing a piece of output to which the already generated output is concatenated.
The transducer arrives at a node of rank k with a sequence of k states and a sequence of k output trees (one state and one output tree for each direct subtree of the node). The sequence of states and the label at the node determine (nondeterministically) a new state and a piece of output containing the variables x_1, …, x_k. The transducer processes the node by going into the new state and computing a new output tree by substituting the k output trees for x_1, …, x_k in the piece of output. There should be start states and output for each node of rank 0. If the transducer arrives at the top of the tree in a final state, then the computed output tree is the transformation of the input tree. (Cf. the story in (4.2).)

To be able to put the states of the transducer as labels on trees we make them into symbols of rank 1. The configurations of the bottom-up tree transducer will be elements of T_Σ(Q[T_Δ]), and the steps of the transducer (including the start steps) are modelled by the application of tree rewriting rules to these configurations. We now give the formal definition.

(4.14) Definition. A bottom-up (finite) tree transducer is a structure M = (Q, Σ, Δ, R, Q_d), where
Q is a ranked alphabet (of states), such that all elements of Q have rank 1 and no other ranks;
Σ is a ranked alphabet (of input symbols);
Δ is a ranked alphabet (of output symbols), Q ∩ (Σ ∪ Δ) = ∅;
Q_d is a subset of Q (the set of final states); and
R is a finite set of rules of one of the forms (i) or (ii):
(i) a → q[t], where a ∈ Σ_0, q ∈ Q and t ∈ T_Δ;
(ii) a[q_1[x_1] ⋯ q_k[x_k]] → q[t], where k ≥ 1, a ∈ Σ_k, q_1, …, q_k, q ∈ Q and t ∈ T_Δ(X_k).
M is viewed as a tree rewriting system over the ranked alphabet Q ∪ Σ ∪ Δ with R as the set of rules, such that the range of each variable occurring in R is T_Δ. Therefore the relations ⇒ and ⇒* are well defined according to Definition 4.10. The tree transformation realized by M, denoted by T(M) or simply M, is
{(s, t) ∈ T_Σ × T_Δ | s ⇒* q[t] for some q in Q_d}. ///

We shall abbreviate "finite tree transducer" by "ftt".

(4.15) Remark.
Note that T(M) is also denoted by M. In general, we shall often make no distinction between a tree transducer and the tree transformation it realizes. Hopefully this will not lead to confusion. ///

(4.16) Definition. The class of tree transformations realized by bottom-up ftt will be denoted by B. An element of B will be called a bottom-up tree transformation. ///

(4.17) Example. An example of a bottom-up ftt realizing a homomorphism was given in Example 4.13(i) (it had one state ∗). ///

(4.18) Example. Consider the bottom-up ftt M = (Q, Σ, Δ, R, Q_d), where Q = {q_0, q_1}, Σ_0 = {a, b}, Σ_2 = {f, g}, Δ_0 = {a, b}, Δ_2 = {m, n}, Q_d = {q_0} and the rules are
a → q_0[a], b → q_0[b],
f[q_i[x_1] q_j[x_2]] → q_{1−i}[m[x_1 x_1]],
f[q_i[x_1] q_j[x_2]] → q_{1−j}[m[x_2 x_2]],
g[q_i[x_1] q_j[x_2]] → q_{1−i}[n[x_1 x_1]],
g[q_i[x_1] q_j[x_2]] → q_{1−j}[n[x_2 x_2]],
for all i, j ∈ {0, 1}.
The transformation realized by M may be described by saying that, given some input tree t, M selects some path of even length through t, relabels f by m and g by n, and then doubles every subtree. For example f[a g[ab]] may be transformed into the tree m[n[bb] n[bb]], corresponding to the path fg. The tree g[ab] is not in the domain of M. ///

(4.19) Exercise. Construct bottom-up tree transducers M_1 and M_2 such that
(i) {(yield(s), yield(t)) | (s, t) ∈ T(M_1)} = {(a(cd)^n f e^n b, a c^n f d^{2n} b) | n ≥ 0};
(ii) M_2 deletes, given an input tree t, all subtrees t' of t such that yield(t') ∈ a⁺b⁺. ///

(4.20) Exercise. Give a recursive definition of the transformation realized by a bottom-up tree transducer (without using the notion of tree rewriting system). ///

(4.21) Exercise. Given a bottom-up ftt M with input alphabet Σ, find a suitable Σ-algebra such that M may be viewed as an interpretation of Σ into this Σ-algebra (cf. section 4.one). ///

We now define some subclasses of the class of bottom-up tree transformations.

(4.22) Definition. Let Σ be a ranked alphabet and k ≥ 0. A tree t in T_Σ(X_k) is linear if each element of X_k occurs at most once in t.
The tree t is called nondeleting with respect to X_k if each element of X_k occurs at least once in t. ///

(4.23) Definition. Let M = (Q, Σ, Δ, R, Q_d) be a bottom-up ftt.
M is called linear if the right hand side of each rule in R is linear.
M is called nondeleting if the right hand side of each rule in R is nondeleting with respect to X_k, where k is the rank of the input symbol in the left hand side.
M is called one-state (or pure) if Q is a singleton.
M is called (partial) deterministic if
(i) for each a ∈ Σ_0 there is at most one rule in R with left hand side a;
(ii) for each k ≥ 1, a ∈ Σ_k and q_1, …, q_k ∈ Q there is at most one rule in R with left hand side a[q_1[x_1] ⋯ q_k[x_k]].
M is called total deterministic if (i) and (ii) hold with "at most one" replaced by "exactly one", and Q_d = Q. ///

(4.24) Notation. The same terminology will be applied to the transformations realized by such transducers. Thus, for instance, a linear deterministic bottom-up tree transformation is one that can be realized by a linear deterministic bottom-up ftt.
The classes of tree transformations obtained by putting one or more of the above restrictions on the bottom-up tree transducers will be denoted by adding the symbols L, N, P, D and D_t (standing for linear, nondeleting, pure, deterministic and total deterministic respectively) to the letter B. Thus the class of linear deterministic bottom-up tree transformations is denoted by LDB. ///

(4.25) Example. Let Σ_0 = {e}, Σ_1 = {a, f}, Δ_0 = {e}, Δ_1 = {a, b} and Δ_2 = {f}. Consider the bottom-up ftt M = (Q, Σ, Δ, R, Q_d), where Q = Q_d = {∗} and R consists of the rules
e → ∗[e],
a[∗[x_1]] → ∗[a[x_1]], a[∗[x_1]] → ∗[b[x_1]],
f[∗[x_1]] → ∗[f[x_1 x_1]].
Then M ∈ PNB. ///

Let us make the following remarks about the concepts defined in Definition 4.23.

(4.26) Remarks.
(1) Deletion is different from erasing. A rule may be called erasing if its right hand side belongs to Q[X]. Thus, symbols of rank 0 cannot be erased.
Symbols of rank 1 can be erased without any deletion, but symbols of rank k ≥ 2 can only be erased by deleting also k−1 subtrees. Thus a nondeleting tree transducer is still able to erase symbols of rank 1.
(2) The one-state bottom-up tree transformations correspond intuitively to the finite substitutions in the string case.
(3) The total deterministic bottom-up ftt realize tree transformations which are total functions. ///

(4.27) Exercise. Show that, in the definition of "(partial) deterministic", we may replace the phrase "at most one" by "exactly one" without changing the corresponding class DB of deterministic bottom-up tree transformations. ///

In the next theorem we show that all relabelings, finite tree automaton mappings and tree homomorphisms are realizable by bottom-up ftt.

(4.28) Theorem.
(1) REL ⊆ PNLB,
(2) FTA ⊆ NLDB,
(3) HOM = PD_tB and LHOM = PLD_tB.

Proof. (1) Let r be a relabeling from T_Σ into T_Δ. Thus r is determined by a family of mappings r_k: Σ_k → P(Δ_k). Obviously the following bottom-up ftt realizes r: M = ({∗}, Σ, Δ, R, {∗}), where R is constructed as follows:
(i) for a ∈ Σ_0, if b ∈ r_0(a), then a → ∗[b] is in R;
(ii) for k ≥ 1 and a ∈ Σ_k, if b ∈ r_k(a), then a[∗[x_1] ⋯ ∗[x_k]] → ∗[b[x_1 ⋯ x_k]] is in R.
Clearly M ∈ PNLB.

(2) From the definition of FTA and from Part 3 it follows that we need only consider a deterministic bottom-up fta M = (Q, Σ, δ, s, F) and show that T(M) = {(t, t) | t ∈ L(M)} is realized by a bottom-up ftt. Consider the bottom-up ftt M' = (Q, Σ, Σ, R, F), where R is constructed as follows:
(i) for a ∈ Σ_0, a → q[a] is in R, where q = s_a;
(ii) for k ≥ 1 and a ∈ Σ_k, if δ_a^k(q_1, …, q_k) = q, then a[q_1[x_1] ⋯ q_k[x_k]] → q[a[x_1 ⋯ x_k]] is in R.
Clearly M' realizes T(M) and M' ∈ NLDB (the determinism of M' follows from that of M).

(3) We first show that HOM ⊆ PD_tB (and LHOM ⊆ PLD_tB). An example of this was already given in Example 4.13(i). Let h be a tree homomorphism from T_Σ into T_Δ determined by the mappings h_k: Σ_k → T_Δ(X_k).
Consider the bottom-up ftt M = ({∗}, Σ, Δ, R, {∗}), where R contains the following rules:
(i) for a ∈ Σ_0, a → ∗[h_0(a)] is in R;
(ii) for k ≥ 1 and a ∈ Σ_k, the rule a[∗[x_1] ⋯ ∗[x_k]] → ∗[h_k(a)] is in R.
Obviously M is in PD_tB (and linear if h is linear). Let us prove that M realizes h. Thus we have to show that, for s ∈ T_Σ and t ∈ T_Δ, h(s) = t iff s ⇒* ∗[t].

The proof is by induction on s. The case s ∈ Σ_0 is clear. Now let s = a[s_1 ⋯ s_k].
Suppose that h(s) = t. Then, by definition of h, t = h_k(a)⟨x_1 ← h(s_1), …, x_k ← h(s_k)⟩. By induction, s_i ⇒* ∗[h(s_i)] for all i, 1 ≤ i ≤ k. Hence (but note that formally this needs a proof)
a[s_1 ⋯ s_k] ⇒* a[∗[h(s_1)] ⋯ ∗[h(s_k)]].
But, by rule (ii) above,
a[∗[h(s_1)] ⋯ ∗[h(s_k)]] ⇒ ∗[h_k(a)⟨x_1 ← h(s_1), …, x_k ← h(s_k)⟩].
Consequently s ⇒* ∗[t].
Now suppose that s = a[s_1 ⋯ s_k] ⇒* ∗[t]. Then (and again this needs a formal proof) there are trees t_1, …, t_k such that s_i ⇒* ∗[t_i] for 1 ≤ i ≤ k and
a[s_1 ⋯ s_k] ⇒* a[∗[t_1] ⋯ ∗[t_k]] ⇒ ∗[h_k(a)⟨x_1 ← t_1, …, x_k ← t_k⟩] = ∗[t].
By induction, t_i = h(s_i) for all i, 1 ≤ i ≤ k. Hence
t = h_k(a)⟨x_1 ← h(s_1), …, x_k ← h(s_k)⟩ = h(s).
This proves that (L)HOM ⊆ P(L)D_tB.

To show the converse, consider a one-state total deterministic bottom-up tree transducer M = ({∗}, Σ, Δ, R, {∗}). Define the tree homomorphism h from T_Σ into T_Δ as follows:
(i) for a ∈ Σ_0, h_0(a) is the tree t occurring in the (unique) rule a → ∗[t] in R;
(ii) for k ≥ 1 and a ∈ Σ_k, h_k(a) is the tree t occurring in the (unique) rule a[∗[x_1] ⋯ ∗[x_k]] → ∗[t] in R.
Then, obviously by the same proof as above, h = T(M). ///

(4.29) Exercise. Prove that the domain of a bottom-up tree transformation is a recognizable tree language, and vice versa. ///

Let us now consider the top-down generalization of the generalized sequential machine. The top-down finite tree transducer goes through the input tree in the same way as the top-down fta, at each step producing a piece of output to which the (unprocessed) rest of the input is concatenated.
Note therefore that the transducer does not really "go through" the input tree in the same way as the bottom-up ftt does, since in the top-down case the rest of the input may be modified (deleted, permuted, copied) during translation, whereas in the bottom-up case the rest of the input is unmodified during translation.

The top-down transducer arrives at a node of rank k in a certain state. At that moment the configuration is an element of T_Δ(Q[T_Σ]), where Σ and Δ are the input and output alphabet, and Q the set of states. The state and the label at the node determine (nondeterministically) a piece of output containing the variables x_1, …, x_k, and states with which to continue the translation of the subtrees. These states are also specified in the piece of output, which is in fact a tree in T_Δ(Q[X_k]), where an occurrence of q[x_i] means that the processing of the i-th subtree should, at this point, be continued in state q. The transducer processes the node by replacing it and its direct subtrees by the piece of output, in which the k subtrees are substituted for the variables x_1, …, x_k. The processing of (all copies of) the subtrees is continued as indicated above.

The transducer starts at the root of the input tree in some initial state. There should be final states and output for each node of rank 0. If the transducer arrives in a final state at each leaf, then it replaces each leaf by the final output, and the resulting tree is the transformation of the input tree. (Cf. the story in (4.3).)

The steps of the transducer, including the final steps, are modelled by the application of rewriting rules to the elements of T_Δ(Q[T_Σ]). We now give the formal definition.

(4.30) Definition.
A top-down (finite) tree transducer is a structure M = (Q, Σ, Δ, R, Q_d), where Q, Σ and Δ are as for the bottom-up ftt, Q_d is a subset of Q (the set of initial states), and R is a finite set of rules of one of the forms (i) or (ii):
(i) q[a[x_1 ⋯ x_k]] → t, where k ≥ 1, a ∈ Σ_k, q ∈ Q and t ∈ T_Δ(Q[X_k]);
(ii) q[a] → t, where q ∈ Q, a ∈ Σ_0 and t ∈ T_Δ.
M is viewed as a tree rewriting system over the ranked alphabet Q ∪ Σ ∪ Δ with R as the set of rules, such that the range of each variable in X is T_Σ. The tree transformation realized by M, denoted by T(M) or simply M, is
{(s, t) ∈ T_Σ × T_Δ | q[s] ⇒* t for some q in Q_d}. ///

(4.31) Definition. The class of tree transformations realized by top-down ftt will be denoted by T. An element of T will be called a top-down tree transformation. ///

(4.32) Example. An example of a top-down ftt realizing a homomorphism was given in Example 4.13(ii) (it had one state ∗). ///

The next example is a top-down ftt computing the formal derivative of an arithmetic expression.

(4.33) Example. Consider the top-down ftt M = (Q, Σ, Δ, R, Q_d), where Σ_0 = {a, b}, Δ_0 = {a, b, 0, 1}, Σ_1 = Δ_1 = {−, sin, cos}, Σ_2 = Δ_2 = {+, ∗}, Q = {q, i}, Q_d = {q}, the rules for q are
q[+[x_1 x_2]] → +[q[x_1] q[x_2]],
q[∗[x_1 x_2]] → +[∗[q[x_1] i[x_2]] ∗[i[x_1] q[x_2]]],
q[−[x_1]] → −[q[x_1]],
q[sin[x_1]] → ∗[cos[i[x_1]] q[x_1]],
q[cos[x_1]] → ∗[−[sin[i[x_1]]] q[x_1]],
q[a] → 1, q[b] → 0,
and the rules for i are
i[+[x_1 x_2]] → +[i[x_1] i[x_2]],
i[∗[x_1 x_2]] → ∗[i[x_1] i[x_2]],
i[−[x_1]] → −[i[x_1]],
i[sin[x_1]] → sin[i[x_1]],
i[cos[x_1]] → cos[i[x_1]],
i[a] → a and i[b] → b.
Then (t, s) ∈ T(M) iff s is the formal derivative of t with respect to a. For instance,
q[∗[+[ab] −[a]]] ⇒* +[∗[+[10] −[a]] ∗[+[ab] −[1]]].
Note that i[t_1] ⇒* t_2 iff t_1 = t_2 (t_1, t_2 ∈ T_Σ). ///

(4.34) Exercise. Let Σ_0 = {a, b}, Σ_1 = {¬} and Σ_2 = {∧, ∨}. T_Σ may be viewed as the set of all boolean expressions over the boolean variables a and b, using negation, conjunction and disjunction. Write a top-down tree transducer which transforms every boolean expression into an equivalent one in which a and b are the only subexpressions which may be negated.
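The derivative transducer of Example 4.33 is deterministic, so it can be run by a direct recursive function. The following Python sketch is an illustration only: the encoding of trees as (label, children) pairs, the `Call` marker for an occurrence q[x_i] in a right-hand side, and the names `RULES`, `run`, `expand`, `node` and `show` are all assumptions introduced here, and the rule table follows the reading of the example with states q (differentiate) and i (copy).

```python
from collections import namedtuple

# Call(state, index) in a right-hand side stands for state[x_index].
Call = namedtuple("Call", ["state", "index"])

def node(label, *children):
    """Build a tree (label, children)."""
    return (label, list(children))

# Rules for the differentiating state q (assumed reading of Example 4.33).
RULES = {
    ("q", "+"):   node("+", Call("q", 1), Call("q", 2)),
    ("q", "*"):   node("+", node("*", Call("q", 1), Call("i", 2)),
                            node("*", Call("i", 1), Call("q", 2))),
    ("q", "-"):   node("-", Call("q", 1)),
    ("q", "sin"): node("*", node("cos", Call("i", 1)), Call("q", 1)),
    ("q", "cos"): node("*", node("-", node("sin", Call("i", 1))), Call("q", 1)),
    ("q", "a"):   node("1"),
    ("q", "b"):   node("0"),
}
# Rules for the copying state i: reproduce the symbol, continue in state i.
for sym, rank in [("+", 2), ("*", 2), ("-", 1), ("sin", 1), ("cos", 1),
                  ("a", 0), ("b", 0)]:
    RULES[("i", sym)] = node(sym, *[Call("i", j) for j in range(1, rank + 1)])

def run(state, tree):
    """Rewrite state[tree] top-down until no state symbol is left."""
    label, children = tree
    return expand(RULES[(state, label)], children)

def expand(rhs, subtrees):
    """Instantiate a right-hand side, processing each q[x_i] recursively."""
    if isinstance(rhs, Call):
        return run(rhs.state, subtrees[rhs.index - 1])
    label, children = rhs
    return (label, [expand(c, subtrees) for c in children])

def show(tree):
    """Print a tree in bracket notation, children separated by spaces."""
    label, children = tree
    if not children:
        return label
    return label + "[" + " ".join(show(c) for c in children) + "]"

expr = node("*", node("+", node("a"), node("b")), node("-", node("a")))
derivative = run("q", expr)
```

With this table, `show(derivative)` gives the output tree of the derivation shown in the example, and `run("i", t)` returns a copy of t. A nondeterministic top-down ftt would need `run` to return a set of trees instead of a single tree.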
(4.35) Exercise. Give a recursive definition of the transformation realized by a top-down ftt. Find a suitable Σ-algebra such that the top-down ftt may be viewed as an interpretation of Σ into this Σ-algebra. ///

As in the bottom-up case we define some subclasses of T.

(4.36) Definition. Let M = (Q, Σ, Δ, R, Q_d) be a top-down tree transducer. The definitions of linear, nondeleting and one-state are identical to the bottom-up ones (Definition 4.23).
M is called (partial) deterministic if
(i) Q_d is a singleton;
(ii) for each q ∈ Q, k ≥ 1 and a ∈ Σ_k, there is at most one rule in R with left hand side q[a[x_1 ⋯ x_k]]; and
(iii) for each q ∈ Q and a ∈ Σ_0 there is at most one rule in R with left hand side q[a].
M is called total deterministic if (i), (ii) and (iii) hold with "at most one" replaced by "exactly one". ///

Notation 4.24 also applies to the top-down case. Thus, PLT is the class of one-state linear top-down tree transformations.

(4.37) Example. Let Σ_0 = {e}, Σ_1 = {a, f}, Δ_0 = {e}, Δ_1 = {a, b} and Δ_2 = {f}. Consider the top-down tree transducer M = (Q, Σ, Δ, R, Q_d) with Q = Q_d = {∗} and R consists of the rules
∗[f[x_1]] → f[∗[x_1] ∗[x_1]],
∗[a[x_1]] → a[∗[x_1]], ∗[a[x_1]] → b[∗[x_1]],
∗[e] → e.
Then M ∈ PNT. ///

Remarks 4.26 also […]

[…] q[b[x_1 ⋯ x_k]] with k ≥ 1, q ∈ Q, a ∈ Σ_k and b ∈ Δ_k, or of the form a → q[b] with q ∈ Q, a ∈ Σ_0 and b ∈ Δ_0). ///

It is clear that the classes of top-down and bottom-up finite state relabelings coincide. This class will be denoted by QREL. The classes of deterministic top-down and deterministic bottom-up finite state relabelings obviously do not coincide. They will be denoted by DTQREL and DBQREL respectively. Note that FTA ∪ REL ⊆ QREL ⊆ NLB ∩ NLT.

Apart from the tree transformation realized by a tree transducer we will also be interested in the image of a recognizable tree language under a tree transformation and the yield of that image.

(4.41) Definition. Let K be a class of tree transformations. A K-surface tree language is a language M(L) with M ∈ K and L ∈ RECOG.
A K-target language is the yield of a K-surface language. A K-translation is a string relation {(yield(s), yield(t)) | (s, t) ∈ M and s ∈ L} for some M ∈ K and L ∈ RECOG. The classes of K-surface and K-target languages will be denoted by K-Surface and K-Target respectively. ///

It is clear that, for all classes K discussed so far, since the identity transformation is in K, RECOG ⊆ K-Surface (and so CFL ⊆ K-Target). Moreover it is clear from the proof of Theorem 3.64 that if HOM ⊆ K then the above inclusions are proper.

(4.three) Comparison of B and T, the nondeterministic case.

The main differences between the bottom-up and the top-down tree transducer are the following.

Property (B). Nondeterminism followed by copying. A bottom-up ftt has the ability of first processing an input subtree nondeterministically and then copying the resulting output tree.

Property (T). Copying followed by different processing (by nondeterminism or by different states). A top-down ftt has the ability of first copying an input subtree and then treating the resulting copies differently.

Property (B'). Checking followed by deletion. A bottom-up ftt has the ability of first processing an input subtree and then deleting the resulting output subtree. In other words, depending on a (recognizable) property of the input subtree, it can decide whether to delete the output subtree or do something else with it.

It should be intuitively clear that top-down ftt do not possess properties (B) and (B'), whereas bottom-up ftt do not have property (T). We now show that these differences also result in differences in the corresponding classes of tree transformations.

(4.42) Notation. For any alphabet Σ, not containing the brackets [ and ], we define a function m: Σ⁺ → (Σ ∪ {[, ]})⁺ as follows: for a ∈ Σ and w ∈ Σ⁺, m(a) = a and m(aw) = a[m(w)]. For instance, m(aab) = a[a[b]]. Note that m is a kind of converse to the mapping discussed after Definition 2.27. A tree of the form m(w) will also be called a monadic tree. ///

(4.43) Theorem.
The classes of bottom-up and top-down tree transformations are incomparable. In particular, there are tree transformations in PNB − T and in PNT − B.

Proof. (1) Consider the bottom-up ftt M of Example 4.25. M is in PNB and is a typical example of an ftt having property (B). It is intuitively clear that M is not realizable by a top-down ftt. In fact, consider for each n ≥ 1 the specific input tree f[m(a^n e)]. This tree is nondeterministically transformed by M into all trees of the form f[m(we) m(we)], where w is a string over {a, b} of length n. Suppose that the top-down ftt N = (Q', Σ, Δ, R', Q'_d) could do the same transformation of these input trees. Then, roughly speaking, N would first have to make a copy of m(a^n e) and would then have to relabel the two copies in an arbitrary but identical way, which is clearly impossible.

A formal proof goes as follows. If N realizes the same transformation, then, for each n ≥ 1 and each w ∈ {a, b}* of length n, there is a derivation
q_0[f[m(a^n e)]] ⇒* f[m(we) m(we)]
for some q_0 in Q'_d. Let us consider a fixed n. Consider, in each of these 2^n derivations, the first string of the form f[t_1 t_2]; that is, consider the moment that f is produced as output. Note that this is not necessarily the second string of the derivation, since the transducer may first erase the input symbol f and some of the a's before producing any output (thus the derivation may look like
q_0[f[m(a^n e)]] ⇒* q[a[m(a^k e)]] ⇒ f[t_1 t_2] ⇒* f[m(we) m(we)]
for some q ∈ Q' and some k, 0 ≤ k ≤ n−1, or even like
q_0[f[m(a^n e)]] ⇒* q[e] ⇒ f[t_1 t_2] ⇒* f[m(we) m(we)]
for some q ∈ Q'). Obviously, for different derivations these strings have to be different: if, for w ≠ w', both t_1 ⇒* m(we), t_2 ⇒* m(we) and t_1 ⇒* m(w'e), t_2 ⇒* m(w'e), then also f[t_1 t_2] ⇒* f[m(we) m(w'e)], which is an invalid output. Therefore there are 2^n of such strings f[t_1 t_2]. However it is clear that f[t_1 t_2] is of the form f[t̄_1 t̄_2]⟨x_1 ← m(a^k e)⟩, where 0 ≤ k ≤ n and f[t̄_1 t̄_2] is the right hand side of a rule in R'. Therefore the number of possible f[t_1 t_2]'s is less than (n+1)·r, where r = #(R').
For n sufficiently large this is a contradiction.

(2) Consider now the top-down ftt M of Example 4.37. M is in PNT and is a typical example of an ftt having property (T). Suppose that M can be realized by a bottom-up ftt N = (Q', Σ, Δ, R', Q'_d). Consider again for each n ≥ 1 the specific input tree f[m(a^n e)]. This tree should be transformed by N into all trees of the form f[m(w_1 e) m(w_2 e)] for w_1, w_2 ∈ {a, b}* of length n. Let us consider, in each of the derivations realizing this transformation, the first string which contains the output symbol f. Note that this is not necessarily the last string, since N may end its computation by erasing a number of a's and the input f. Obviously, this string is obtained from the previous one by application of a rule with right hand side of the form q[f[t̄_1 t̄_2]], where q ∈ Q', t̄_1, t̄_2 ∈ T_Δ({x_1}) and there is an s such that t̄_1⟨x_1 ← s⟩ = m(w_1 e) and t̄_2⟨x_1 ← s⟩ = m(w_2 e). Obviously, if f[t̄_1 t̄_2] contains no x_1 or only one x_1, then the rule can only be used for exactly one input tree f[m(a^n e)]. Thus we may choose n such that in all derivations starting with f[m(a^n e)] the right hand side q[f[t̄_1 t̄_2]] contains two x_1's (it cannot contain more). Thus q[f[t̄_1 t̄_2]] is of the form q[f[m(v_1 x_1) m(v_2 x_1)]] for certain v_1, v_2 ∈ {a, b}*. By choosing n larger than the length of all such v_1's and v_2's occurring in right hand sides of rules in R', we see that the output tree always has two equal subtrees ≠ e: it has to be of the form f[m(v_1 we) m(v_2 we)] for some w ∈ {a, b}⁺. Thus, for such an n, not all possible outputs are produced. This is a contradiction. ///

An important property of a class F of tree transformations is whether it is closed under composition. If so, then we know that each sequence of transformations from F can be realized by one tree transducer (corresponding to the class F). We then also know that the class of F-surface tree languages is closed under the transformations of F.
The next theorem shows that, unfortunately, the classes of top-down and bottom-up tree transformations are not closed under composition. This nonclosure is caused by the failure of property (B) for top-down transformations (property (T) for bottom-up transformations).
(4.44) Theorem. T and B are not closed under composition. In particular, there are tree transformations in (REL ∘ HOM) − T and in (HOM ∘ REL) − B.
Proof. (1) The bottom-up ftt M of Example 4.25 can be realized by the composition of a relabeling and a homomorphism. Let Ω_0 = {e} and Ω_1 = {a, b, f}. Let r be the relabeling from Σ into Ω defined by r_0(e) = {e}, r_1(a) = {a,b} and r_1(f) = {f}. Let h be the tree homomorphism from Ω into Δ defined by h_0(e) = e, h_1(a) = a[x_1], h_1(b) = b[x_1] and h_1(f) = f[x_1 x_1]. Then, for all s ∈ T_Σ and t ∈ T_Δ, (s,t) ∈ M iff there exists u in T_Ω such that u ∈ r(s) and h(u) = t. Thus, by the first part of the proof of Theorem 4.43, M is in (REL ∘ HOM) − T.
(2) The top-down ftt M of Example 4.33 can be realized by the composition of a homomorphism and a relabeling. Let Π be the ranked alphabet with Π_0 = {e}, Π_1 = {a} and Π_2 = {f}. Let h be the tree homomorphism from Σ into Π defined by h_0(e) = e, h_1(a) = a[x_1] and h_1(f) = f[x_1 x_1]. Let r be the relabeling from Π into Δ defined by r(e) = {e}, r(a) = {a,b} and r(f) = {f}. Then, for all s ∈ T_Σ and t ∈ T_Δ, (s,t) ∈ M iff there exists u in T_Π such that h(s) = u and t ∈ r(u). Thus, by the second part of the proof of Theorem 4.43, M is in (HOM ∘ REL) − B. ///
(4.45) Exercise. Prove the statements in the above proof. ///
One might get the impression that each bottom-up (resp. top-down) tree transformation can be realized by two top-down (resp. bottom-up) tree transducers (i.e. B ⊆ T ∘ T, resp. T ⊆ B ∘ B). We shall show later that this is true.
Let us now consider the linear case. Since properties (B) and (T) are now eliminated, the only remaining difference between linear top-down and bottom-up tree transducers is caused by property (B').
(4.46) Lemma.
There is a tree transformation M that belongs to LDB, but not to T. M can be realized by the composition of a deterministic top-down fta with a linear homomorphism.
Proof. Let Σ_0 = {c}, Σ_1 = {b}, Σ_2 = {a}, Δ_0 = {c} and Δ_1 = {a,b}. Consider the tree transformation M = {(a[tc], a[t]) | t = m(b^n c) for some n ≥ 0}. We shall show that M ∉ T. The rest of the proof is left as an exercise. Suppose that there is a top-down ftt N = (Q, Σ, Δ, R, Q_d) such that T(N) = M. Each successful derivation of N has to start with the application of a rule q_0[a[x_1 x_2]] → s, where q_0 ∈ Q_d and s ∈ T_Δ(Q[X_2]). Now, if s contains no x_1, then we could change the input a[tc] into a[t'c] without changing the output. If s contains no x_2, then we could change a[tc] into a[tb[c]] and still obtain (the same) output. But if s contains both x_1 and x_2, then s has to contain a symbol of rank 2, and so a[t] cannot be derived. ///
Since both deterministic top-down fta and linear homomorphisms belong to LDT we can state the following corollary.
(4.47) Corollary. Composition of linear deterministic top-down tree transformations leads out of the class of top-down tree transformations (in a formula: (LDT ∘ LDT) − T ≠ ∅). ///
We now show that, in some sense, property (B') is the only cause of difference between linear bottom-up and linear top-down tree transformations. Firstly, all linear top-down tree transformations can be realized linear bottom-up. Secondly, in the nondeleting linear case, all differences between top-down and bottom-up are gone (this can be considered as a generalization of Theorem 3.17).
(4.48) Theorem. (1) LT ⊊ LB. (2) NLT = NLB.
Proof. We first show part (2). Let us say that a nondeleting linear bottom-up ftt M = (Q, Σ, Δ, R, Q_d) and a nondeleting linear top-down ftt N = (Q', Σ', Δ', R', Q'_d) are "associated" if Q = Q', Σ = Σ', Δ = Δ', Q_d = Q'_d and
(i) for each a ∈ Σ_0, q ∈ Q and t ∈ T_Δ, a → q[t] is in R iff q[a] → t is in R';
(ii) for each k ≥ 1, a ∈ Σ_k, q_1, ..., q_k, q ∈ Q and t ∈ T_Δ(X_k), linear and nondeleting w.r.t. X_k,
a[q_1[x_1] ... q_k[x_k]] → q[t] is in R iff q[a[x_1 ... x_k]] → t⟨x_1 ← q_1[x_1], ..., x_k ← q_k[x_k]⟩ is in R'.
Note that each tree r ∈ T_Δ(Q[X_k]), which is linear and nondeleting w.r.t. X_k, is of the form t⟨x_1 ← q_1[x_1], ..., x_k ← q_k[x_k]⟩, where t ∈ T_Δ(X_k) is linear and nondeleting w.r.t. X_k (in fact, t is the result of replacing q_i[x_i] by x_i in r). Therefore it is clear that for each M ∈ NLB there exists an associated N ∈ NLT and vice versa. Hence it suffices to prove that associated ftt realize the same tree transformation. Let M and N be associated as above. We shall prove, by induction on s, that for every q ∈ Q, s ∈ T_Σ and u ∈ T_Δ,
(*) s ⇒*_M q[u] iff q[s] ⇒*_N u.
For s ∈ Σ_0, (*) is obvious. Suppose now that s = a[s_1 ... s_k] for some k ≥ 1, a ∈ Σ_k and s_1, ..., s_k ∈ T_Σ. The only-if part of (*) is left to the reader (it is similar to the proof of Theorem 4.28(3)). The if-part of (*) is proved as follows (it is similar to the proof of Theorem 3.65). Let the first rule applied in the derivation q[a[s_1 ... s_k]] ⇒* u be q[a[x_1 ... x_k]] → r, and let t be such that r = t⟨x_1 ← q_1[x_1], ..., x_k ← q_k[x_k]⟩. Since t is linear and nondeleting, there exist u_1, ..., u_k ∈ T_Δ such that u = t⟨x_1 ← u_1, ..., x_k ← u_k⟩ and q_i[s_i] ⇒* u_i for all i, 1 ≤ i ≤ k. Hence, by induction, s_i ⇒* q_i[u_i] for all i, 1 ≤ i ≤ k. Also, by associatedness, the rule a[q_1[x_1] ... q_k[x_k]] → q[t] is in R. Consequently, a[s_1 ... s_k] ⇒* a[q_1[u_1] ... q_k[u_k]] ⇒ q[t⟨x_1 ← u_1, ..., x_k ← u_k⟩] = q[u].
We now show part (1). By Lemma 4.46, it suffices to show that LT ⊆ LB. In principle we can use the construction used above to show NLT ⊆ NLB. The only problem is that the top-down transducer N may delete subtrees, whereas a bottom-up transducer is forced to process a subtree before deleting it. The solution is to add an "identity state" d to the set of states of M, which allows M to process any subtree which has to be deleted (d is such that, for all t ∈ T_Σ, t ⇒* d[t]). The formal construction is as follows. Let N = (Q, Σ, Δ, R, Q_d) be a linear top-down ftt. Construct the linear bottom-up ftt M = (Q ∪ {d}, Σ, Δ ∪ Σ, R_M, Q_d), where R_M is obtained as follows.
(1) For each a ∈ Σ_0, the rule a → d[a] is in R_M, and for each k ≥ 1 and a ∈ Σ_k the rule a[d[x_1] ... d[x_k]] → d[a[x_1 ... x_k]] is in R_M.
(2) For q ∈ Q, a ∈ Σ_0 and t ∈ T_Δ, if q[a] → t is in R, then a → q[t] is in R_M.
(3) Let q[a[x_1 ... x_k]] → t be in R, where q ∈ Q, k ≥ 1, a ∈ Σ_k and t is a linear tree in T_Δ(Q[X_k]). Determine the (unique) states q_1, ..., q_k ∈ Q ∪ {d} such that, for 1 ≤ i ≤ k, either q_i[x_i] occurs in t or (x_i does not occur in t and) q_i = d. Determine t' ∈ T_Δ(X_k) such that t'⟨x_1 ← q_1[x_1], ..., x_k ← q_k[x_k]⟩ = t. Then the rule a[q_1[x_1] ... q_k[x_k]] → q[t'] is in R_M.
Again (*) can be proved, and since the proof only slightly differs from the previous one, it is left to the reader. ///
(4.49) Exercise. Find an example of a tree transformation in LDB − T. ///
(4.50) Exercise. Compare the classes PLT and PLB. ///
(4.51) Exercise. Let a deterministic top-down ftt be called "simple" if it is not allowed to make different translations of the same input subtree (if q[a[x_1 ... x_k]] → t is a rule and q_1[x_i], q_2[x_i] occur in t, then q_1 = q_2). Prove that the class of simple deterministic top-down tree transformations is included in B. (This result should be expected from the fact that property (T) is eliminated. Similarly, one can prove that NDB ⊆ T, because properties (B) and (B') are eliminated.) ///
(4.four) Decomposition and composition of bottom-up tree transformations.
Since bottom-up tree transformations are theoretically easier to handle than top-down tree transformations, we start investigating the former. We have seen that a bottom-up ftt can copy after nondeterministic processing (property (B)). The next theorem shows that these two things can in fact be taken apart into different phases of the transformation: each bottom-up ftt can be decomposed into two transducers, the first doing the nondeterminism (linearly) and the second doing the copying (deterministically).
(4.52) Theorem. Each bottom-up tree transformation can be realized by a finite state relabeling followed by a homomorphism.
In formula: B ⊆ QREL ∘ HOM. Moreover LB ⊆ QREL ∘ LHOM and DB ⊆ DBQREL ∘ HOM.
Proof. Let M = (Q, Σ, Δ, R, Q_d) be a bottom-up ftt. To simulate M in two phases we apply a technique similar to the one used in the proof of Theorem 3.59: a finite state relabeling is used to put information on each node indicating by which piece of tree the node should be replaced; then a homomorphism is used to replace each node by that piece of tree. The formal construction is as follows. We simultaneously construct a ranked alphabet Ω, the set of rules R_N of a bottom-up ftt N = (Q, Σ, Ω, R_N, Q_d) and a homomorphism h : T_Ω → T_Δ as follows.
(i) If a → q[t] is a rule in R, then d_t is a (new) symbol in Ω_0, a → q[d_t] is in R_N and h_0(d_t) = t.
(ii) If a[q_1[x_1] ... q_k[x_k]] → q[t] is a rule in R, then d_t is a (new) symbol in Ω_k, a[q_1[x_1] ... q_k[x_k]] → q[d_t[x_1 ... x_k]] is in R_N and h_k(d_t) = t.
The only requirement on the symbols of Ω is that if t_1 ≠ t_2 then d_{t_1} ≠ d_{t_2}. Obviously N is a (bottom-up) finite state relabeling. Also, if M is linear then h is linear, and if M is deterministic then so is N. It can easily be shown (by induction on s) that, for s ∈ T_Σ, q ∈ Q and t ∈ T_Δ, s ⇒*_M q[t] iff there is a u ∈ T_Ω such that s ⇒*_N q[u] and h(u) = t. From this it follows that M = N ∘ h, which proves the theorem. ///
(4.53) Example. Consider the bottom-up ftt M of Example 4.19. It can be decomposed as follows. Firstly, Ω_0 = {a, b} and Ω_2 = {m_1, m_2, n_1, n_2}, where a = d_a, b = d_b, m_1 = d_{m[x_1 x_1]}, m_2 = d_{m[x_2 x_2]}, n_1 = d_{n[x_1 x_1]} and n_2 = d_{n[x_2 x_2]}. Secondly, N = (Q, Σ, Ω, R_N, Q_d), where R_N consists of the rules [...] for all i, j ∈ {0,1}. Finally, h is defined by h_0(a) = a, h_0(b) = b, h_2(m_1) = m[x_1 x_1], h_2(m_2) = m[x_2 x_2], h_2(n_1) = n[x_1 x_1] and h_2(n_2) = n[x_2 x_2]. For example, f[a g[ab]] is relabeled by N into m_2[a n_2[ab]], and h(m_2[a n_2[ab]]) = m[n[bb]n[bb]]. ///
Note that Theorem 4.52 means (among other things) that each bottom-up tree transformation can be realized by the composition of two top-down tree transformations (cf. Theorem 4.43).
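The construction in the proof of Theorem 4.52 can be tried out on a small machine. The sketch below is only an illustration under stated assumptions: trees are nested tuples (label, child, ...), a variable x_i in a right hand side is written as the integer i, and the three-rule transducer M is a hypothetical example in the spirit of the copying transducer discussed in (4.four), not a transcription of any numbered example. The helper names (substitute, run, decompose, apply_hom) are invented here.

```python
from itertools import product

def substitute(t, args):
    """Replace each variable i (an int, 1-based) in t by args[i - 1]."""
    if isinstance(t, int):
        return args[t - 1]
    return (t[0],) + tuple(substitute(c, args) for c in t[1:])

def run(rules, s):
    """All pairs (q, u) such that s =>* q[u] for the bottom-up ftt `rules`.
    `rules` maps (input symbol, tuple of child states) to a list of (state, rhs)."""
    child_results = [run(rules, c) for c in s[1:]]
    results = set()
    for combo in product(*child_results):
        states = tuple(q for q, _ in combo)
        outputs = [u for _, u in combo]
        for q, t in rules.get((s[0], states), []):
            results.add((q, substitute(t, outputs)))
    return results

def decompose(rules):
    """Split M into a finite state relabeling N and a homomorphism h:
    each right hand side t becomes a fresh symbol d_t with h(d_t) = t."""
    relabel_rules, hom = {}, {}
    for (a, states), rhss in rules.items():
        k = len(states)
        for q, t in rhss:
            d_t = ("d", k, t)  # distinct t (or rank) gives a distinct symbol
            hom[d_t] = t
            new_rhs = (d_t,) + tuple(range(1, k + 1))  # d_t[x_1 ... x_k]
            relabel_rules.setdefault((a, states), []).append((q, new_rhs))
    return relabel_rules, hom

def apply_hom(hom, u):
    """Apply the tree homomorphism h to a tree over the symbols d_t."""
    return substitute(hom[u[0]], [apply_hom(hom, c) for c in u[1:]])

# Hypothetical bottom-up ftt: relabel a's nondeterministically, then copy at f.
M = {
    ("e", ()): [("q", ("e",))],
    ("a", ("q",)): [("q", ("a", 1)), ("q", ("b", 1))],
    ("f", ("q",)): [("r", ("f", 1, 1))],
}

if __name__ == "__main__":
    s = ("f", ("a", ("a", ("e",))))
    N, h = decompose(M)
    direct = run(M, s)
    two_phase = {(q, apply_hom(h, u)) for q, u in run(N, s)}
    print(direct == two_phase)
```

Running M directly and running N followed by h should give the same set of (state, output) pairs, which is the statement M = N ∘ h restricted to one input tree. Note that N never copies or deletes a variable, while h carries all the copying.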
We now show that, in the nondeterministic case, the finite state relabeling can still be decomposed further into a relabeling followed by a finite tree automaton mapping.
(4.54) Theorem. B ⊆ REL ∘ FTA ∘ HOM and LB ⊆ REL ∘ FTA ∘ LHOM.
Proof. By the previous theorem and the fact that HOM and LHOM are closed under composition (see Exercise 4.9), it clearly suffices to show that QREL ⊆ REL ∘ FTA ∘ LHOM. Let M = (Q, Σ, Δ, R, Q_d) be a bottom-up finite state relabeling. We shall actually show that M can be simulated by a relabeling, followed by an fta, followed by a projection (which is in LHOM). The relabeling guesses which rule is applied by M at each node (and puts that rule as a label on the node), the bottom-up fta checks whether this guess is in accordance with the possible state transitions of M, and finally the projection labels the node with the right label. Formally we construct a ranked alphabet Ω, a relabeling r from T_Σ into T_Ω, a (nondeterministic) bottom-up fta N = (Q, Ω, δ, S, Q_d) and a projection p from T_Ω into T_Δ as follows.
(i) If rule m in R is of the form a → q[b], then d_m is a (new) symbol in Ω_0, d_m ∈ r_0(a), q ∈ S_{d_m}, and p_0(d_m) = b.
(ii) If rule m in R is of the form a[q_1[x_1] ... q_k[x_k]] → q[b[x_1 ... x_k]], then d_m is a (new) symbol in Ω_k, d_m ∈ r_k(a), q ∈ δ_{d_m}(q_1, ..., q_k) and p_k(d_m) = b.
We require that if m and n are different rules, then d_m ≠ d_n. It is left to the reader to show that, for s ∈ T_Σ, q ∈ Q_d and t ∈ T_Δ, s ⇒*_M q[t] iff there is a u ∈ T_Ω such that u ∈ r(s), u ∈ L(N) and t = p(u). This proves the theorem. ///
These decomposition results are often very helpful when proving something about bottom-up tree transformations: the proof can often be split up into proofs about REL, FTA and HOM only. As an example, we immediately have the following result from Theorem 4.54 and Theorems 3.32, 3.49 and 3.65 (note that a class of tree languages is closed under fta mappings if and only if it is closed under intersection with recognizable tree languages!).
(4.55) Corollary.
RECOG is closed under linear bottom-up tree transformations. ///
(This expresses that the image of a recognizable tree language under a linear bottom-up tree transformation is again recognizable. In other words, LB-Surface = RECOG.)
(4.56) Exercise. Prove, using Theorem 4.54, that B-Surface = HOM-Surface. Prove that, in fact, each B-Surface tree language is the homomorphic image of a rule tree language. ///
We now prove that under certain circumstances the composition of two elements of B is again in B. Recall from Section 4.three that the non-closure of B under composition was caused by the failure of property (T) for B: in general, in B, we can't compose a copying transducer with a nondeterministic one. We now show that if either the first transducer is noncopying or the second one is deterministic, then their composition is again in B. Thus, when eliminating (the failure of) property (T), closure results are obtained.
(4.57) Theorem. (1) LB ∘ B ⊆ B and LB ∘ LB ⊆ LB. (2) B ∘ DB ⊆ B and DB ∘ DB ⊆ DB.
Proof. Because of our decomposition of B we only need to look at special cases. These are treated in three lemmas, concerning composition with homomorphisms, fta mappings and relabelings respectively. Detailed induction proofs of these lemmas are left to the reader.
Lemma. B ∘ HOM ⊆ B, LB ∘ LHOM ⊆ LB and DB ∘ HOM ⊆ DB.
Proof. Let M = (Q, Σ, Δ, R, Q_d) be a bottom-up ftt and h a tree homomorphism from T_Δ into T_Ω. We have to show that M ∘ h can be realized by a bottom-up ftt N. The idea is the same as that of Theorem 3.65: N simulates M but outputs at each step the homomorphic image of the output of M. Note that, contrary to the proof of Theorem 3.65 (which was concerned with regular tree grammars, a top-down device), we need not require linearity of h. The construction is as follows. Extend h to trees in T_Δ(X) by defining h_0(x) = x for all x in X. Thus h is now a homomorphism from T_Δ(X) into T_Ω(X).
Define N = (Q, Σ, Ω, R_N, Q_d) such that
(i) if a → q[t] is in R, then a → q[h(t)] is in R_N;
(ii) if a[q_1[x_1] ... q_k[x_k]] → q[t] is in R, then a[q_1[x_1] ... q_k[x_k]] → q[h(t)] is in R_N.
Obviously, if M and h are linear then so is N (a linear homomorphism transforms a linear tree in T_Δ(X) into a linear tree in T_Ω(X)), and if M is deterministic then so is N. ///
Lemma. B ∘ FTA ⊆ B, LB ∘ FTA ⊆ LB and DB ∘ FTA ⊆ DB.
Proof. The idea of the proof is similar to the one used to solve Exercise 3.68. Let M = (Q, Σ, Δ, R, Q_d) be a bottom-up ftt and let N = (Q_N, Δ, Δ, R_N, Q_dN) be a deterministic bottom-up ftt corresponding to a deterministic bottom-up fta as in the proof of Theorem 4.28(2). We have to show that M ∘ T(N) can be realized by a bottom-up ftt K. K will have Q × Q_N as its set of states and it will simultaneously simulate M and keep track of the state of N at the computed output of M. Extend N by extending its alphabet to Δ ∪ X (or, better, to Δ ∪ X_m, where m is the highest subscript of a variable occurring in the rules of M) and by allowing the variables in its rules to range over T_Δ(X). Thus the computation of the finite tree automaton N may now be started with an element of T_Δ(Q_N[X]), which means that at certain places in the tree N has to start in prescribed states. Construct K = (Q × Q_N, Σ, Δ, R_K, Q_d × Q_dN) such that
(i) if a → q[t] is in R and t ⇒*_N q'[t], then a → (q,q')[t] is in R_K;
(ii) if a[q_1[x_1] ... q_k[x_k]] → q[t] is in R and t⟨x_1 ← q_1'[x_1], ..., x_k ← q_k'[x_k]⟩ ⇒*_N q'[t], then the rule a[(q_1,q_1')[x_1] ... (q_k,q_k')[x_k]] → (q,q')[t] is in R_K.
Note that if M is linear, then so is K. Moreover, since N is deterministic, if M is deterministic then so is K. ///
Lemma. B ∘ DBQREL ⊆ B, DB ∘ DBQREL ⊆ DB and LB ∘ REL ⊆ LB.
Proof. The proofs of the first two statements are easy generalizations of the proof of the previous lemma. The proof of the third statement is left as an easy exercise. ///
We now finish the proof of Theorem 4.57. Firstly
LB ∘ B ⊆ LB ∘ REL ∘ FTA ∘ HOM (Thm. 4.54)
⊆ LB ∘ FTA ∘ HOM (3rd lemma)
⊆ LB ∘ HOM (2nd lemma)
⊆ B (1st lemma),
and similarly for LB ∘ LB ⊆ LB. Secondly
B ∘ DB ⊆ B ∘ DBQREL ∘ HOM (Thm. 4.52)
⊆ B ∘ HOM (3rd lemma)
⊆ B (1st lemma),
and similarly for DB ∘ DB ⊆ DB. ///
Note that the "right hand side" of Theorem 4.57 states that LB and DB are closed under composition.
(4.58) Exercise. Is LDB closed under composition? ///
(4.59) Exercise. Prove that B-Surface is closed under intersection with recognizable tree languages. ///
A consequence of Theorem 4.57 (or in fact its second lemma) is the following useful fact.
(4.60) Corollary. RECOG is closed under inverse bottom-up tree transformations (in particular under inverse homomorphisms, cf. Exercise 3.68).
Proof. Let L ∈ RECOG and M ∈ B. We have to show that M^{-1}(L) is in RECOG. Obviously, if R is the finite tree automaton mapping {(t,t) | t ∈ L}, then M^{-1}(L) = dom(M ∘ R). By Theorem 4.57, M ∘ R is in B, and so, by Exercise 4.29, its domain is in RECOG. ///
(4.61) Remark. Note that, since, for arbitrary tree transformations M_1 and M_2, (M_1 ∘ M_2)^{-1} = M_2^{-1} ∘ M_1^{-1}, Corollary 4.60 implies that if M is the composition of any finite number of bottom-up tree transformations (in particular, elements of REL, FTA and HOM), then RECOG is closed under M^{-1}; moreover, since dom(M) = M^{-1}(T_Δ), the domain of any such tree transformation M is recognizable. ///
We finally note that, because of the composition results of Theorem 4.57, the inclusion signs in Theorems 4.52 and 4.54 may be replaced by equality signs. It is also easy to prove equations like B = LB ∘ HOM, B = LB ∘ DB and B = REL ∘ DB (compare with property (B)). The first equation has some importance of its own, since it characterizes B without referring to the notion "bottom-up"; this is so because LB has such a characterization, as shown next. We first need a definition.
(4.62) Definition.
Let F be a class of tree transformations containing all identity transformations. For each n ≥ 1 we define F^n inductively by F^1 = F and F^{n+1} = F^n ∘ F. Moreover, the closure of F under composition, denoted by F*, is defined to be the union of all F^n, n ≥ 1. Thus F* consists of all tree transformations of the form M_1 ∘ M_2 ∘ ... ∘ M_n, where n ≥ 1 and M_i ∈ F for all i, 1 ≤ i ≤ n. ///
(4.63) Corollary. (1) LB = (REL ∪ FTA ∪ LHOM)*. (2) B = (REL ∪ FTA ∪ LHOM)* ∘ HOM.
Proof. The inclusions ⊆ follow from Theorems 4.52 and 4.54; the inclusions ⊇ follow from Theorem 4.57. ///
(4.five) Decomposition of top-down tree transformations.
We now show that, analogously to the bottom-up case, we can decompose each top-down ftt into two transducers, the first doing the copying and the second doing the rest of the work (cf. property (T)).
(4.64) Theorem. T ⊆ HOM ∘ LT and DT ⊆ HOM ∘ LDT.
Proof. Let M = (Q, Σ, Δ, R, Q_d) be a top-down ftt. While processing an input tree, M generally makes a lot of copies of input subtrees in order to get different translations of these subtrees. To simulate M we can first use a homomorphism which simply makes as many copies of subtrees as are needed by M, and then we can simulate M linearly (since all copies are already there). The formal construction is as follows. We first determine the "degree of copying" of M, that is, the maximal number of copies needed by M in any step of its computation. Thus, for x ∈ X and r ∈ R, let r_x be the number of occurrences of x in the right hand side of r. Let n = max{r_x | x ∈ X, r ∈ R}. We now let Ω be the ranked alphabet obtained from Σ by multiplying the ranks of all symbols by n (so that a node may be connected to n copies of each of its subtrees). Thus Ω_{kn} = Σ_k for all k ≥ 0. The "copying" homomorphism h from T_Σ into T_Ω is now defined by
(i) for a ∈ Σ_0, h_0(a) = a;
(ii) for k ≥ 1 and a ∈ Σ_k, h_k(a) = a[x_1^n x_2^n ... x_k^n], where x_i^n denotes n consecutive occurrences of x_i.
(For example, if k = 2 and n = 3, then h_2(a) = a[x_1 x_1 x_1 x_2 x_2 x_2].) Finally the top-down ftt N = (Q, Ω, Δ, R_N, Q_d) is defined as follows.
(i) If q[a] → t is a rule in R, then it is also in R_N.
(ii) Suppose that q[a[x_1 ... x_k]] → t is a rule in R. Let us denote the variables x_1, x_2, ..., x_{kn} by x_{1,1}, x_{1,2}, ..., x_{1,n}, x_{2,1}, ..., x_{2,n}, ..., x_{k,1}, ..., x_{k,n} respectively. Then the rule q[a[x_{1,1} ... x_{1,n} ... x_{k,1} ... x_{k,n}]] → t' is in R_N, where t' is taken such that it is linear and such that t'⟨x_{i,j} ← x_i⟩ (for 1 ≤ i ≤ k, 1 ≤ j ≤ n) = t. (t' can be obtained by putting different second subscripts on different occurrences of the same variable in t. For instance, if q[a[x_1 x_2]] → b[q_1[x_1] d[q_2[x_1] q_3[x_2]]] is in R and n = 3, then we can put the rule q[a[x_{1,1} x_{1,2} x_{1,3} x_{2,1} x_{2,2} x_{2,3}]] → b[q_1[x_{1,1}] d[q_2[x_{1,2}] q_3[x_{2,1}]]] in R_N.)
Obviously, if M is deterministic then so is N. A formal proof of the fact that M = h ∘ N is left to the reader. ///
(4.65) Example. Consider the top-down ftt M of Example 4.33 and let us consider its decomposition according to the above proof. Clearly n = 2, and therefore the definition of h for, for example, +, *, −, a and b is h_2(+) = +[x_1 x_1 x_2 x_2], h_2(*) = *[x_1 x_1 x_2 x_2], h_1(−) = −[x_1 x_1], h_0(a) = a and h_0(b) = b. Thus, for example, h(*[+[ab] −[a]]) = *[+[aabb] +[aabb] −[aa] −[aa]]. For instance, the first three rules of M turn into the following three rules for N:
q[+[x_{1,1} x_{1,2} x_{2,1} x_{2,2}]] → +[q[x_{1,1}] q[x_{2,1}]],
q[*[x_{1,1} x_{1,2} x_{2,1} x_{2,2}]] → +[*[q[x_{1,1}] i[x_{2,1}]] *[i[x_{1,2}] q[x_{2,2}]]],
q[−[x_{1,1} x_{1,2}]] → −[q[x_{1,1}]].
It is left to the reader to see how N processes h(*[+[ab] −[a]]). ///
Since LT ⊆ LB (Theorem 4.48), we know already how to decompose LT. This gives us the following result.
(4.66) Corollary. T ⊆ HOM ∘ LB = HOM ∘ REL ∘ FTA ∘ LHOM. ///
Notice that the inclusion is proper by Lemma 4.46. From this corollary we see that each top-down tree transformation can be realized by the composition of two bottom-up tree transformations (cf. Theorem 4.43). Another way of expressing our decomposition results concerning B and T is by the equation
B* = T* = (REL ∪ FTA ∪ HOM)*.
By the above corollary and Remark 4.61 we obtain that RECOG is closed under inverse top-down tree transformations, and in particular:
(4.67) Corollary.
The domain of a top-down tree transformation is recognizable. ///
The next theorem says, analogously to Theorem 4.52, that each deterministic element of LT can be decomposed into two simpler (deterministic) ones.
(4.68) Theorem. LDT ⊆ DTQREL ∘ LHOM.
Proof. See the proof of Theorem 4.52. ///
We now note that we cannot obtain very nice results about closure under composition analogous to those of Theorem [...]
[...] t_1 → t_2 is an ordinary top-down ftt rule and D is a mapping from X_k into P(T_Σ) (where k is the number of variables in t_1) such that, for 1 ≤ i ≤ k, D(x_i) ∈ RECOG. (Whenever D is understood or will be specified later we write t_1 → t_2 rather than (t_1 → t_2, D).) We call t_1 and t_2 the left hand side and right hand side of the rule respectively. M is viewed as a tree rewriting system in the obvious way, (t_1 → t_2, D) being a "rule scheme" (t_1, t_2, D). The tree transformation realized by M, denoted by T(M) or M, is {(s,t) ∈ T_Σ × T_Δ | q[s] ⇒* t for some q ∈ Q_d}. ///
Thus a top-down ftt with regular look-ahead works in exactly the same way as an ordinary one, except that the application of each of its rules is restricted: the (input sub-)trees substituted in the rule should belong to prespecified recognizable tree languages. Note that for rules of the form q[a] → t no D need be specified.
(4.74) Notation. The phrase "with regular look-ahead" will be indicated by a prime. Thus the class of top-down tree transformations with regular look-ahead will be denoted by T'. An element of T' is also called a top-down' tree transformation. ///
(4.75) Example. Consider the tree transformation M in the proof of Lemma 4.46.
It can be realized by the top-down' ftt N = (Q, Σ, Δ, R, Q_d), where Q = {q_0, q}, Q_d = {q_0} and R consists of the following rules:
q_0[a[x_1 x_2]] → a[q[x_1]] with ranges D(x_1) = {m(b^n c) | n ≥ 0} and D(x_2) = {c},
q[b[x_1]] → b[q[x_1]] with D(x_1) = T_Σ,
q[c] → c.
Note that, in the first rule, D(x_1) could as well be T_Σ, since it is checked later by N that the left subtree contains no a's. The essential use of regular look-ahead in this example is the restriction of the right subtree to {c}. ///
We now define some subclasses of T'.
(4.76) Definition. Let M = (Q, Σ, Δ, R, Q_d) be a top-down' ftt. The definitions of linear, nondeleting and one-state are identical to the bottom-up and top-down ones (see Definition 4.23). M is called (partial) deterministic if the following holds:
(i) Q_d is a singleton;
(ii) if (s → t_1, D_1) and (s → t_2, D_2) are different rules in R (with the same left hand side), then D_1(x_i) ∩ D_2(x_i) = ∅ for some i, 1 ≤ i ≤ k (where k is the number of variables in s). ///
Since the ranges of the variables are recognizable, it can effectively be determined whether a top-down' ftt is deterministic (if, of course, these ranges are effectively specified, which we always assume). Notation 4.24 also applies to T'. Thus LDT' is the class of linear deterministic top-down tree transformations with regular look-ahead. Obviously, T ⊆ T', since each top-down ftt can be transformed trivially into a top-down' ftt by specifying all ranges of all variables in all rules to be the (recognizable) tree language T_Σ (where Σ is the input alphabet). Moreover, if Z is a modifier, then ZT ⊆ ZT'. It can easily be seen that Theorems 4.43 and 4.44 still hold with T replaced by T'. In fact, the proofs of the theorems are true without further change.
In the next theorem we show that the regular look-ahead can be "taken out" of a top-down' ftt.
(4.77) Theorem. T' ⊆ DBQREL ∘ T and ZT' ⊆ DBQREL ∘ ZT for Z ∈ {L, D, LD}.
Proof. Let M = (Q, Σ, Δ, R, Q_d) be in T'.
Consider all recognizable properties which M needs for its look-ahead (that is, all recognizable tree languages D(x_i) occurring in the rules of M). We can use a deterministic bottom-up finite state relabeling to check, for a given input tree t, whether the subtrees of t have these properties or not, and to put at each node of t a (finite) amount of information telling us whether the direct subtrees of that node have the properties or not. After this relabeling we can use an ordinary top-down ftt to simulate M, because the look-ahead information is now contained in the label of each node.
The formal construction might look as follows. Let L_1, ..., L_n be the recognizable tree languages occurring as ranges of variables in the rules of M. Let U denote the set {0,1}^n, that is, the set of all sequences of 0's and 1's of length n. The j-th element of u ∈ U will be denoted by u^j. An element u of U will be used to indicate whether a tree belongs to L_1, ..., L_n or not (u^j = 1 iff the tree is in L_j). We now introduce a new ranked alphabet Ω such that Ω_0 = Σ_0 and, for k ≥ 1, Ω_k = Σ_k × U^k. Thus an element of Ω_k is of the form (a, (u_1, ..., u_k)) with a ∈ Σ_k and u_1, ..., u_k ∈ U. If a node is labeled by such a symbol, it will mean that u_i contains all the information about the i-th subtree of the node. Next we define the mapping f : T_Σ → T_Ω as follows:
(i) for a ∈ Σ_0, f(a) = a;
(ii) for k ≥ 1, a ∈ Σ_k and t_1, ..., t_k ∈ T_Σ, f(a[t_1 ... t_k]) = b[f(t_1) ... f(t_k)], where b = (a, (u_1, ..., u_k)) and, for 1 ≤ i ≤ k and 1 ≤ j ≤ n, u_i^j = 1 iff t_i ∈ L_j.
It is left as an exercise to show that f can be realized by a (total) deterministic bottom-up finite state relabeling (given the deterministic bottom-up fta's recognizing L_1, ..., L_n). We now define a top-down ftt N = (Q, Ω, Δ, R_N, Q_d) such that
(i) if q[a] → t is in R, then it is in R_N;
(ii) if (q[a[x_1 ... x_k]] → t, D) is in R, then each rule of the form q[(a,ū)[x_1 ... x_k]] → t is in R_N, where ū = (u_1, ..., u_k) ∈ U^k and ū satisfies the condition: if D(x_i) = L_j then u_i^j = 1 (for all i and j, 1 ≤ i ≤ k, 1 ≤ j ≤ n).
This completes the construction.
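The mapping f of this construction can be sketched in a few lines. The sketch below makes simplifying assumptions: trees are nested tuples (label, child, ...), and the look-ahead languages are given as Python membership predicates instead of deterministic bottom-up fta's, so membership is recomputed at every node rather than carried along as a state in a single bottom-up pass. The names annotate, is_b_chain and is_c are hypothetical; the two predicates play the role of the ranges {m(b^n c) | n ≥ 0} and {c} used in Example 4.75.

```python
def annotate(tree, languages):
    """Compute f(tree): a node a[t1 ... tk] with k >= 1 gets the label
    (a, (u1, ..., uk)), where ui[j] == 1 iff ti belongs to languages[j].
    Leaves keep their label."""
    label, children = tree[0], tree[1:]
    if not children:
        return tree
    bits = tuple(tuple(int(member(c)) for member in languages) for c in children)
    return ((label, bits),) + tuple(annotate(c, languages) for c in children)

def is_b_chain(t):
    """Membership test for {m(b^n c) | n >= 0}: a chain of b's ending in c."""
    while t[0] == "b" and len(t) == 2:
        t = t[1]
    return t == ("c",)

def is_c(t):
    """Membership test for the language {c}."""
    return t == ("c",)

if __name__ == "__main__":
    tree = ("a", ("b", ("c",)), ("c",))
    print(annotate(tree, [is_b_chain, is_c]))
```

After annotation, a rule with look-ahead D(x_1) = L_1 and D(x_2) = L_2 only needs to inspect the bit vectors in the root label, which is what the rules of N do in step (ii).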
It is obvious from this construction that M = f ∘ N. Moreover, if M is linear, then so is N. It is also clear that, in the construction above, the set U may be replaced by the set {u ∈ U | for all j_1 and j_2, if L_{j_1} ∩ L_{j_2} = ∅, then u^{j_1} · u^{j_2} ≠ 1} (note that this influences R_N: rules containing elements not in this set are removed). One can now easily see that if M is deterministic, then so is N. ///
An immediate consequence of this theorem and previous decomposition results is that each element of T' is decomposable into elements of REL, FTA and HOM.
(4.78) Corollary. The domain of a top-down ftt with regular look-ahead is recognizable.
Proof. For instance by Remark 4.61. ///
Another consequence of Theorem 4.77 is that the addition of regular look-ahead has no influence on the surface tree languages.
(4.79) Corollary. T'-Surface = T-Surface and DT'-Surface = DT-Surface.
Proof. Let L be a (D)T'-surface tree language, so L = M(L_1) for some M ∈ (D)T' and L_1 ∈ RECOG. Now, by Theorem 4.77, M = R ∘ N for some R ∈ DBQREL and N ∈ (D)T. Hence L = N(R(L_1)). Since RECOG is closed under linear bottom-up tree transformations (Corollary 4.55), N(R(L_1)) is a (D)T-surface tree language. ///
We now show that, in the linear case, there is no difference between bottom-up and top-down' tree transformations (all properties (B), (T) and (B') are "eliminated"), cf. Theorem 4.48.
(4.80) Theorem. LT' = LB.
Proof. First of all we have
LT' ⊆ DBQREL ∘ LT (Theorem 4.77)
⊆ DBQREL ∘ LB (Theorem 4.48)
⊆ LB (Theorem 4.57).
Let us now prove that LB ⊆ LT'. The construction is the same as in the proof of Theorem 4.48(2), but now we use look-ahead in case the bottom-up ftt is deleting. Let M = (Q, Σ, Δ, R, Q_d) be a linear bottom-up ftt. Define, for each q in Q, M_q to be the bottom-up ftt (Q, Σ, Δ, R, {q}). Construct the linear top-down' ftt N = (Q, Σ, Δ, R_N, Q_d) such that
(i) if a → q[t] is in R, then q[a] → t is in R_N;
(ii) if a[q_1[x_1] ... q_k[x_k]] → q[t] is in R,
then the rule q[a[x_1 ... x_k]] → t⟨x_1 ← q_1[x_1], ..., x_k ← q_k[x_k]⟩ is in R_N, where, for 1 ≤ i ≤ k, if x_i does not occur in t, then D(x_i) = dom(M_{q_i}), and D(x_i) = T_Σ otherwise.
Note that dom(M_{q_i}) is recognizable by Exercise 4.29. It should be clear that T(N) = T(M). ///
From the proof of Theorem 4.80 it follows that each element of LT' can be realized by a linear top-down' ftt with the property that look-ahead is only used on subtrees which are deleted. This property corresponds precisely to property (B').
Let us now consider composition of top-down' ftt. Analogously to the bottom-up case, we can now expect the results in the next theorem from property (B).
(4.81) Theorem. (1) T' ∘ LT' ⊆ T'. (2) DT' ∘ T' ⊆ T' and DT' ∘ DT' ⊆ DT'.
Proof. As in the bottom-up case, we only consider a number of special cases.
Lemma. T' ∘ LHOM ⊆ T' and DT' ∘ HOM ⊆ DT'.
Proof. Let M = (Q, Σ, Δ, R, Q_d) be a top-down' ftt and h a tree homomorphism from T_Δ into T_Ω. We construct a new top-down' ftt using the old idea of applying the homomorphism to the right hand sides of the rules of M. Therefore we extend h to trees in T_Δ(Q[X]) by defining h_0(x) = x for x ∈ X and h_1(q) = q[x_1] for q ∈ Q. Let, for q ∈ Q, M_q = (Q, Σ, Δ, R, {q}). Note that, by Corollary 4.78, dom(M_q) is recognizable. Construct now N = (Q, Σ, Ω, R_N, Q_d) such that
(i) if q[a] → t is in R, then q[a] → h(t) is in R_N;
(ii) if (q[a[x_1 ... x_k]] → t, D) is in R, then (q[a[x_1 ... x_k]] → h(t), D') is in R_N, where, for 1 ≤ i ≤ k, D'(x_i) is the intersection of D(x_i) and all tree languages dom(M_{q'}) such that q'[x_i] occurs in t but not in h(t).
Thus N simultaneously simulates M and applies h to the output of M. But, whenever M starts making a translation of a subtree t starting in state q, and this translation is deleted by h, N checks that t is translatable by M_q. If h is linear or M is deterministic, then T(N) = M ∘ h. Obviously, if M is deterministic, then so is N. ///
Lemma. T' ∘ QREL ⊆ T', DT' ∘ DTQREL ⊆ DT' and DT' ∘ DBQREL ⊆ DT'.
Proof.
It is not difficult to see that, given M ∈ T' and a top-down finite state relabeling N, the composition of these two can be realized by one top-down' ftt K, which at each step simultaneously simulates M and N by transforming the output of M according to N. The construction is basically the same as that in the second lemma of Theorem 4.57. It is also clear that if M and N are both deterministic, then so is K. This gives us the first two statements of the lemma. The third one is more difficult, since the finite state relabeling is now bottom-up. However, the same kind of construction can be applied again, using the look-ahead facility to make the resulting top-down' ftt deterministic. Let M = (Q, Σ, Δ, R, Q_d) be in DT' and let N = (Q_N, Δ, Ω, R_N, Q_dN) be in DBQREL. Let, as usual, for q ∈ Q, M_q = (Q, Σ, Δ, R, {q}) and, for p ∈ Q_N, N_p = (Q_N, Δ, Ω, R_N, {p}). We shall realize M ∘ N by a deterministic top-down' ftt K = (Q × Q_N, Σ, Ω, R_K, Q_d × Q_dN), where the set R_K of rules is determined as follows.
(i) If q[a] → t is in R and t ⇒*_N p[t'], then (q,p)[a] → t' is in R_K.
(ii) Let (q[a[x_1 ... x_k]] → t, D) be in R. Then t can be written as t = s⟨x_1 ← q_1[x_{i_1}], ..., x_m ← q_m[x_{i_m}]⟩ for certain m ≥ 0, s ∈ T_Δ(X_m), q_1, ..., q_m ∈ Q and x_{i_1}, ..., x_{i_m} ∈ X_k. Let p_1, ..., p_m be states of N. Compute (if possible) s⟨x_1 ← p_1[x_1], ..., x_m ← p_m[x_m]⟩ ⇒*_N p_0[s'], where p_0 ∈ Q_N and s' ∈ T_Ω(X_m). (Of course N was just extended in the usual way.) Then the rule (q,p_0)[a[x_1 ... x_k]] → s'⟨x_1 ← (q_1,p_1)[x_{i_1}], ..., x_m ← (q_m,p_m)[x_{i_m}]⟩ is in R_K, where the ranges of the variables are specified by D' as follows. For 1 ≤ u ≤ k, D'(x_u) is the intersection of D(x_u) and all tree languages dom(M_{q_j} ∘ N_{p_j}) such that x_{i_j} = x_u.
This ends the construction. Intuitively, when K arrives at a node in state (q,p_0), it computes the piece of output of M and then runs N on this output in the reverse (namely top-down) direction, starting in state p_0.
Obviously, reversal of $N$ gives a nondeterministic translation of this piece of output; however, by regular look-ahead we can pick out exactly one translation. In fact, the regular look-ahead determines, for each subtree $t_u$, the (unique) state in which $N$ will arrive after translation of the $M_{q_j}$-translation of $t_u$ (for several $q_j$). It is straightforward to check formally that $K$ is a deterministic top-down' ftt (using the determinism of both $M$ and $N$). ∎

We now complete the proof of Theorem 4.81. Firstly,

$T' \circ LT' = T' \circ LB$ (Theorem 4.80)
$\subseteq T' \circ QREL \circ LHOM$ (Theorem 4.52)
$\subseteq T' \circ LHOM$ (second lemma)
$\subseteq T'$ (first lemma).

Secondly,

$DT' \circ (D)T' \subseteq DT' \circ DBQREL \circ (D)T$ (Theorem 4.77)
$\subseteq DT' \circ (D)T$ (second lemma)
$= DT' \circ HOM \circ L(D)T$ (Theorem 4.64)
$\subseteq DT' \circ L(D)T$ (first lemma).

Now $DT' \circ LT \subseteq T'$ (by (1) of this theorem) and

$DT' \circ LDT \subseteq DT' \circ DTQREL \circ LHOM$ (Theorem 4.68)
$\subseteq DT' \circ LHOM$ (second lemma)
$\subseteq DT'$ (first lemma).

This proves Theorem 4.81. ∎

Note that the "right hand side" of Theorem 4.81(2) states that $DT'$ is closed under composition. Clearly, we know already that $LT'$ is closed under composition (since $LT' = LB$). It can also easily be checked from the proof of Theorem 4.81 that $LDT'$ is closed under composition.

We can now show that regular look-ahead has indeed made $DT'$ stronger than $DB$.

(4.82) Corollary. $DB \subsetneq DT'$.

Proof. By Theorem 4.52, $DB \subseteq DBQREL \circ HOM$. Hence, since the identity tree transformation is in $DT'$, we trivially have $DB \subseteq DT' \circ DBQREL \circ HOM$. But by the second and first lemma in the proof of Theorem 4.81, $DT' \circ DBQREL \circ HOM \subseteq DT'$. Hence $DB \subseteq DT'$. Proper inclusion follows from Lemma 4.72. ∎

(4.83) Exercise. Show that each $T$-surface tree language is the range of some element of $T'$.

(4.84) Exercise. Prove that $T$-Surface is closed under linear tree transformations (recall Corollary 4.74). Prove that $DT$-Surface is closed under deterministic top-down and bottom-up tree transformations.
It follows from Theorem 4.81 that the inclusion signs in Theorem 4.77 may be replaced by equality signs. Hence, for example, $DT' = DBQREL \circ DT = DB \circ DT$.

We finally show a result "dual" to Corollary 4.63 (recall that $LT' = LB$).

(4.85) Theorem. $T' = HOM \circ LT'$.

Proof. The inclusion $HOM \circ LT' \subseteq T'$ is immediate from Theorem 4.81. The inclusion $T' \subseteq HOM \circ LT'$ can be shown in exactly the same way as in the proof of $T \subseteq HOM \circ LT$ (Theorem 4.64). The only problem is the regular look-ahead: the image of a recognizable tree language under the homomorphism $h$ need not be recognizable (we use the notation of the proof of Theorem 4.64). The solution is to consider a homomorphism $g$ from $T_\Omega$ into $T_\Sigma$ such that, for all $t \in T_\Sigma$, $g(h(t)) = t$; such a $g$ is easy to find. Now, whenever, in a rule of $M$, we have a recognizable tree language $L$ as look-ahead (for some variable), we can use $g^{-1}(L)$ as the look-ahead in the corresponding rule of $N$: it is recognizable, and $h(t) \in g^{-1}(L)$ iff $t \in L$. The details are left to the reader. ∎

4.8. Surface and target languages

In this section we shall consider a few properties of the tree (and string) languages which are obtainable from the recognizable tree languages by application of a finite number of tree transducers. In other words, we shall consider the classes $(REL \cup FTA \cup HOM)^*$-Surface (briefly denoted by Surface) and $(REL \cup FTA \cup HOM)^*$-Target (briefly denoted by Target). Note that Target = yield(Surface). Note also that, by various decomposition results, $(REL \cup FTA \cup HOM)^* = T^* = (T')^* = B^*$.

Let us first consider some classes of tree languages obtained by restricting the number of transducers applied. In particular, let us consider, for each $k \ge 1$, the classes $T^k$-Surface, $(T')^k$-Surface and $B^k$-Surface. Obviously, by the above remark,

$\text{Surface} = \bigcup_{k \ge 1} T^k\text{-Surface} = \bigcup_{k \ge 1} (T')^k\text{-Surface} = \bigcup_{k \ge 1} B^k\text{-Surface}.$

As a corollary to previous results we can show that regular look-ahead has no influence on the class of surface languages (cf. Corollary 4.79).

(4.86) Corollary.
For all $k \ge 1$,
(i) $(T')^k = DBQREL \circ T^k$,
(ii) $(T')^k$-Surface $= T^k$-Surface, and
(iii) $T^k$-Surface is closed under linear tree transformations.

Proof. (i) By Theorem 4.77, $T' \subseteq DBQREL \circ T$. Also, by Corollary 4.82 and Theorem 4.81(2), $DBQREL \circ T \subseteq T'$. Hence $T' = DBQREL \circ T$. We now show that $T' \circ T' = T' \circ T$. Trivially, $T' \circ T \subseteq T' \circ T'$. Also $T' \circ T' = T' \circ DBQREL \circ T$ and, by Theorem 4.81(1), $T' \circ DBQREL \subseteq T'$. Hence $T' \circ T' \subseteq T' \circ T$. From this it easily follows that $(T')^k = T' \circ T^{k-1} = DBQREL \circ T \circ T^{k-1} = DBQREL \circ T^k$.
(ii) This is an immediate consequence of (i) and the fact that RECOG is closed under linear tree transformations.
(iii) This follows easily from (ii) and the fact that $(T')^k$ is closed under composition with linear tree transformations (Theorem 4.81(1); recall also Theorem 4.80). ∎

We mention here that it can also be shown that $T^k$-Surface is closed under union, tree concatenation and tree concatenation closure. From that, a lot of closure properties of $T^k$-Target and Target follow.

The relation between the top-down and bottom-up surface tree languages is easy.

(4.87) Corollary. For all $k \ge 1$,
(i) $T^k$-Surface $= (B^k \circ LB)$-Surface and $B^{k+1}$-Surface $= (T^k \circ HOM)$-Surface;
(ii) $B^k$-Surface $\subseteq T^k$-Surface $\subseteq B^{k+1}$-Surface.

Proof. (i) follows from the fact that $B = LB \circ HOM$ (Corollary 4.63(2)) and that $T' = HOM \circ LB$ (Theorem 4.85). (ii) is an obvious consequence of (i). ∎

Note that $B$-Surface $= HOM$-Surface (Exercise 4.56). It is not known, but conjectured, that for all $k$ the inclusions in Corollary 4.87(ii) are proper. Note that, by taking yields, Corollary 4.87 also holds for the corresponding target languages. Again it is not known whether the inclusions are proper.

In the rest of this section we show that the emptiness-, the membership- and the finiteness-problem are solvable for Surface and Target.

(4.88) Theorem. The emptiness- and membership-problem are solvable for Surface.

Proof. Let $M \in (REL \cup FTA \cup HOM)^*$ and $L \in RECOG$. Consider the tree language $M(L) \in$ Surface. Obviously, $M(L) = \emptyset$ iff $L \cap \mathrm{dom}(M) = \emptyset$.
But, by Remark 4.64, $\mathrm{dom}(M)$ is recognizable. Hence, by Theorem 3.32 and Theorem 3.74, it is decidable whether $L \cap \mathrm{dom}(M) = \emptyset$.

To show solvability of the membership-problem, note first that Surface is closed under intersection with a recognizable tree language (if $R \in RECOG$ and $\bar{R}$ is the fta mapping such that $\mathrm{dom}(\bar{R}) = R$, then $M(L) \cap R = (M \circ \bar{R})(L)$). Now, for any tree $t$, $t \in M(L)$ iff $M(L) \cap \{t\} \ne \emptyset$. Since $\{t\}$ is recognizable, $M(L) \cap \{t\} \in$ Surface, and we just showed that it is decidable whether a surface tree language is empty or not. ∎

To show decidability of the finiteness-problem we shall use the following result.

(4.89) Lemma. Each monadic tree language in Surface is recognizable (and hence regular as a set of strings).

Proof. Let $L$ be a monadic tree language. Obviously it suffices to show that, for each $k \ge 0$, if $L \in (T')^{k+1}$-Surface, then $L \in (T')^k$-Surface (where, by definition, $(T')^0$-Surface $= RECOG$). Suppose therefore that $L \in (T')^{k+1}$-Surface, so $L = (M_1 \circ \cdots \circ M_k \circ M_{k+1})(R)$ for certain $M_i \in T'$ and $R \in RECOG$. Consider all right hand sides of rules in $M_{k+1}$. Obviously, since $L$ is monadic, these right hand sides do not contain elements of rank $\ge 2$, that is, they are monadic in the sense of Notation 4.62 (rules which have nonmonadic right hand sides may be removed). But from this it follows that $M_{k+1}$ is linear. It now follows from Theorem 4.81(1) (and Corollary 4.55 and Theorem 4.80 in the case $k = 0$) that $L \in (T')^k$-Surface. ∎

(4.90) Theorem. The finiteness-problem is solvable for Surface.

Proof. Intuitively, a tree language is finite if and only if the set of paths through this tree language is finite. For $L \subseteq T_\Sigma$ we define $\mathrm{path}(L) \subseteq \Sigma^*$ recursively as follows:

(i) for $a \in \Sigma_0$, $\mathrm{path}(a) = \{a\}$;
(ii) for $k \ge 1$, $a \in \Sigma_k$ and $t_1, \ldots, t_k \in T_\Sigma$, $\mathrm{path}(a[t_1 \cdots t_k]) = a \cdot (\mathrm{path}(t_1) \cup \cdots \cup \mathrm{path}(t_k))$;
(iii) for $L \subseteq T_\Sigma$, $\mathrm{path}(L) = \bigcup_{t \in L} \mathrm{path}(t)$.

Thus, $\mathrm{path}(t)$ consists of all strings which can be obtained by following a path through $t$ (for instance, if $t = a[b\, b[c]]$, then
$\mathrm{path}(t) = \{ab, abc\}$). We remark here that any other similar definition of "path" would also satisfy our purposes. It is left to the reader to show that, for any tree language $L$, $L$ is finite iff $\mathrm{path}(L)$ is finite.

We now show that, given $L \subseteq T_\Sigma$, the set $\mathrm{path}(L)$ can be obtained by feeding $L$ into a top-down tree transducer $M_p$: $M_p(L)$ will be equal to $\mathrm{path}(L)$, modulo the correspondence between strings and monadic trees; see Definition 2.27. In fact, $M_p = (Q, \Sigma, \Delta, R, Q_d)$, where $Q = Q_d = \{p\}$, $\Delta_0 = \{e\}$, $\Delta_1 = \Sigma$ and $R$ consists of the following rules:

(i) for each $a \in \Sigma_0$, $p[a] \to a[e]$ is in $R$;
(ii) for every $k \ge 1$, every $a \in \Sigma_k$ and every $i$, $1 \le i \le k$, $p[a[x_1 \cdots x_k]] \to a[p[x_i]]$ is in $R$.

Consequently, $L$ is finite iff $M_p(L)$ is finite. Now let $L \in$ Surface. Then, obviously, $M_p(L) \in$ Surface. Moreover, $M_p(L)$ is monadic. Hence, by Lemma 4.89, $M_p(L)$ is recognizable. Thus, by Theorem 3.75, it is decidable whether $M_p(L)$ is finite. ∎

To show that the above mentioned problems are solvable for Target, we need the following lemma.

(4.91) Lemma. Each language in Target is (effectively) of the form $\mathrm{yield}(L)$ or $\mathrm{yield}(L) \cup \{\lambda\}$, where $L \subseteq T_\Sigma$ for some $\Sigma$ such that $\Sigma_1 = \emptyset$ and $e \notin \Sigma_0$, and $L \in$ Surface.

Proof. It is left as an exercise to show that, for any $L' \subseteq T_\Delta$, there exists a bottom-up tree transducer $M$ such that $M(L') \subseteq T_\Sigma$ for some $\Sigma$ satisfying the requirements, and such that $\mathrm{yield}(M(L')) = \mathrm{yield}(L') - \{\lambda\}$. It is also left as an exercise to show that it is decidable whether $\lambda \in \mathrm{yield}(L')$. From these two facts the lemma follows. ∎

(4.92) Theorem. The emptiness-, membership- and finiteness-problem are solvable for Target.

Proof. It is obvious from the previous lemma that we may restrict ourselves to target languages $\mathrm{yield}(L)$, where $L \in$ Surface and $L \subseteq T_\Sigma$ for some $\Sigma$ such that $\Sigma_1 = \emptyset$ and $e \notin \Sigma_0$. Obviously, $\mathrm{yield}(L) = \emptyset$ iff $L = \emptyset$. Hence the emptiness-problem is solvable by Theorem 4.88. Note that, by Example 2.17, for a given $w \in \Sigma_0^*$ there are only a finite number of trees $t$ such that $\mathrm{yield}(t) = w$. From this fact and Theorem 4.88
the decidability of the membership-problem follows. Moreover, it follows that $\mathrm{yield}(L)$ is finite iff $L$ is finite. Hence, by Theorem 4.90, the finiteness-problem is solvable. ∎

We note that it can be shown that Target is (properly, by Theorem 4.92) contained in the class of context-sensitive languages. Thus, Target lies properly in between the context-free and the context-sensitive languages.

We finally note that, in a certain sense, Surface $\subseteq$ Target (cf. the similar situation for RECOG and CFL). In fact, let $L \subseteq T_\Sigma$ be in Surface. Let $\lceil$ and $\rceil$ be two new symbols ("standing for [ and ]"). Let $\Delta$ be the ranked alphabet such that $\Delta_0 = \Sigma \cup \{\lceil, \rceil\}$, $\Delta_1 = \Delta_2 = \Delta_3 = \emptyset$ and $\Delta_{k+3} = \{*\}$ for $k \ge 1$. Let $M = (Q, \Sigma, \Delta, R, Q_d)$ be the (deterministic) top-down tree transducer such that $Q = Q_d = \{f\}$ and the rules are the following:

(i) for $k \ge 1$ and $a \in \Sigma_k$, $f[a[x_1 \cdots x_k]] \to *[a\, \lceil\, f[x_1] \cdots f[x_k]\, \rceil]$ is in $R$;
(ii) for $a \in \Sigma_0$, $f[a] \to a$ is in $R$.

(Note that $M$ is in fact a homomorphism.) It is left to the reader to show that $\mathrm{yield}(M(L)) = L$ (as string languages, reading $\lceil$ and $\rceil$ as [ and ]).

5. Notes on the literature

In the text there are some references to
[A&U] A.V. Aho and J.D. Ullman, The theory of parsing, translation and compiling, I and II, Prentice-Hall, 1972.
[Sal] A. Salomaa, Formal languages, Academic Press, 1973.

An informal survey of the theory of tree automata and tree transducers (until 1970) is given by [Thatcher, 1973].

On section 3

Bottom-up finite tree automata were invented around 1965 independently by [Doner, 1965, 1970] and [Thatcher & Wright, 1968] (and somewhat later by [Pair & Quere, 1968]). The original aim of the theory of tree automata was to apply it to decision problems of second-order logical theories concerning strings.
The connection with context-free languages was established in [Mezei & Wright, 1967] and [Thatcher, 1967], and the idea to give "tree-oriented" proofs for results on strings is expressed in [Thatcher, 1973] and [Rounds, 1970a]. Independently, results concerning parenthesis languages and structural equivalence were obtained by [McNaughton, 1967], [Knuth, 1967], [Ginsburg & Harrison, 1967] and [Paull & Unger, 1968].

Top-down finite tree automata were introduced by [Rabin, 1969] and [Magidor & Moran, 1969], and regular tree grammars by [Brainerd, 1969]. The notion of rule tree languages occurs in one form or another in several places in the literature.

Most results of Section 3 can be found in the above mentioned papers. Other work on finite tree automata and recognizable tree languages is written down in the following papers:
[Arbib & Give'on, 1968], automata on acyclic graphs, category theory;
[Brainerd, 1968], state minimalization of finite tree automata;
[Costich, 1972], [...]

[...] $\to t$ with $A \in N_0$ and $t \in T_{N \cup \Sigma}$, or $A[x_1 \cdots x_k] \to t$ with $A \in N_k$ and $t \in T_{N \cup \Sigma}(X_k)$. $G$ is considered as a tree rewriting system on $T_{N \cup \Sigma}$. If we let all variables in the rules range over $T_{N \cup \Sigma}$, then $G$ is called a context-free tree grammar. If we let all variables range over $T_\Sigma$, then $G$ is called a bottom-up (or inside-out) context-free tree grammar. The language generated by $G$ is $L(G) = \{t \in T_\Sigma \mid S \Rightarrow^* t\}$. In general the languages generated by $G$ under the above two interpretations differ (consider for instance the grammar $S \to F[A]$, $F[x_1] \to f[x_1 x_1]$, $A \to a$, $A \to b$). Thus restriction to bottom-up ($\approx$ right-most in the string case) generation gives another language. On the other hand, restriction to top-down ($\approx$ left-most) generation can be done without changing the language. The yield of the class of context-free tree languages is equal to the class of indexed languages. The yield of the class of bottom-up context-free tree languages is called IO.
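The difference between the two interpretations can be checked mechanically on the example grammar $S \to F[A]$, $F[x_1] \to f[x_1 x_1]$, $A \to a$, $A \to b$. The following Python sketch is only an illustration: the encoding of trees as nested tuples and all function names are assumptions made here, not notation from these notes. It enumerates the terminal trees that are derivable when variables may be instantiated by arbitrary trees, and when they are restricted to terminal trees (the bottom-up interpretation).

```python
# Trees are nested tuples (label, child_1, ..., child_k); a leaf is a 1-tuple.
# In right-hand sides the strings "x1", "x2", ... stand for the variables.
# This encoding is an illustrative assumption, not notation from the notes.
RULES = {
    "S": [("F", ("A",))],        # S -> F[A]
    "F": [("f", "x1", "x1")],    # F[x1] -> f[x1 x1]
    "A": [("a",), ("b",)],       # A -> a, A -> b
}


def substitute(rhs, args):
    """Replace each variable xi in a right-hand side by args[i-1]."""
    if isinstance(rhs, str):
        return args[int(rhs[1:]) - 1]
    return (rhs[0],) + tuple(substitute(child, args) for child in rhs[1:])


def is_terminal(tree):
    """True if the tree contains no nonterminal."""
    return tree[0] not in RULES and all(is_terminal(c) for c in tree[1:])


def successors(tree, bottom_up):
    """All trees obtainable from `tree` by one rewriting step."""
    label, kids = tree[0], tree[1:]
    result = set()
    # Rewrite at the root. In the bottom-up interpretation the variables
    # range over terminal trees only, so all arguments must be terminal.
    if label in RULES and (not bottom_up or all(is_terminal(k) for k in kids)):
        for rhs in RULES[label]:
            result.add(substitute(rhs, kids))
    # Rewrite somewhere inside one of the subtrees.
    for i, kid in enumerate(kids):
        for new_kid in successors(kid, bottom_up):
            result.add((label,) + kids[:i] + (new_kid,) + kids[i + 1:])
    return result


def language(bottom_up):
    """Terminal trees derivable from S (the example grammar is not recursive,
    so this exhaustive search terminates)."""
    seen, todo, terminal_trees = set(), [("S",)], set()
    while todo:
        tree = todo.pop()
        if tree in seen:
            continue
        seen.add(tree)
        if is_terminal(tree):
            terminal_trees.add(tree)
        todo.extend(successors(tree, bottom_up))
    return terminal_trees


def show(tree):
    """Render a tree in the bracket notation used in the text."""
    if len(tree) == 1:
        return tree[0]
    return tree[0] + "[" + " ".join(show(c) for c in tree[1:]) + "]"


if __name__ == "__main__":
    print(sorted(show(t) for t in language(bottom_up=False)))
    print(sorted(show(t) for t in language(bottom_up=True)))
```

Under the unrestricted interpretation the four trees f[a a], f[a b], f[b a] and f[b b] are generated, because F may copy A before A is rewritten. Under the bottom-up interpretation only f[a a] and f[b b] are generated, because A must be rewritten to a terminal tree before F is applied.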
The class of indexed languages and the class IO are incomparable ([Fischer, 1968]). How do these classes compare with the target languages? Is it possible to iterate the CFL-RECOG procedure and obtain results about (bottom-up) context-free tree languages from regular tree languages of a "higher order" (see [Maibaum, 1974])? Is there any sense in considering pushdown tree automata?

- General computability on trees: [Rus, 1967], [Mahn, 1969], the Vienna method.
- Tree walking automata (at each moment of time the finite automaton is at one node of the tree; depending on its state and the label of the node it goes to the father node or to one of the son nodes): [Aho & Ullman, 1971], [Martin & Vere, 1970].
- Tree adjunct grammars (another, linguistically motivated, way of generating tree languages): [Joshi, Levy & Takahashi, 1973].
- Lindenmayer tree systems (parallel rewriting): [Čulik, 1974], [Čulik & Maibaum, 1974], [Engelfriet, 1974], [Szilard, 1974].

Bibliography

A.V. Aho and J.D. Ullman, 1971. Translations on a context-free grammar, Inf. & Control 19, 439-475.
S. Alagić, 1973. Natural state transformations, Techn. Report 73B-2, Univ. of Massachusetts at Amherst.
M.A. Arbib and Y. Give'on, 1968. Algebra automata, I & II, Inf. & Control 12, 331-370.
B.S. Baker, 1973. Tree transductions and families of tree languages, Report TR-9-73, Harvard University (abstract in: 5th Theory of Computing, 200-206).
D.B. Benson, 1970. Syntax and semantics: a categorical view, Inf. & Control 17, 145-160.
D.B. Benson, 1974. Semantic preserving translations, Working paper, Washington State University, Washington.
E. Bertsch, 1973. Some considerations about classes of mappings between context-free derivation systems, Lecture Notes in Computer Science 2, 278-283.
D. Bjørner, 1972. Finite state tree computations, IBM Report RJ 1053.
W.S. Brainerd, 1968. The minimalization of tree automata, Inf. & Control 13, 484-491.
W.S. Brainerd, 1969. Tree generating regular systems, Inf. & Control 14, 217-231.
W.S. Brainerd, 1969a. Semi-Thue systems and representations of trees, 10th SWAT, 240-244.
H.W. Buttelmann, 1971. On generalized finite automata and unrestricted generative grammars, 3rd Theory of Computing, 63-77.
O.L. Costich, 1972. A Medvedev characterization of sets recognized by generalized finite automata, Math. Syst. Th. 6, 263-267.
S.C. Crespi Reghizzi and P. Della Vigna, 1973. Approximation of phrase markers by regular sets, Automata, Languages and Programming (ed. M. Nivat), North-Holland Publ. Co., 367-376.
K. Čulik II, 1974. Structured OL-systems, L-Systems (eds. Rozenberg & Salomaa), Lecture Notes in Computer Science 15, 216-229.
K. Čulik II and T.S.E. Maibaum, 1974. Parallel rewriting systems on terms, Automata, Languages and Programming (ed. Loeckx), Lecture Notes in Computer Science 14, 495-510.
J. Doner, 1970. Tree acceptors and some of their applications, J. Comp. Syst. Sci. 4, 406-451 (announced in Notices Am. Math. Soc. 12 (1965), 819 as "Decidability of the weak second-order theory of two successors").
P.J. Downey, 1974. Formal languages and recursion schemes, Report TR 16-74, Harvard University.
S. Eilenberg and J.B. Wright, 1967. Automata in general algebras, Inf. & Control 11, 452-470.
C.A. Ellis, 1970. Probabilistic tree automata, 2nd Theory of Computing, 198-205.
J. Engelfriet, 1971. Bottom-up and top-down tree transformations - a comparison, Memorandum 19, T.H. Twente, Holland (to be publ. in Math. Syst. Th.).
J. Engelfriet, 1972. A note on infinite trees, Inf. Proc. Letters 1, 229-232.
J. Engelfriet, 1974. Surface tree languages and parallel derivation trees, DAIMI Report PB-44, Aarhus University, Denmark.
M.J. Fischer, 1968. Grammars with macro-like productions, 9th SWAT, 131-142 (Doctoral dissertation, Harvard University).
S. Ginsburg and M.A. Harrison, 1967. Bracketed context-free languages, J. Comp. Syst. Sci. 1, 1-23.
J.A. Goguen and J.W. Thatcher, 1974. Initial algebra semantics, 15th SWAT.
J.M. Hart, 1974. Acceptors for the derivation languages of phrase-structure grammars, Inf. & Control 25, 75-92.
T. Ito and S. Ando, 1974. A complete axiom system of super-regular expressions, Proc. IFIP Congress 74, 661-665.
A.K. Joshi, L.S. Levy and M. Takahashi, 1973. A tree generating system, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 453-465.
D.E. Knuth, 1967. A characterization of parenthesis languages, Inf. & Control 11, 269-289.
D.E. Knuth, 1968. Semantics of context-free languages, Math. Syst. Th. 2, 127-145 (see also: correction in Math. Syst. Th. 5 (1971), 95-96, and "Examples of formal semantics" in Lecture Notes in Mathematics 188 (ed. Engeler)).
S. Kosaraju, 1973. Context-sensitiveness of translational languages, 7th Princeton Conf. on Inf. Sci. and Syst.
L.S. Levy and A.K. Joshi, 1973. Some results in tree automata, Math. Syst. Th. 6, 334-342.
M. Magidor and G. Moran, 1969. Finite automata over finite trees, Techn. Report No. 30, Hebrew University, Jerusalem.
M. Magidor and G. Moran, 1970. Probabilistic tree automata and context-free languages, Israel J. Math. 8, 340-348.
F.K. Mahn, 1969. Primitiv-rekursive Funktionen auf Termmengen, Archiv f. Math. Logik und Grundlagenforschung 12, 54-65.
T.S.E. Maibaum, 1972. The characterization of the derivation trees of context-free sets of terms as regular sets, 13th SWAT, 224-230.
T.S.E. Maibaum, 1974. A generalized approach to formal languages, J. Comp. Syst. Sci. 8, 409-439.
D.F. Martin and S.A. Vere, 1970. On syntax-directed transduction and tree-transducers, 2nd Theory of Computing, 129-135.
R. McNaughton, 1967. Parenthesis grammars, Journal of the ACM 14, 490-500.
J. Mezei and J.B. Wright, 1967. Algebraic automata and context-free sets, Inf. & Control 11, 3-29.
D.E. Muller, 1968. Use of multiple index matrices in generalized automata theory, 9th SWAT, 395-404.
W.F. Ogden and W.C. Rounds, 1972. Composition of n tree transducers, 4th Theory of Computing.
C. Pair and A. Quere, 1968. Définition et étude des bilangages réguliers, Inf. & Control 13, 565-593.
M.C. Paull and S.H. Unger, 1968. Structural equivalence of context-free grammars, J. Comp. Syst. Sci. 2, 427-463.
M.O. Rabin, 1969. Decidability of second-order theories and automata on infinite trees, Transactions of the Am. Math. Soc. 141, 1-35.
F.L. de Remer, 1974. Transformational grammars for languages and compilers, Lecture Notes in Computer Science 21.
G. Ricci, 1973. Cascades of tree-automata and computations in universal algebras, Math. Syst. Th. 7, 201-218.
B.K. Rosen, 1971. Subtree replacement systems, Ph.D. Thesis, Harvard University.
B.K. Rosen, 1973. Tree-manipulating systems and Church-Rosser theorems, Journal of the ACM 20, 160-187.
W.C. Rounds, 1968. Trees, transducers and transformations, Ph.D. Dissertation, Stanford University.
W.C. Rounds, 1969. Context-free grammars on trees, 1st Theory of Computing, 143-148.
W.C. Rounds, 1970a. Tree-oriented proofs of some theorems on context-free and indexed languages, 2nd Theory of Computing, 109-116.
W.C. Rounds, 1970b. Mappings and grammars on trees, Math. Syst. Th. 4, 257-287.
W.C. Rounds, 1973. Complexity of recognition in intermediate-level languages, 14th SWAT, 145-158.
T. Rus, 1967. Some observations concerning the application of the electronic computers in order to solve nonarithmetical problems, Mathematica 9, 343-360.
C.D. Shepard, 1969. Languages in general algebras, 1st Theory of Computing, 155-163.
A.L. Szilard, 1974. Ω-OL systems, L-Systems (eds. Rozenberg and Salomaa), Lecture Notes in Computer Science 15, 258-291.
M. Takahashi, 1972. Regular sets of strings, trees and W-structures, Dissertation, University of Pennsylvania.
M. Takahashi, 1973. Primitive transformations of regular sets and recognizable sets, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 475-480.
J.W. Thatcher, 1967. Characterizing derivation trees of context-free grammars through a generalization of finite automata theory, J. Comp. Syst. Sci. 1, 317-322.
J.W. Thatcher, 1970. Generalized$^2$ sequential machine maps, J. Comp. Syst. Sci. 4, 339-367 (also IBM Report RC 2466; also published in 1st Theory of Computing as "Transformations and translations from the point of view of generalized finite automata theory").
J.W. Thatcher, 1973. Tree automata: an informal survey, Currents in the Theory of Computing (ed. Aho), Prentice-Hall, 143-172 (also published in 4th Princeton Conf. on Inf. Sci. and Systems, 263-276, as "There's a lot more to finite automata theory than you would have thought").
J.W. Thatcher and J.B. Wright, 1968. Generalized finite automata theory with an application to a decision problem of second-order logic, Math. Syst. Th. 2, 57-81.
R. Turner, 1973. An infinite hierarchy of term languages - an approach to mathematical complexity, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 593-608.
R.T. Yeh, 1971. Some structural properties of generalized automata and algebras, Math. Syst. Th. 5, 306-318.
