Syntax Analysis

Syllabus
Syntax analysis - Role of a parser - Classification of parsing techniques - Top down parsing - First and Follow - LL(1) grammars - Non-recursive predictive parsing - Error recovery in predictive parsing.

Contents
3.1 Introduction
3.2 Role of Parser
3.3 Context Free Grammar (CFG) (Aug/Sept.-07, Set-2, Marks 8; May-08,09, Set-2,4, Aug/Sept.-08, Set-1, April-09, Set-2, Marks 8)
3.4 Classification of Parsing Techniques
3.5 Top Down Parsing (May-08,09, Set-1,2,4, Aug/Sept.-08, Set-1, April-09, Set-2, January-10, Set-4, Marks 8)
3.6 Recursive Descent Parser (May-05,09, Set-4,3, Nov.-03, Set-3, Marks 8)
3.7 LL(1) Grammars
3.8 Non Recursive Predictive Parsing
3.9 Strategies to Recover from Syntactic Errors
3.10 Error Recovery in Predictive Parsing

3.1 Introduction
Syntax analysis is the second phase of compilation. The syntax analyzer checks the syntax of the language. It takes the tokens produced by the lexical analyzer and groups them so that a programming structure (syntax) can be recognized. If, after grouping the tokens, no valid syntax can be recognized, a syntactic error is generated. This overall process is called syntax checking of the language.

Definition of parser: Parsing, or syntax analysis, is a process which takes an input string w and produces either a parse tree (syntactic structure) or syntactic errors.

For example:
a = b + 10;
The above programming statement is first given to the lexical analyzer, which divides it into a group of tokens. The syntax analyzer takes the tokens as input and generates a tree-like structure called a parse tree.

[Figure: the root "=" has children "a" and "+"; the node "+" has children "b" and "10".]
Fig. 3.1.1 Parse tree for a = b + 10

The parse tree drawn above shows how a statement gets parsed according to its syntactic specification.
Basic Issues in Parsing
There are two important issues in parsing:
i) Specification of syntax
ii) Representation of input after parsing.

* A very important issue in parsing is the specification of syntax in a programming language. Specification of syntax means how to write any programming statement. Such a specification should have certain characteristics:
i) It should be precise and unambiguous.
ii) It should be detailed, i.e. it should cover all the details of the programming language.
iii) It should be complete.
Such a specification is called a "Context Free Grammar".

TECHNICAL PUBLICATIONS - An up thrust for knowledge

* Another important issue in parsing is the representation of the input after parsing. This matters because all the subsequent phases of the compiler take their information from the parse tree being generated, so the information conveyed by an input programming statement must not be altered by building the syntax tree for it.
* Lastly, the most crucial issue is the parsing algorithm by which we get the parse tree for the given input. We will discuss two approaches to parsing, top-down and bottom-up, and study the parsing algorithms belonging to each. We are mainly interested in the following issues:
* How do these algorithms work?
* Are they efficient?
* What are their merits and limitations?
* What kind of input do they require?

Keep in mind that at this point we are interested only in the syntax of the language, not in its meaning. Hence we do only syntax analysis in this phase. Checking the meaning (semantics) of syntactically correct input is studied in the next phase of compilation.

The modified view of the front end is as shown below.
[Figure: the lexical analyzer, parser and semantic analyzer connected through module interfaces, producing intermediate code, with a shared error handler.]
Fig.
3.1.2 Front end of compiler

Why Lexical and Syntax Analyzer are Separated Out?
[Figure: the parser sends "demand token" to the lexical analyzer, which reads the input source program and replies with "supply token".]
The lexical analyzer scans the input program and collects the tokens from it. The parser, on the other hand, builds a parse tree using these tokens. These are two important activities and they are carried out independently by these two phases. Separating the two phases has two advantages: firstly, it accelerates the process of compilation, and secondly, the errors in the source input can be identified precisely.

3.2 Role of Parser
In the process of compilation the parser and the lexical analyzer work together. That means, when the parser requires a string of tokens it invokes the lexical analyzer. In turn, the lexical analyzer supplies tokens to the syntax analyzer (parser).

[Figure: the source program enters the lexical analyzer; the parser demands tokens and the lexical analyzer supplies them; the parser passes the parse tree to the rest of the compiler; both phases use the symbol table and the error handler.]
Fig. 3.2.1 Role of parser

The parser collects a sufficient number of tokens and builds a parse tree. By building the parse tree, the parser finds the syntactical errors, if any. The parser should also recover from commonly occurring errors so that the remaining input can still be processed.

3.3 Context Free Grammar (CFG)
The context free grammar G is a collection of the following things.
1. V is a set of non-terminals.
2. T is a set of terminals.
3. S is a start symbol.
4.
P is a set of production rules.
Thus G can be represented as G = (V, T, P, S).
The production rules are given in the following form:
Non-terminal → (V ∪ T)*

Example: Let the language be L = a^n b^n where n ≥ 1. Let G = (V, T, P, S) where V = {S}, T = {a, b} and S is the start symbol. Give the production rules.
Solution:
P = { S → aSb
      S → ab }
These production rules define the language a^n b^n.

The non-terminal symbols occur on the left hand side; these are the symbols which need to be expanded. The terminal symbols are nothing but the tokens used in the language. Thus any language construct can be defined by a context free grammar. For example, a declarative sentence could be defined as follows:
State → Type List Terminator
Type → int | float
List → List , id
List → id
Terminator → ;
Using the above rules we can derive int id, id, id;

[Figure: State expands to Type, List and Terminator; Type derives int, List derives id, id, id through repeated use of List → List , id, and Terminator derives ;]
Fig. 3.3.1 Parse tree for derivation of int id, id, id;

Hence int id, id, id; can be defined by means of the above context free grammar.

The following rules are to be followed while writing a CFG.
1. A single non-terminal should be on the L.H.S.
2. A rule should always be in the form L.H.S. → R.H.S., where the R.H.S. may be a combination of non-terminal and terminal symbols.
3. The NULL derivation can be specified as NT → ε.
4. One of the non-terminals should be the start symbol, and conventionally the rules for this non-terminal are written first.

Derivation and Parse Trees
Derivation from S means generation of a string w from S. For constructing a derivation two things are important:
i) Choice of non-terminal from several others.
ii) Choice of rule from the production rules for the corresponding non-terminal.

Definition of derivation tree
Let G = (V, T, P, S) be a context free grammar. The derivation tree is a tree which can be constructed using the following properties:
i) The root has label S.
ii) Every vertex has a label from (V ∪ T ∪ {ε}).
iii) If there exists a vertex A with children R1, R2, ..., Rn then there should be a production A → R1 R2 ... Rn.
iv) The leaf nodes are from set T and the interior nodes are from set V.

Instead of choosing an arbitrary non-terminal one can choose:
i) the leftmost non-terminal in a sentential form, in which case it is called leftmost derivation, or
ii) the rightmost non-terminal in a sentential form, in which case it is called rightmost derivation.

Example: Consider the grammar given below -
E → E + E | E - E | E * E | E / E | a | b
Obtain leftmost and rightmost derivations for the string a + b * a + b.
Solution:
Leftmost derivation:
E ⇒ E + E
  ⇒ E + E + E
  ⇒ a + E + E
  ⇒ a + E * E + E
  ⇒ a + b * E + E
  ⇒ a + b * a + E
  ⇒ a + b * a + b
Rightmost derivation:
E ⇒ E + E
  ⇒ E + E + E
  ⇒ E + E + b
  ⇒ E + E * E + b
  ⇒ E + E * a + b
  ⇒ E + b * a + b
  ⇒ a + b * a + b

Example: Consider the grammar
S → (L) | a
L → L, S | S
a) What are the terminals, non-terminals and start symbol?
b) Find parse trees for the following sentences:
   i) (a, a)
   ii) (a, (a, a))
   iii) (a, ((a, a), (a, a)))
c) Construct a leftmost derivation for each of the sentences in (b).
d) Construct a rightmost derivation for each of the sentences in (b).
e) What language does the grammar generate?
Solution:
a) The terminals are T = {a, (, ), ","} (the comma is also a terminal).
   The non-terminals are V = {L, S}.
   The start symbol is S.
b) The parse tree for i) (a, a) is
   [Figure: parse tree for (a, a)]
   ii) (a, (a, a))
   [Figure: parse tree for (a, (a, a))]
   iii) (a, ((a, a), (a, a)))
   [Figure: parse tree for (a, ((a, a), (a, a)))]

c) Leftmost derivations
i) (a, a)
S ⇒ (L) ⇒ (L, S) ⇒ (S, S) ⇒ (a, S) ⇒ (a, a)
ii) (a, (a, a))
S ⇒ (L) ⇒ (L, S) ⇒ (S, S) ⇒ (a, S) ⇒ (a, (L)) ⇒ (a, (L, S)) ⇒ (a, (S, S)) ⇒ (a, (a, S)) ⇒ (a, (a, a))
iii) (a, ((a, a), (a, a)))
S ⇒ (L) ⇒ (L, S) ⇒ (S, S) ⇒ (a, S) ⇒ (a, (L)) ⇒ (a, (L, S)) ⇒ (a, (S, S)) ⇒ (a, ((L), S)) ⇒ (a, ((L, S), S)) ⇒ (a, ((S, S), S)) ⇒ (a, ((a, S), S)) ⇒ (a, ((a, a), S)) ⇒ (a, ((a, a), (L))) ⇒ (a, ((a, a), (L, S))) ⇒ (a, ((a, a), (S, S))) ⇒ (a, ((a, a), (a, S))) ⇒ (a, ((a, a), (a, a)))

d) Rightmost derivations
i) (a, a)
S ⇒ (L) ⇒ (L, S) ⇒ (L, a) ⇒ (S, a) ⇒ (a, a)
ii) (a, (a, a))
S ⇒ (L) ⇒ (L, S) ⇒ (L, (L)) ⇒ (L, (L, S)) ⇒ (L, (L, a)) ⇒ (L, (S, a)) ⇒ (L, (a, a)) ⇒ (S, (a, a)) ⇒ (a, (a, a))
iii) (a, ((a, a), (a, a)))
S ⇒ (L) ⇒ (L, S) ⇒ (L, (L)) ⇒ (L, (L, S)) ⇒ (L, (L, (L))) ⇒ (L, (L, (L, S))) ⇒ (L, (L, (L, a))) ⇒ (L, (L, (S, a))) ⇒ (L, (L, (a, a))) ⇒ (L, (S, (a, a))) ⇒ (L, ((L), (a, a))) ⇒ (L, ((L, S), (a, a))) ⇒ (L, ((L, a), (a, a))) ⇒ (L, ((S, a), (a, a))) ⇒ (L, ((a, a), (a, a))) ⇒ (S, ((a, a), (a, a))) ⇒ (a, ((a, a), (a, a)))

e) This grammar generates the symbol a and all well-formed parenthesized lists whose comma-separated elements are either a or, recursively, such lists.

Example: Consider a grammar over the terminals 0 and 1 with non-terminals S, A and B (its rule list is illegible in the source; the derivations below use S → 0A, S → 1B, A → 0S, A → 1B, A → 1, B → 0A and B → 1S). Construct leftmost derivations for i) 0101 ii) 1100101.
Solution:
i) Leftmost derivation:
S ⇒ 0A ⇒ 01B ⇒ 010A ⇒ 0101
ii) Leftmost derivation:
S ⇒ 1B ⇒ 11S ⇒ 110A ⇒ 1100S ⇒ 11001B ⇒ 110010A ⇒ 1100101

Ambiguous Grammar
A grammar G is said to be ambiguous if it generates more than one parse tree for some sentence of the language L(G).
For example:
E → E + E | E * E | (E) | id
Then for id + id * id there are two parse trees.
[Figure: (a) Parse tree 1, with + at the root; (b) Parse tree 2, with * at the root]
Fig. 3.3.2 Ambiguous grammar

Example: Show that the following grammar is ambiguous.
S → aSbS
S → bSaS
S → ε
Solution: Consider the string 'abab'. We can construct two parse trees for deriving 'abab'.
[Figure: Parse tree 1 and Parse tree 2 for abab]
The rightmost derivation for abab is -
S ⇒ aSbS ⇒ aSbaSbS ⇒
aSbaSb ⇒ aSbab ⇒ abab
A second rightmost derivation, S ⇒ aSbS ⇒ aSb ⇒ abSaSb ⇒ abSab ⇒ abab, gives a different parse tree, so the grammar is ambiguous.
This is a language containing all the strings with an equal number of a's and b's.

Example: Prove that the following grammar is ambiguous.
S → AB
S → aaB
A → a
A → Aa
B → b
Solution: An ambiguous grammar derives two different parse trees for the same input. Consider the input aab. It can be derived as S ⇒ aaB ⇒ aab and also as S ⇒ AB ⇒ AaB ⇒ aaB ⇒ aab.
[Figure: Parse tree 1 and Parse tree 2 for aab]
As there are two different parse trees for the input aab, the grammar is ambiguous.

3.4 Classification of Parsing Techniques
As we know, there are two parsing techniques. These parsing techniques work on the following principle.
1. The parser scans the input string from left to right and identifies whether the derivation is leftmost or rightmost.
2. The parser makes use of production rules for choosing the appropriate derivation. The different parsing techniques use different approaches in selecting the appropriate rules for derivation. Finally a parse tree is constructed.

When the parse tree is constructed from the root and expanded down to the leaves, the parser is called a top-down parser. The name itself tells us that the parse tree is built from top to bottom.
When the parse tree is constructed from the leaves up to the root, the parser is called a bottom-up parser. Thus the parse tree is built in a bottom-up manner.

[Figure: tree of parsing techniques; the legible leaves are SLR parser, LALR parser and LR parser.]
Fig. 3.4.1 Parsing techniques

Let us discuss the parsing techniques in detail.

3.5 Top Down Parsing
Consider a grammar
S → xPz
P → yw | y
Consider the input string xyz, held in the input buffer.
Now we will construct the parse tree for the above grammar deriving the given input string, using the top-down approach.
Step 1: The first leftmost leaf of the parse tree matches the first input symbol. Hence we advance the input pointer. The next leaf node is P. We have to expand the node P.
After expansion we get the node y, which matches the input symbol y.
Step 2: Now the next node is w, which does not match the input symbol. Hence we go back to see whether there is another alternative for P. The other alternative for P is y, which matches the current input symbol. Thus we can produce a successful parse tree for the given input string.
Step 3: We halt and declare that the parsing is completed successfully.
Fig. 3.5.1 (a), (b), (c)

In top-down parsing the selection of the proper rule is a very important task, and this selection is based on trial and error. That means we select a particular rule, and if it does not produce the correct input string we backtrack and try another production. This process is repeated until we get the correct input string. If, after trying all the productions, every production is found unsuitable for matching the string, the parse tree cannot be built.

Problems with Top-down Parsing
There are certain problems in top-down parsing. In order to implement the parsing we need to eliminate these problems. Let us discuss these problems and how to remove them.

1. Backtracking
Backtracking is a technique in which, for the expansion of a non-terminal symbol, we choose one alternative and, if some mismatch occurs, we try another alternative if any.
For example:
S → xPz
P → yw | y
For the input xyz the parser first tries P → yw and, on the mismatch at w, backtracks and tries P → y.
[Figure: backtracking for this grammar]

2. Left recursion
A left recursive grammar is a grammar of the form
A ⇒+ Aα
Here ⇒+ means deriving the input in one or more steps. A is a non-terminal and α denotes some input string. If left recursion is present in the grammar it creates serious problems: because of left recursion the top-down parser can enter an infinite loop. This is shown in Fig. 3.5.3.
Fig.
3.5.3 Left recursion

Thus the expansion of A causes further expansion of A only, and due to the generation of A, Aα, Aαα, Aααα, ..., the input pointer is never advanced. This causes a major problem in top-down parsing and therefore elimination of left recursion is a must.

To eliminate left recursion we need to modify the grammar. Let G be a context free grammar having a production rule with left recursion:
A → Aα
A → β                                  ... (1)
Then we eliminate left recursion by re-writing the production rule as:
A → βA'
A' → αA'
A' → ε                                 ... (2)
Thus a new symbol A' is introduced. We can also verify whether the modified grammar is equivalent to the original or not.
[Figure: parse trees for the original and the modified grammar]
Fig. 3.5.4

For example: Consider the grammar
E → E + T | T
We can map this grammar to the rule A → Aα | β with A = E, α = +T, β = T. Then using equation (2) the rule becomes
E → TE'
E' → +TE' | ε
Similarly, for the rule
T → T * F | F
we can eliminate left recursion as
T → FT'
T' → *FT' | ε
The grammar for arithmetic expressions can be equivalently written as -

E → E + T | T            E → TE'
                         E' → +TE' | ε
T → T * F | F     ⇒      T → FT'
                         T' → *FT' | ε
F → (E) | id             F → (E) | id

Example: Consider the following grammar
A → ABd | Aa | a
B → Be | b
Remove left recursion.
Solution: Consider the rule
A → ABd | Aa | a
We map this grammar to the rule A → Aα | β.
For A → ABd | a we have α = Bd and β = a. The left recursion can be eliminated by re-writing the production rule as:
A → aA'
A' → BdA'
A' → ε
For A → Aa | a we have α = a and β = a. This left recursion can be eliminated as -
A → aA'
A' → aA'
A' → ε
To summarize:
A → aA'
A' → BdA' | aA' | ε
Next consider B → Be | b. We map this grammar to the rule A → Aα | β with A = B, α = e, β = b, and use the rules A → βA', A' → αA' | ε:
B → bB'
B' → eB'
B' → ε
To summarize, the grammar without left recursion will be
A → aA'
A' → BdA' | aA' | ε
B → bB'
B' → eB' | ε

3. Left factoring
If the grammar is left factored then it becomes suitable for use.
Basically, left factoring is used when it is not clear which of two alternatives should be used to expand the non-terminal. By left factoring we may be able to re-write the production so that the decision can be deferred until enough of the input is seen to make the right choice.
In general, if A → αβ1 | αβ2 is a production then it is not possible for us to decide whether to choose the first rule or the second. In such a situation the above grammar can be left factored as
A → αA'
A' → β1 | β2
For example: Consider the following grammar.
S → iEtS | iEtSeS | a
E → b
The left factored grammar becomes
S → iEtSS' | a
S' → eS | ε
E → b

Example: Do left factoring in the following grammar -
A → aAB | aA | a
B → bB | b
Solution: If a rule has the form A → αβ1 | αβ2 | ... then the grammar needs to be left factored.
Consider A → aAB | aA | a, where α = a, β1 = AB, β2 = A and β3 = ε. We convert it to
A → aA'
A' → AB | A | ε
Similarly, for B → bB | b we have α = b, β1 = B and β2 = ε, so
B → bB'
B' → B | ε
To summarize, the grammar after the left factoring operation will be -
A → aA'
A' → AB | A | ε
B → bB'
B' → B | ε
(The alternatives AB and A of A' still share the prefix A, so one more left factoring pass would be needed before a predictive parser could choose between them.)

4. Ambiguity
An ambiguous grammar is not desirable in top-down parsing. Hence we need to remove the ambiguity from the grammar if it is present.
For example:
E → E + E | E * E | id
is an ambiguous grammar. We will design the parse trees for id + id * id as follows.
[Figure: (a) Parse tree 1 (b) Parse tree 2]
Fig. 3.5.5 Ambiguous grammar
For removing the ambiguity we apply one rule: if the grammar has a left associative operator (such as +, -, *, /) then induce left recursion, and if the grammar has a right associative operator (the exponential operator) then induce right recursion.
The unambiguous grammar is
E → E + T
E → T
T → T * F
T → F
F → id
Note that this grammar is unambiguous but it is left recursive, and elimination of such left recursion is again a must.
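As an illustration of why that last step matters, the rewriting rule for immediate left recursion (A → Aα | β becomes A → βA', A' → αA' | ε) can be sketched in Python and applied to the unambiguous grammar above. This is only a minimal sketch under stated assumptions: the dictionary representation, the function name eliminate_immediate_left_recursion and the use of an empty list for ε are choices made for this example, and indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    `grammar` maps each non-terminal to a list of alternatives; every
    alternative is a list of symbols and the empty list stands for epsilon.
    Only immediate left recursion is handled (assumption of this sketch).
    """
    result = {}
    for head, alternatives in grammar.items():
        # alpha parts: alternatives of the form head alpha
        alphas = [alt[1:] for alt in alternatives if alt and alt[0] == head]
        # beta parts: alternatives that do not start with head
        betas = [alt for alt in alternatives if not alt or alt[0] != head]
        if not alphas:
            result[head] = [list(alt) for alt in alternatives]
            continue
        new_head = head + "'"
        result[head] = [beta + [new_head] for beta in betas]
        result[new_head] = [alpha + [new_head] for alpha in alphas] + [[]]
    return result


# The left recursive but unambiguous expression grammar from the text.
expression_grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["id"]],
}

converted = eliminate_immediate_left_recursion(expression_grammar)
for head, alternatives in converted.items():
    rhs = " | ".join(" ".join(alt) if alt else "ε" for alt in alternatives)
    print(f"{head} → {rhs}")
```

The converted grammar has the same shape as the E, E', T, T' grammar derived by hand in the left recursion discussion, except that F has only the alternative id here.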
[Figure: top-down parsers divide into backtracking parsers and predictive parsers; predictive parsers divide into recursive descent and LL(1) parsers.]
Fig. 3.5.6 Types of top-down parsers

There are two ways in which top-down parsing can be performed:
1. Backtracking
2. Predictive parsing
A backtracking parser tries different production rules to find a match for the input string, backtracking each time. Backtracking is more powerful than predictive parsing, but it is slower and in general requires exponential time. Hence backtracking is not preferred for practical compilers.
As the name suggests, the predictive parser tries to predict the next construction using one or more lookahead symbols from the input string. There are two types of predictive parsers:
1. Recursive descent
2. LL(1) parser
Let us discuss these types along with some examples.

3.6 Recursive Descent Parser
A parser that uses a collection of recursive procedures for parsing the given input string is called a Recursive Descent (RD) parser. In this type of parser the CFG is used to build the recursive routines. The R.H.S. of each production rule is directly converted to a program. For each non-terminal a separate procedure is written, and the body of the procedure (code) is the R.H.S. of the corresponding non-terminal.

Basic steps for construction of RD parser
The R.H.S. of the rule is directly converted into program code symbol by symbol.
1. If the input symbol is a non-terminal then a call to the procedure corresponding to that non-terminal is made.
2. If the input symbol is a terminal then it is matched with the lookahead from the input. The lookahead pointer has to be advanced on matching the input symbol.
3. If the production rule has many alternates then all these alternates have to be combined into a single procedure body.
4.
The parser should be activated by the procedure corresponding to the start symbol.

Let us take one example to understand the construction of an RD parser. Consider the grammar having start symbol E.
E → num T
T → * num T | ε

procedure E
{
    if lookahead = num then
    {
        match(num);
        T;                      /* call to procedure T */
    }
    else
        error;
    if lookahead = $
    {
        declare success;        /* return on success */
    }
    else
        error;
}   /* end of procedure E */

procedure T
{
    if lookahead = '*'
    {
        match('*');
        if lookahead = num
        {
            match(num);
            T;
        }                       /* inner if closed */
        else
            error;
    }                           /* outer if closed */
    else
        NULL;                   /* indicates ε; here the other alternate is combined into the same procedure */
}   /* end of procedure T */

procedure match(token t)
{
    if lookahead = t
        lookahead = next_token; /* lookahead pointer is advanced */
    else
        error;
}   /* end of procedure match */

procedure error
{
    print("Error!!!");
}   /* end of procedure error */

Fig. 3.6.1 Pseudo code for recursive descent parser

The parser constructed above is a recursive descent parser for the given grammar. We can see from the code that a procedure is written for each non-terminal and that the procedure body is simply a mapping of the R.H.S. of the corresponding rule.
Consider the input string 3*4. We will parse this string using the procedures given above.

E → num T
    The parser is activated by calling procedure E. Since the first input character 3 matches num, the procedure match(num) is invoked and the lookahead then points to the next token. Then a call to procedure T is made.
T → * num T
    A match with * is found, hence lookahead = next_token. Now 4 matches num, hence the procedure match(num) is again fulfilled. Then the procedure for T is invoked.
T → ε
    T is replaced by ε. As the lookahead pointer points to $ we quit by reporting success.
Declare success!

Thus the input string can be parsed using a recursive descent parser.
Construction of recursive descent parser is easy.
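The pseudo code of Fig. 3.6.1 can be turned into a runnable sketch. The version below is one possible Python rendering for the grammar E → num T, T → * num T | ε, not a definitive implementation. The names tokenize, Parser, ParseError and accepts are assumptions introduced for this illustration, and the tokenizer (digits become num, every other non-blank character is its own token) is deliberately simplistic.

```python
import re


class ParseError(Exception):
    """Raised when the lookahead does not fit the grammar."""


def tokenize(text):
    # Assumption of this sketch: a run of digits is a `num` token and every
    # other non-blank character is a token by itself; `$` marks end of input.
    tokens = ["num" if lexeme.isdigit() else lexeme
              for lexeme in re.findall(r"\d+|\S", text)]
    return tokens + ["$"]


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.position = 0

    @property
    def lookahead(self):
        return self.tokens[self.position]

    def match(self, expected):
        # Corresponds to procedure match: advance on success, else error.
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.position += 1

    def parse_E(self):
        # E -> num T, followed by the end marker.
        self.match("num")
        self.parse_T()
        if self.lookahead != "$":
            raise ParseError(f"unexpected {self.lookahead!r}")

    def parse_T(self):
        # T -> * num T | epsilon; both alternates share one procedure.
        if self.lookahead == "*":
            self.match("*")
            self.match("num")
            self.parse_T()
        # otherwise T derives epsilon and nothing is consumed


def accepts(text):
    try:
        Parser(text).parse_E()
        return True
    except ParseError:
        return False


print(accepts("3*4"))   # the input traced in the text
print(accepts("3*"))    # a num is missing after *
```

Each non-terminal maps to one method, and the ε alternate of T is simply the case in which the method returns without consuming input, which mirrors the trace given for 3*4.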
However, the programming language that we choose for the RD parser should support recursion. The internal details are not accessible in this type of parser. For instance, in this parser we cannot access the current leftmost sentential form. Secondly, at any instant we cannot access the stack containing the recursive calls.
Writing procedures for a left recursive grammar (A → Aα | β) is critical, because a procedure for A would call itself without consuming any input. To keep the procedures simple we first have to transform the grammar (eliminate the left recursion and left factor it). We cannot write recursive descent parsers for all types of context free grammar.

Example: Consider the following grammar
E → T + E | T
T → V * T | V
V → id
Write down procedures for the non-terminals of the grammar to make a recursive descent parser.
Solution:
Procedure E()
{
    T();
    if (lookahead = '+')
    {
        match('+');
        E();
    }
    /* otherwise the alternate E → T was used and nothing more is consumed */
    if (lookahead = '$')
        declare SUCCESS;
    else
        error();
}
Procedure T()
{
    V();
    if (lookahead = '*')
    {
        match('*');
        T();
    }
    /* otherwise the alternate T → V was used */
}
Procedure V()
{
    if (lookahead = 'id')
        match('id');
    else
        error();
}
Procedure match(token t)
{
    if (lookahead = t)
        lookahead = next_token;
    else
        error();
}
Procedure error()
{
    print("Error!");
}

Advantages of recursive descent parser
1. Recursive descent parsers are simple to build.
2. Recursive descent parsers can be constructed with the help of the parse tree.

Limitations of recursive descent parser
1. Recursive descent parsers are not very efficient as compared to other parsing techniques.
2. There are chances that the program for the RD parser may enter an infinite loop for some input.
3. A recursive descent parser cannot provide good error messaging.
4. It is difficult to parse the string if the lookahead symbol is arbitrarily long.

3.7 LL(1) Grammars
This top-down parsing algorithm is of the non-recursive type. In this type of parsing a table is built.
For LL(1) - The first L means the input is scanned from left to right.
The second L means it uses the leftmost derivation for the input string. The number 1 means it uses only one input symbol (lookahead) to predict the parsing process.
The simple block diagram for the LL(1) parser is as given below.
[Figure: the LL(1) parser reads the input buffer, uses a stack and the parsing table, and produces the output.]
Fig. 3.7.1 Model for LL(1) parser

The data structures used by LL(1) are i) Input buffer ii) Stack iii) Parsing table.
The LL(1) parser uses the input buffer to store the input tokens. The stack is used to hold the left sentential form. The symbols on the R.H.S. of a rule are pushed onto the stack in reverse order, i.e. from right to left. Thus the use of a stack makes this algorithm non-recursive. The table is basically a two dimensional array. The table has a row for each non-terminal and a column for each terminal. The table can be represented as M[A, a] where A is a non-terminal and a is the current input symbol.
The parser works as follows -
The parsing program reads the top of the stack and the current input symbol. With the help of these two symbols the parsing action is determined. The parsing actions can be:

Top    Input token    Parsing action
$      $              Parsing successful, halt.
a      a              Pop a and advance lookahead to the next token.
a      b              Error.
A      a              Refer table M[A, a]; if the entry at M[A, a] is error, report Error.
A      a              Refer table M[A, a]; if the entry at M[A, a] is A → PQR then pop A, then push R, then push Q, then push P.

The parser consults the table M[A, a] each time it takes a parsing action, hence this type of parsing method is called a table driven parsing algorithm. The configuration of the LL(1) parser is defined by the top of the stack and the lookahead token. The configurations are processed one by one, and the input is successfully parsed if the parser reaches the halting configuration. When only the end marker $ remains on the stack and the next token is $, this corresponds to a successful parse.
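The parsing actions listed above can be sketched as a small table driven loop. The Python below is an illustrative sketch only: the table is hard-coded for the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id, and the names TABLE, NON_TERMINALS, ParseError and ll1_parse are assumptions made for this example rather than part of any library.

```python
class ParseError(Exception):
    """Raised for an error entry in the table or a terminal mismatch."""


EPSILON = ()  # an empty right-hand side stands for epsilon

# M[A, a]: (non-terminal, lookahead) -> right-hand side to push.
# Hard-coded for the expression grammar (assumption of this sketch);
# every missing entry is an error entry.
TABLE = {
    ("E", "id"): ("T", "E'"),
    ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"),
    ("E'", ")"): EPSILON,
    ("E'", "$"): EPSILON,
    ("T", "id"): ("F", "T'"),
    ("T", "("): ("F", "T'"),
    ("T'", "+"): EPSILON,
    ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPSILON,
    ("T'", "$"): EPSILON,
    ("F", "id"): ("id",),
    ("F", "("): ("(", "E", ")"),
}
NON_TERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the list of productions applied, or raise ParseError."""
    buffer = list(tokens) + ["$"]
    stack = ["$", start]
    position = 0
    applied = []
    while True:
        top, lookahead = stack[-1], buffer[position]
        if top == "$" and lookahead == "$":
            return applied                    # halting configuration
        if top in NON_TERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise ParseError(f"no entry for M[{top}, {lookahead}]")
            stack.pop()
            stack.extend(reversed(rhs))       # push the R.H.S. right to left
            applied.append((top, rhs))
        elif top == lookahead:
            stack.pop()                       # pop the terminal and advance
            position += 1
        else:
            raise ParseError(f"expected {top!r}, found {lookahead!r}")


steps = ll1_parse(["id", "+", "id", "*", "id"])
for head, rhs in steps:
    print(head, "→", " ".join(rhs) if rhs else "ε")
```

Because the right-hand side is pushed in reverse, the leftmost symbol of the rule ends up on top of the stack, so the productions are applied in leftmost-derivation order.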
3.8 Non Recursive Predictive Parsing
The construction of a predictive LL(1) parser is based on two very important functions, FIRST and FOLLOW.
For the construction of a predictive LL(1) parser we have to follow these steps -
1. Compute the FIRST and FOLLOW functions.
2. Construct the predictive parsing table using the FIRST and FOLLOW functions.
3. Parse the input string with the help of the predictive parsing table.

FIRST function
FIRST(α) is the set of terminal symbols that appear as first symbols on the R.H.S. in a derivation of α. If α ⇒* ε then ε is also in FIRST(α).
Following are the rules used to compute the FIRST functions.
1. If a is a terminal symbol then FIRST(a) = {a}.
2. If there is a rule X → ε then ε is in FIRST(X).
3. For the rule A → X1 X2 X3 ... Xk, FIRST(A) = FIRST(X1) ∪ FIRST(X2) ∪ ... ∪ FIRST(Xj), where FIRST(Xj) is included only if every symbol before it can derive ε, i.e. Xi ⇒* ε for all 1 ≤ i ≤ j - 1, with j ≤ k.

FOLLOW function
FOLLOW(A) is the set of terminal symbols that can appear immediately to the right of A. The computation rules used in the examples below are:
1. For the start symbol S, place $ in FOLLOW(S).
2. If there is a production A → αBβ then everything in FIRST(β) except ε is in FOLLOW(B).
3. If there is a production A → αB, or A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).

Consider the grammar
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
The rule E → TE' begins with T, the rule T → FT' begins with F, and F → (E) | id. Replacing F by its R.H.S. gives FIRST(T) = FIRST(F), and likewise FIRST(E) = FIRST(T). Hence
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E') = {+, ε}
as E' → +TE' and E' → ε (by computation rule 2). The first terminal symbol appearing on the R.H.S. of the production rule for E' is added to the FIRST function.
FIRST(T') = {*, ε}
as T' → *FT' and T' → ε. The first terminal symbol appearing on the R.H.S. of the production rule for T' is added to the FIRST function.

Now we will compute the FOLLOW function.
FOLLOW(E) -
i) As there is a rule F → (E), the symbol ')' appears immediately to the right of E. Hence ')' will be in FOLLOW(E).
ii) The computation rule is A → αBβ. We can map this rule to F → (E); then A = F, α = (, B = E, β = ).
FOLLOW(B) = FIRST(β) = FIRST( ) ) = { ) }
FOLLOW(E) = { ) }
Since E is the start symbol, add $ to the FOLLOW of E.
Hence FOLLOW(E) = { ), $ }
FOLLOW(E') -
i) For E → TE' the computation rule is A → αBβ with A = E, α = T, B = E', β = ε; then by computation rule 3 everything in FOLLOW(A) is in FOLLOW(B), i.e.
everything in FOLLOW(E) is in FOLLOW(E').
FOLLOW(E') = { ), $ }
ii) For E' → +TE' the computation rule is A → αBβ with A = E', α = +T, B = E', β = ε; then by computation rule 3 everything in FOLLOW(A) is in FOLLOW(B), i.e. everything in FOLLOW(E') is in FOLLOW(E'), which adds nothing new.
FOLLOW(E') = { ), $ }
We can observe in the given grammar that ) really follows E.

FOLLOW(T) -
We have to observe two rules
E → TE'
E' → +TE'
i) Consider E → TE'. We map it to A → αBβ with A = E, α = ε, B = T, β = E'. By computation rule 2, FOLLOW(B) = {FIRST(β) - ε}. That is
FOLLOW(T) = {FIRST(E') - ε} = {{+, ε} - ε} = {+}
ii) Consider E' → +TE'. We map it to A → αBβ with A = E', α = +, B = T, β = E'. Since E' can derive ε, by computation rule 3 everything in FOLLOW(A) is in FOLLOW(B), i.e. everything in FOLLOW(E') is in FOLLOW(T). So FOLLOW(T) contains { ), $ }.
Finally FOLLOW(T) = {+} ∪ { ), $ } = {+, ), $}
We can observe in the given grammar that + and ) really follow T.

FOLLOW(T') -
T → FT'
We map this rule to A → αBβ with A = T, α = F, B = T', β = ε; then
FOLLOW(T') = FOLLOW(T) = {+, ), $}
T' → *FT'
We map this rule to A → αBβ with A = T', α = *F, B = T', β = ε; then everything in FOLLOW(T') is in FOLLOW(T'), which adds nothing new.
Hence FOLLOW(T') = {+, ), $}

FOLLOW(F) -
Consider T → FT' and T' → *FT'; then by computation rule 2:

T → FT'                              T' → *FT'
A → αBβ                              A → αBβ
A = T, α = ε, B = F, β = T'          A = T', α = *, B = F, β = T'
FOLLOW(B) = {FIRST(β) - ε}           FOLLOW(B) = {FIRST(β) - ε}
FOLLOW(F) = {FIRST(T') - ε}          FOLLOW(F) = {FIRST(T') - ε}
FOLLOW(F) = {*}                      FOLLOW(F) = {*}

Since T' can derive ε, computation rule 3 also applies to T → FT': everything in FOLLOW(T) is in FOLLOW(F).
Hence FOLLOW(F) contains {+, ), $}.
Finally FOLLOW(F) = {*} ∪ {+, ), $}
FOLLOW(F) = {+, *, ), $}

To summarize the above computation:

Symbols    FIRST        FOLLOW
E          {(, id}      { ), $ }
E'         {+, ε}       { ), $ }
T          {(, id}      {+, ), $}
T'         {*, ε}       {+, ), $}
F          {(, id}      {+, *, ), $}

Algorithm for predictive parsing table -
The construction of the predictive parsing table is an important activity in the predictive parsing method. This algorithm requires the FIRST and FOLLOW functions.
Input: The context free grammar G.
Output: Predictive parsing table M.
Algorithm: For the rule A → α of grammar G
1.
For each a in FIRST(α) create the entry M[A, a] = A → α, where a is a terminal symbol.
2. If ε is in FIRST(α), create the entry M[A, b] = A → α for each symbol b in FOLLOW(A).
3. If ε is in FIRST(α) and $ is in FOLLOW(A) then create the entry M[A, $] = A → α in the table.
4. All the remaining entries in the table M are marked as SYNTAX ERROR.

Now we will make use of the above algorithm to create the parsing table for the grammar -
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
First create an empty table with one row for each of E, E', T, T', F and one column for each of id, +, *, (, ), $. Now we will fill up the entries in the table using the algorithm given above, considering each rule one by one.

E → TE'
A → α with A = E, α = TE'
As T cannot derive ε, FIRST(TE') = FIRST(T) = {(, id}
M[E, (] = E → TE'
M[E, id] = E → TE'

E' → +TE'
A → α with A = E', α = +TE'
FIRST(+TE') = {+}
Hence M[E', +] = E' → +TE'

E' → ε
A → α with A = E', α = ε; then FOLLOW(E') = { ), $ }
M[E', )] = E' → ε
M[E', $] = E' → ε

T → FT'
A → α with A = T, α = FT'
FIRST(FT') = FIRST(F) = {(, id}
Hence M[T, (] = T → FT'
and M[T, id] = T → FT'

T' → *FT'
A → α with A = T', α = *FT'
FIRST(*FT') = {*}
Hence M[T', *] = T' → *FT'

T' → ε
A → α with A = T', α = ε
FOLLOW(T') = {+, ), $}
Hence M[T', +] = T' → ε
M[T', )] = T' → ε
M[T', $] = T' → ε

F → (E)
A → α with A = F, α = (E)
FIRST((E)) = {(}
Hence M[F, (] = F → (E)

F → id
A → α with A = F, α = id
FIRST(id) = {id}
Hence M[F, id] = F → id

The complete table is as shown below (blank entries are errors).

       id          +             *             (           )          $
E      E → TE'                                 E → TE'
E'                 E' → +TE'                               E' → ε     E' → ε
T      T → FT'                                 T → FT'
T'                 T' → ε        T' → *FT'                 T' → ε     T' → ε
F      F → id                                  F → (E)

Now the input string id + id * id $ can be parsed using the above table. In the initial configuration the stack contains the start symbol E and the input string is placed in the input buffer.

Stack    Input             Action
$E       id + id * id $

Now the symbol E is at the top of the stack and the input pointer is at the first id, hence M[E, id] is referred. This entry tells us E → TE', so we will push E' first and then T.
id, hence M[E,id] is Stack Input __Action | SeE’'T id + ids id S E> TE id + id * id $ E- FT id +id*idS Foid | +ideid S TECHNICAL ‘PUBLICATIONS™- An up thrust for kowledae Scanned with CamScanner Thus it is scanned from left to right and we always fol, eft most derivation while parsing the input string. Also at a time only one input symiy js referred to taking the parsing action. Hence the name of this parser is LL(1), th LL(1) Parser is a table driven predictive parser. The left recursion and ambiguoy grammar is not allowed for LL(1) parser. EEDELE show that folowing grammar : observed that the input is '$ — AaAb| BbBa Antje Boe is LL (D. Solution : Consider the grammar : S— AaAb $— BbBa Ave Boe Now we will compute FIRST and FOLLOW functions. ' FIRST(S) = (a, b) if we put $— AaAb S—aAb When Ae Also S > BbBa $—bBa When Be FIRST(A) = FIRST(B) = (e) Scanned with ee a Lyrisen e inng ee Design comple poLLows) (sh POLLOW(A) = FOLLOW(B) = {a, b} athe LLC) parsing table is a b $ S— AaAb S$ BbBa Ave Now consider the string "ba". For parsing - Stack Input Action $s bas S— BbBa “sabbB ba$ Boe ba a » Sab : = aS Bort Po ae ee 3s | = 5 ng $ Accept This shows that the given grammar is LL(1) For the following grammar find FIRST and FOLLOW sets for each of non terminal. S—aAB|DA |e AsaAble BoB |e Solution: FIRST(S) = The first terminal symbol appearing on R.H.S. FIRSTS) = {a,b,e} FIRST(A) = The first terminal symbol appearing on RES. FIRST(A) = fa, e} FIRST(B) = The first terminal symbol appearing on RH'S. of production tule for B. TerHninAt BURL CATIONS” An up thrust for knowndg® Scanned with CamScanner 8 PBST 6 se} ey Now, we will eomprte FOLLOW finetion follows: VOLLOW EO) © {8} se Sv aeatart xymbol, POLLOW (AD {»} Iecause Consider the rule, A a BA whieh can be mapped with A 9 a AB Then the FIRST()) = FIRST(B) = {be} Lid «a Bp Then without © remains b, Henee b ¢ FOLLOW(A). Now consider the rule 4 then “everything in oy FOLLEW(A) = FOLLOW(B). 
The rule S → bA can be mapped with the general form A → αB :
     S → b A
     A → α B
∴ everything in FOLLOW(S) is in FOLLOW(A), so $ is added to FOLLOW(A).
To summarize - FOLLOW(A) = {b, $}
Now consider the rule S → aAB. If we map it with A → αB, everything in FOLLOW(S) is in FOLLOW(B).
FOLLOW(B) = {$}.
Then, according to these rules,

FIRST(S) = {a, b, ε}      FOLLOW(S) = {$}
FIRST(A) = {a, ε}         FOLLOW(A) = {b, $}
FIRST(B) = {b, ε}         FOLLOW(B) = {$}

Solution : Let the given grammar be,
S → iCtSA | a
A → eS | ε
C → b
where i, t, e, a and b are terminals.
Now we will compute FIRST and FOLLOW for the given nonterminals.

FIRST(S) = {i, a}         FOLLOW(S) = {e, $}
FIRST(A) = {e, ε}         FOLLOW(A) = {e, $}
FIRST(C) = {b}            FOLLOW(C) = {t}

The predictive parsing table will be

      a          b          e          t          i              $
S     S → a                                       S → iCtSA
A                           A → eS                               A → ε
                            A → ε
C                C → b

As we have got multiple entries in M[A, e], the given grammar is not an LL(1) grammar. The symbols i, t, e, a, b are terminal symbols because they do not appear on the left side of any production rule.

Example : Consider the grammar
textp → atom | list
atom → number | identifier
list → (textp-seg)
textp-seg → textp , textp-seg | textp
i) Left factor this grammar.
ii) Construct FIRST and FOLLOW sets for the nonterminals.
iii) Show that the resulting grammar is LL(1).
iv) Construct the LL(1) parsing table for the resulting grammar.

Solution : For the given grammar first list out the set of terminals and nonterminals.
Set of terminals is = {number, identifier, (, ), ','}
Set of nonterminals is = {textp, atom, list, textp-seg}
i) Consider the rule below and map it against A → αβ1 | αβ2 :
textp-seg → textp , textp-seg | textp
with A = textp-seg, α = textp, β1 = , textp-seg and β2 = ε.
The grammar after left factorization will be -
A → αA'
A' → β1 | β2
that is,
textp-seg → textp textp-seg'
textp-seg' → , textp-seg | ε

ii) After left factoring, all the production rules can be enlisted as below -
textp → atom | list
atom → number | identifier
list → (textp-seg)
textp-seg → textp textp-seg'
textp-seg' → , textp-seg | ε

Now for each non-terminal symbol the FIRST and FOLLOW can be computed.
FIRST(textp) = First terminal symbol appearing on R.H.S. for the rules of textp :
textp → atom → number | identifier    ∴ number and identifier are added in FIRST(textp)
textp → list → (textp-seg)            ∴ ( will be in FIRST(textp)
∴ FIRST(textp) = { (, number, identifier }
FIRST(atom) = First terminal symbol appearing at R.H.S. of the rule atom → number | identifier
FIRST(atom) = {number, identifier}
FIRST(list) = { ( }
FIRST(textp-seg) = The first symbol appearing at R.H.S. of the rule for textp-seg :
textp-seg → textp textp-seg', where textp → atom | list → number | identifier | (textp-seg)
FIRST(textp-seg) = {number, identifier, ( }
FIRST(textp-seg') = { ',', ε }

Now we will compute FOLLOW.
FOLLOW(textp) =
i) As textp is the start symbol add $ to it.
ii) In textp-seg → textp textp-seg', textp can be followed by , textp-seg, so ',' is added in FOLLOW.
iii) As textp-seg' can derive ε, everything in FOLLOW(textp-seg), that is ), is also in FOLLOW(textp).
FOLLOW(textp) = { $, ',', ) }
FOLLOW(atom) = FOLLOW(textp) because textp → atom.
FOLLOW(atom) = { $, ',', ) }
FOLLOW(list) = FOLLOW(textp) because textp → list.
FOLLOW(list) = { $, ',', ) }
FOLLOW(textp-seg) =
i) list → (textp-seg), i.e. ) follows textp-seg, ∴ ) is added in FOLLOW.
FOLLOW(textp-seg) = { ) }
FOLLOW(textp-seg') =
i) If there exists a rule A → αB then everything in FOLLOW(A) is in FOLLOW(B). The rule textp-seg → textp textp-seg' maps to it with A = textp-seg, α = textp and B = textp-seg'.
FOLLOW(textp-seg') = FOLLOW(textp-seg) = { ) }

To summarize,
FIRST(textp) = { (, number, identifier }
FIRST(atom) = {number, identifier}
FIRST(list) = { ( }
FIRST(textp-seg) = { (, number, identifier }
FIRST(textp-seg') = { ',', ε }
FOLLOW(textp-seg) = { ) }
FOLLOW(textp-seg') = { ) }

iv) The predictive parsing table can be constructed as follows, over the input symbols number, identifier, ',', (, ) and $ -
Nonterminal    Input symbol             Entry
textp          number, identifier       textp → atom
textp          (                        textp → list
atom           number                   atom → number
atom           identifier               atom → identifier
list           (                        list → (textp-seg)
textp-seg      number, identifier, (    textp-seg → textp textp-seg'
textp-seg'     ,                        textp-seg' → , textp-seg
textp-seg'     )                        textp-seg' → ε

iii) As each cell in the above table contains a unique entry, the given grammar is LL(1).

Example 3.8.6 : Construct the predictive parser for the following grammar.
S → (L) | a
L → L, S | S

Solution : The given grammar is left recursive because of L → L, S | S. We will first eliminate the left recursion. As A → Aα | β can be converted to
A → βA'
A' → αA' | ε
we can write L → L, S | S as
L → SL'
L' → , SL' | ε
Now the grammar taken for predictive parsing is -
S → (L) | a
L → SL'
L' → , SL' | ε
Now we will compute FIRST and FOLLOW of the non-terminals.
FIRST(S) = { (, a }
FIRST(L) = { (, a }
FIRST(L') = { ',', ε }
FOLLOW(S) = { ',', ), $ }
FOLLOW(L) = FOLLOW(L') = { ) }
The predictive parsing table can be constructed as

      a            (            )            ,              $
S     S → a        S → (L)
L     L → SL'      L → SL'
L'                              L' → ε       L' → , SL'

Example : Construct the behaviour of the parser on the sentence (a, a) using the grammar specified above.

Solution : As we have constructed a predictive parsing table in Example 3.8.6, we will parse the string (a, a) using that table as shown below.

Stack       Input        Action
$S          (a, a)$      S → (L)
$)L(        (a, a)$      match (
$)L         a, a)$       L → SL'
$)L'S       a, a)$       S → a
$)L'a       a, a)$       match a
$)L'        , a)$        L' → , SL'
$)L'S,      , a)$        match ,
$)L'S       a)$          S → a
$)L'a       a)$          match a
$)L'        )$           L' → ε
$)          )$           match )
$           $            Accept

Example 3.8.8 : Construct predictive parsing table for the grammar
E → E+T | T, T → TF | F, F → F* | a | b.

Solution : Let
E → E+T | T
T → TF | F
F → F* | a | b
be the given grammar.
Step 1 : Eliminate left recursion - The formula to eliminate left recursion is
A → Aα | β   ⇒   A → βA', A' → αA' | ε
1) E → E+T | T       ⇒   E → TE',  E' → +TE' | ε
2) T → TF | F        ⇒   T → FT',  T' → FT' | ε
3) F → F* | a | b    ⇒   F → aF' | bF',  F' → *F' | ε
To summarize,
E  → TE'
E' → +TE' | ε
T  → FT'
T' → FT' | ε
F  → aF' | bF'
F' → *F' | ε
Now we will compute FIRST and FOLLOW.
As F → aF' | bF', taking the first symbol on the r.h.s.,
FIRST(E) = FIRST(T) = FIRST(F)
FIRST(E) = {a, b}
FIRST(T) = {a, b}
FIRST(F) = {a, b}
For E' → +TE' | ε, taking the first symbol on the r.h.s.,
FIRST(E') = {+, ε}
FIRST(T') = {a, b, ε}, because T' → FT' | ε becomes T' → aF'T' | bF'T' | ε if we replace F by aF' and bF', and we take the first symbol on the r.h.s.
FIRST(F') = {*, ε}

Now we will compute FOLLOW.
FOLLOW(E)
FOLLOW(E) = {$}, as E is the start symbol.
FOLLOW(E')
E → TE' is a production which can be mapped against A → αB. According to this rule everything in FOLLOW(A) is in FOLLOW(B). Hence
FOLLOW(E') = {$}
FOLLOW(T)
In E → TE' and E' → +TE', T is followed by E', and FIRST(E') without ε is {+}, i.e. + follows T. As E' can derive ε in E → TE', everything in FOLLOW(E) is in FOLLOW(T), so $ is added in FOLLOW(T).
FOLLOW(T) = {+, $}
FOLLOW(T')
The rule T → FT' resembles A → αB. Then everything in FOLLOW(T) is in FOLLOW(T').
FOLLOW(T') = {+, $}
FOLLOW(F)
In T → FT' and T' → FT', F is followed by T', and FIRST(T') without ε is {a, b}, so a and b come after F. The rule T → FT' can also be mapped against A → αBβ in which β = T' derives ε. According to this rule everything in FOLLOW(T) is in FOLLOW(F). Hence
FOLLOW(F) = {a, b, +, $}
FOLLOW(F')
F → aF' is one rule which can be mapped with A → αB. According to this rule
FOLLOW(F') = FOLLOW(F) = {a, b, +, $}

The predictive parsing table can be constructed as follows -
FIRST(E) = {a, b}. Hence we will add the rule E → TE' in M[E, a] and M[E, b].
FIRST(T) = {a, b} :  M[T, a] = T → FT' and M[T, b] = T → FT'
M[F, a] = F → aF' and M[F, b] = F → bF'
FIRST(E') = {+, ε} :  M[E', +] = E' → +TE'
ε is in FIRST(E') and FOLLOW(E') = {$}, then M[E', $] = E' → ε
FIRST(T') = {a, b, ε} :  M[T', a] = M[T', b] = T' → FT'
ε is in FIRST(T') and FOLLOW(T') = {+, $}, then M[T', +] = M[T', $] = T' → ε
FIRST(F') = {*, ε} :  M[F', *] = F' → *F'
ε is in FIRST(F') and FOLLOW(F') = {a, b, +, $}, then
M[F', a] = M[F', b] = M[F', +] = M[F', $] = F' → ε
Hence the predictive parsing table will be

      a            b            +            *            $
E     E → TE'      E → TE'
E'                              E' → +TE'                 E' → ε
T     T → FT'      T → FT'
T'    T' → FT'     T' → FT'     T' → ε                    T' → ε
F     F → aF'      F → bF'
F'    F' → ε       F' → ε       F' → ε       F' → *F'     F' → ε

Example 3.8.9 : Compute FIRST and FOLLOW sets for all nonterminals in the following grammar
S → Aa | bAc | Bc | bBa
A → d
B → d

Solution :
FIRST(S) = First terminal symbol appearing on RHS. = {b, d}
FIRST(A) = First terminal symbol appearing on RHS. = {d}
FIRST(B) = First terminal symbol appearing on RHS.
= {d}
FOLLOW(S) = {$}, as S is the start symbol.
Consider the rule S → Aa, mapped with A → αBβ (here α = ε, B = A and β = a).
Hence FOLLOW(A) = FIRST(β) = FIRST(a) = {a}
Similarly S → bAc, which can be mapped with A → αBβ.
Then FOLLOW(A) = FIRST(β) = FIRST(c) = {c}
∴ FOLLOW(A) = {a, c}
Consider the rule S → bBa, mapped with A → αBβ.
FOLLOW(B) = FIRST(β) = FIRST(a) = {a}
S → Bc, mapped with A → αBβ.
FOLLOW(B) = FIRST(β) = FIRST(c) = {c}
∴ FOLLOW(B) = {a, c}
To summarize,

FIRST(S) = {b, d}      FOLLOW(S) = {$}
FIRST(A) = {d}         FOLLOW(A) = {a, c}
FIRST(B) = {d}         FOLLOW(B) = {a, c}

3.9 Strategies to Recover from Syntactic Errors
These strategies are given in detail as below.
i) Panic mode
* This strategy is used by most parsing methods.
* This is simple to implement.
* In this method, on discovering an error, the parser discards input symbols one at a time. This process is continued until one of a designated set of synchronizing tokens is found. Synchronizing tokens are delimiters such as semicolon or end. These tokens indicate the end of an input statement.
* Thus in panic mode recovery a considerable amount of input is skipped without checking it for additional errors.
* This method is guaranteed not to go into an infinite loop.
* If there are few errors in the same statement then this strategy is the best choice.
ii) Phrase level recovery
* In this method, on discovering an error the parser performs local correction on the remaining input.
* It can replace a prefix of the remaining input by some string. This actually helps the parser to continue its job.
* The local correction can be replacing a comma by a semicolon, deleting extra semicolons or inserting a missing semicolon. The type of local correction is decided by the compiler designer.
* While doing the replacement, care should be taken not to go into an infinite loop.
* This method is used in many error-repairing compilers.
* The drawback of this method is that it finds it difficult to handle situations where the actual error has occurred before the point of detection.
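The panic mode strategy described in i) can be illustrated with a small sketch. The code below is only an illustration written in Python under simple assumptions: the synchronizing set is taken to be { ;, end }, and the toy checker check_statements (a name invented here) accepts only statements of the form id = num ;.

```python
SYNC_TOKENS = {";", "end"}        # assumed set of synchronizing tokens


def skip_to_sync(tokens, pos):
    """Discard input symbols one at a time until a synchronizing token."""
    while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
        pos += 1
    return pos


def check_statements(tokens):
    """Toy checker: every statement must look like   id = num ;"""
    expected = ["id", "=", "num", ";"]
    errors = []
    pos = 0
    while pos < len(tokens):
        start = pos
        for want in expected:
            if pos < len(tokens) and tokens[pos] == want:
                pos += 1
            else:
                errors.append(f"statement at token {start}: expected {want!r}")
                # Panic mode: skip ahead, then step past the synchronizing
                # token so that the position always moves forward.
                pos = skip_to_sync(tokens, pos) + 1
                break
    return errors


source = "id = num ; id num num = ; id = num ;".split()
for message in check_statements(source):
    print(message)
```

Only one error is reported for the second statement. The tokens between the point of detection and the next semicolon are skipped without being checked, which is the price of panic mode, and because the position advances on every error the loop cannot run forever.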
iii) Error production
* If we have knowledge of the common errors that can be encountered then we can incorporate these errors by augmenting the grammar of the corresponding language with error productions that generate the erroneous constructs.
* If an error production is used then, during parsing, we can generate an appropriate error message and parsing can be continued.
* This method is extremely difficult to maintain, because if we change the grammar then it becomes necessary to change the corresponding error productions.
iv) Global correction
* We often want a compiler that makes very few changes while processing an incorrect input string.
* We expect the least number of insertions, deletions and changes of tokens to recover from erroneous input.
* Such methods increase the time and space requirements at parsing time.
* Global correction is thus simply a theoretical concept.

3.10 Error Recovery in Predictive Parsing
* An error is detected during predictive parsing when the terminal on top of the stack does not match the next input symbol, or when nonterminal A is on top of the stack, a is the next input symbol, and the parsing table entry M[A, a] is empty.
* Panic-mode error recovery is based on the idea of skipping symbols on the input until a token in a selected set of synchronizing tokens appears.
Following are some ways by which the synchronizing set can be chosen -
* Place all symbols in FOLLOW(A) into the synchronizing set for nonterminal A. If we skip tokens until an element of FOLLOW(A) is seen and pop A from the stack, it is likely that parsing can continue.
* We might add keywords that begin statements to the synchronizing sets for the nonterminals generating expressions.
* We can add symbols in FIRST(A) to the synchronizing set for nonterminal A. Due to this it may be possible to resume parsing according to A if a symbol in FIRST(A) appears in the input.
* If a terminal on top of the stack cannot be matched, a simple idea is to pop the terminal and issue a message saying that the terminal was inserted.
* If a nonterminal can generate the empty string, then the production deriving ε can be used as a default. This may postpone some error detection, but cannot cause an error to be missed. This approach reduces the number of nonterminals that have to be considered during error recovery.

Consider the following parsing table.

Non          id          +            *            (           )          $
terminals
E            E → TE'                               E → TE'     synch      synch
E'                       E' → +TE'                             E' → ε     E' → ε
T            T → FT'     synch                     T → FT'     synch      synch
T'                       T' → ε       T' → *FT'                T' → ε     T' → ε
F            F → id      synch        synch        F → (E)     synch      synch

Synch indicates the synchronizing tokens obtained from the FOLLOW set of the non terminal. The FOLLOW sets for the given grammar are
FOLLOW(E) = FOLLOW(E') = { ), $ }
FOLLOW(T) = FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
* If the parser looks up entry M[A, a] and finds it blank then the input symbol a is skipped.
* If the entry is synch then the non terminal on the top of the stack is popped.
* If the token on the top of the stack does not match the input symbol then we pop the token from the stack.

Stack        Input             Comments
$E           + id * * id $     Error, skip +
$E           id * * id $
$E'T         id * * id $
$E'T'F       id * * id $
$E'T'id      id * * id $
$E'T'        * * id $
$E'T'F*      * * id $
$E'T'F       * id $            Error. M[F, *] = synch, ∴ pop F
$E'T'        * id $
$E'T'F*      * id $
$E'T'F       id $
$E'T'id      id $
$E'T'        $
$E'          $
$            $

Q.1 Explain the reasons for separating lexical analysis phase from syntax analysis.
Ans. : Refer section 3.1.2.
Q.2 What is ambiguous grammar ? Eliminate ambiguities for the grammar :
E → E+E | E*E | (E) | id.
Ans. : Ambiguous grammar : Refer section ...
Elimination of ambiguity : Refer section 3.5.1(4).
Q.3 Consider the following grammar,
S → 0A | 1B | 0 | 1
A → 0S | 1B | 1
B → 0A | 1S
Construct leftmost derivations and parse tree for the following sentences
i) 0101 ii) 1100101
Ans. : Refer example 3.3.4.
Q.4 Construct predictive parsing table for the following grammar,
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id.
Ans. : Refer example 3.8.1.
Q.5 What is recursive descent parser ? Construct a recursive descent parser for the following grammar,
E → E+T | T
T → TF | F
F → F* | a | b.
Ans. : Recursive descent parser : Refer section 3.6.
The given grammar is
E → E+T | T
T → TF | F
F → F* | a | b
which is a left recursive grammar. We will eliminate this left recursion before constructing the recursive descent parser. The rule to eliminate left recursion is
if A → Aα | β then A → βA', A' → αA' | ε
Consider
1. E → E+T | T       ⇒   E → TE',  E' → +TE' | ε
2. T → TF | F        ⇒   T → FT',  T' → FT' | ε
3. F → F* | a | b    ⇒   F → aF' | bF',  F' → *F' | ε
To summarize,
E  → TE'
E' → +TE' | ε
T  → FT'
T' → FT' | ε
F  → aF' | bF'
F' → *F' | ε
Now the recursive descent parser is

Procedure E( )
{
    T( );
    Edash( );
}
Procedure Edash( )
{
    if (lookahead = '+')
    {
        match('+');
        T( );
        Edash( );
    }
    else null;
}
Procedure T( )
{
    F( );
    Tdash( );
}
Procedure Tdash( )
{
    /* T' → FT' is chosen only when the lookahead is in FIRST(F) = {a, b} */
    if (lookahead = 'a' or lookahead = 'b')
    {
        F( );
        Tdash( );
    }
    else null;
}
Procedure F( )
{
    if (lookahead = 'a') { match('a'); Fdash( ); }
    else if (lookahead = 'b') { match('b'); Fdash( ); }
    else error( );
}
Procedure Fdash( )
{
    if (lookahead = '*') { match('*'); Fdash( ); }
    else null;
}

Q.6 Consider the following grammar,
E → T+E | T
T → V*T | V
V → id
Write down the procedures for the non-terminals of the grammar to make a recursive descent parser.
Ans. : Refer example 3.6.1.
Q.7 Construct the predictive parsing table for the following grammar.
E → E+T | T, T → TF | F, F → F* | a | b.
Ans. : Refer example 3.8.8.
Q.8 What is an LL(1) grammar ? Can you convert every context free grammar into LL(1) ?
Ans. : An LL(1) grammar is a grammar for which the input can be parsed by scanning it from left to right, following the leftmost derivation of the input string and using only one input symbol (lookahead) to predict the parsing action. There are no multiple entries in any cell of the parsing table designed for parsing the input using
Pasing table designed for parsing the input usin The context free grammar | recursive and if it j | if it is unambi iguous gramm; ar, | can be converted to LLG) grammar if CEG is not let 1 For example: Refer example 388, | Qg What imit a, at are the limitations of recursive qj lescer a ut parsers ? | Refer section 3.6, EEE Se Seamnes eee pier D080" 3-55 Raia Syntax Analysis (q10 Write a recursive descent parser bexpr > bexpr or bterm| bterm | bterm -» bterm and bfactor| bfactor bfactor ~ notbfactor| (bexpr)| true| false Where or, and, not, (,), true, ‘for the grammar, Ans. 3 terminal symbols. bexpr = E blerm = T blactpr = F = or = and Now the grammar becomes EST |T TOT*R|E 1 F|(E) | true |false But the grammar is left recursive. We will elimi: following rule. inate the left recursion using if A>Aal|p then ABA‘ A’ aA 1. Fee ESTE Aha b ES+TE |e 2 ToTHE LF Torr Aiea Tor |e f 3. F1F|(E) |true|false does not contain left recursion. To summarize ESTE’ FE’ 4TE’ |e Torr TF’ e FO!FI®) [true [false TECHNICAL PUBLICATIONS”- An up theust for knowied? Scanned with CamScanner ompiler Design i g a gare ~ ‘The procedure for recursive descent parsing Procedure E () TOs Edash ( ); Procedure Edash () ‘ if (lookhead = '+') match ('+'); 1) Edash ( ); else null; } Procedure T( ) F(); Tdash ( ); i Procedure Tdash ( ) if (lookahead =’) match ( F() Tdash ( ); } alse null; } Procedure F ( ) { if (lookahead =") { ) match ('!); F(); ©lse if lookahead = '¢) { match ((' ); } else if (true) __ Few tue 3-56 ty Hy My MN Gookabead x ¥y y match ("9), Scanned with CamScanner compiler Desig? 3-57 ‘Syntax Analysis else return false; } it Eliminate ambiguity if any form the " pexpr — bexpr or bterm | bterm bterm — bterm and bfactor| bfactor bfactor ~> not bfactor| (bexpr) | true false Where or, and, not, (,), true, grammar for boolean expressions, false are terminals in the grammar. ‘Ans. 
: For the given grammar, assume
bexpr = E
bterm = T
bfactor = F
or = +
and = *
not = !
The grammar then becomes -
E → E+T | T
T → T*F | F
F → !F | (E) | true | false
This grammar is an unambiguous grammar. It is obtained from the grammar -
E → E+E
E → E*E
E → !E
E → true
E → false
But the unambiguous grammar is left recursive. The left recursion can be eliminated. For elimination of left recursion refer to the answer of Q.10.
Q.12 Consider the grammar given below
E → E+E | E-E | E*E | E/E | a | b
Obtain leftmost and rightmost derivation for the string a + b * a + b.
Ans. : Refer example 3.3.2.
Q.13 What are the ...
Ans. : Refer section 3.5.1.
Q.14 Consider the following grammar
S → (L) | a
L → L, S | S
Construct leftmost derivations and parse trees for the following sentences.
i) (a, (a, a)) ii) (a, ((a, a), (a, a)))
Ans. : Refer example 3.3.3.
Q.15 Explain backtracking with example.
Ans. : Refer section 3.5.1(1).
Q.16 Compute FIRST and FOLLOW sets for all nonterminals in the following grammar.
S → Aa | bAc | Bc | bBa
A → d
B → d
Ans. : Refer example 3.8.9.
Q.17 Write a CFG for the 'while' statement in 'C' language.
Ans. : The CFG is G = {V, T, P, S} where V is a set of nonterminals, T is a set of terminal symbols, P is a set of production rules and S is a start symbol. The set of production rules is
P = { S → while (condition) stmt
      condition → id relop id
      relop → < | > | <= | >= | ...
      S → while (condition) { L }
      L → stmt L | stmt }
V = {S, condition, relop, L}
T = {while, (, ), <, >, <=, >=, ...

Simple LR Parsers

Syllabus
Simple LR - Why LR parsers - Model of an LR parsers - Operator precedence - Shift reduce parsing - Difference between LR and LL parsers, Construction of SLR tables.

Contents
4.1 Bottom-up Parsing
4.2 Why LR Parsers ?
4.3 Model of an LR Parser
4.4 Operator Precedence . . . Aug/Sept.-08, Set-1,4; May-08, Set-1; May-06, 04, Set-4,1; May-07, Set-3,4; Dec.-05, Set-1,2,4; April-03, Set-3 . . . Marks 8, Marks 10
4.5 Shift Reduce Parsing
4.6
Difference between LR and LL Parsers
4.7 Construction of SLR Parsing Table . . . May-06, 04, Set-2,3,4; Aug/Sept.-07, 06, Set-4,1,3; May-05, Set-3; May-05, 04, Set-3,1 . . . Marks 12

4.1 Bottom-up Parsing
In this section, we will discuss "How an input string gets parsed efficiently ?" In most compilers, this task is done by programs called parsers. The task of a parser is to check the input string completely for its syntax and to give error messages on syntactically incorrect input strings. In the bottom-up parsing method, the input string is taken first and we try to reduce this string with the help of the grammar and try to obtain the start symbol. The process of parsing halts successfully as soon as we reach the start symbol.
The parse tree is constructed from bottom to up, that is from leaves to root. In this process, the input symbols are placed at the leaf nodes after successful parsing. The bottom-up parse tree is created starting from the leaves; the leaf nodes together are reduced further to internal nodes, these internal nodes are further reduced and eventually a root node is obtained. The internal nodes are created from the list of terminal and non-terminal symbols. This involves -
Non-terminal for internal node = Non-terminal ∪ terminal
In this process, basically the parser tries to identify the R.H.S. of a production rule and replace it by the corresponding L.H.S. This activity is called reduction. Thus the crucial but prime task in bottom-up parsing is to find the productions that can be used for reduction. The bottom-up parse tree construction process indicates that the tracing of derivations is to be done in reverse order.
For example
Consider the grammar for declarative statement,
S → TL;
T → int | float
L → L, id | id
The input string is float id, id;
Parse Tree
Step 1 : We will start from the leaf node float.
Step 2 : float is reduced to T.
Then read the next string from the input and reduce id to L, using L → id.
The remaining steps of the figure continue in the same way : the next string is read from the input each time, and L, id gets reduced to L.
Fig. 4.1.1 Bottom-up parsing
Step 10 : The sentential forms produced while constructing this parse tree are
float id, id;
T id, id;
T L, id;
T L;
S
Step 11 : Thus looking at the sentential forms we can say that the rightmost derivation in reverse order is performed.
Thus the basic steps in bottom-up parsing are
1) Reduction of the input string to the start symbol.
2) The sentential forms that are produced in the reduction process should trace out the rightmost derivation in reverse.

Handle Pruning
As said earlier, the crucial task in bottom-up parsing is to find the substring that can be reduced. Such a substring is called a handle.
In other words, a handle is a substring that matches the right side of a production and that we can reduce to the non-terminal on the left hand side of that production. Such a reduction represents one step along the reverse of the rightmost derivation. Formally we can define a handle as follows :
"Handle of a right sentential form γ is a production A → β and a position of γ where the string β may be found and replaced by A to produce the previous right sentential form in the rightmost derivation of γ".
For example
Consider the grammar
E → E+E
E → id
Now consider the string id + id + id; the rightmost derivation is
E ⇒ E + E
  ⇒ E + E + E
  ⇒ E + E + id
  ⇒ E + id + id
  ⇒ id + id + id
The handles used while reducing the string back to E are shown below.

Right sentential form    Handle    Production
id + id + id             id        E → id
E + id + id              id        E → id
E + E + id               id        E → id
E + E + E                E + E     E → E + E
E + E                    E + E     E → E + E
E

Thus a bottom-up parser is essentially a process of detecting handles and using them in reduction. This process is called handle pruning.

4.2 Why LR Parsers ?
This is the most efficient method of bottom-up parsing which can be used to parse a large class of context free grammars. This method is also called LR(k) parsing. Here
* L stands for left to right scanning.
* R stands for rightmost derivation in reverse.
* k is the number of lookahead input symbols.
When k is omitted, k is assumed to be 1.
Properties of LR parser -
The LR parser is widely used for the following reasons.
1. LR parsers can be constructed to recognize most of the programming languages for which a context free grammar can be written.
2. The class of grammars that can be parsed by an LR parser is a superset of the class of grammars that can be parsed using predictive parsers.
3. The LR parser works using a non backtracking shift reduce technique, yet it is an efficient one.
4. LR parsers detect syntactical errors very efficiently.

4.3 Model of an LR Parser
The structure of the LR parser is as given in Fig. 4.3.1.
Fig. 4.3.1 Structure of LR parser (input buffer, stack, driving program, parsing table with action and goto parts, output)
It consists of an input buffer for storing the input string, a stack for storing the grammar symbols, the output and a parsing table comprised of two parts, namely action and goto. There is one parsing program, which is actually a driving program, and it reads the input symbols one at a time from the input buffer.
The driving program works on the following lines.
1. It initializes the stack with the start state and invokes the scanner (lexical analyzer) to get the next token.
2. It determines s, the state currently on the top of the stack, and a, the current input symbol.
3. It consults the parsing table for action[s, a], which can have one of four values.
i) s_i means shift state i.
ii) r_j means reduce by rule j.
iii) Accept means successful parsing is done.
iv) Error indicates a syntactical error.

Types of LR parser
The following diagram represents the types of LR parser.
Fig. 4.3.2 Techniques of LR parsers (LR parser : SLR parser, LALR parser, canonical LR parser)
SLR means simple LR parser, LALR means lookahead LR parser, and the canonical LR parser is simply called the "LR" parser - these are the three members of the LR family. The overall structure of all these LR parsers is the same. All are table driven parsers.
The relative power of these parsers is SLR(1) ≤ LALR(1) ≤ LR(1). That means canonical LR parses a larger class of grammars than LALR, and LALR parses a larger class than the SLR parser.

4.4 Operator Precedence
A grammar G is said to be an operator precedence grammar if it possesses the following properties -
1. No production has ε on its right side.
2. There should not be any production rule possessing two adjacent non-terminals at the right hand side.
Consider the grammar for arithmetic expressions.
E → EAE | (E) | -E | id
A → + | - | * | / | ^
This grammar is not an operator precedence grammar, as the production rule
E → EAE
contains consecutive non-terminals. Hence first we will convert it into an equivalent operator precedence grammar by removing A.
E → E+E | E-E | E*E | E/E | E^E
E → (E) | -E | id
In operator precedence parsing we will first define the precedence relations <·, ≐ and ·> between pairs of terminals. The meaning of these relations is
p <· q    p yields precedence to q (q has more precedence than p).
p ≐ q     p has the same precedence as q.
p ·> q    p takes precedence over q.
These meanings appear similar to the less than, equal to and greater than operators.
Now by considering the precedence relation between the arithmetic operators we will construct the operator precedence table. The symbols whose precedences we have considered are id, +, * and $.

         id      +       *       $
id               ·>      ·>      ·>
+        <·      ·>      <·      ·>
*        <·      ·>      ·>      ·>
$        <·      <·      <·

Fig. 4.4.1 Precedence relation table
Now consider the string
id + id * id
We will insert $ symbols at the start and end of the input string. We will also insert the precedence operators by referring to the precedence relation table.
$ <· id ·> + <· id ·> * <· id ·> $
We will follow the following steps to parse the given string -
i) Scan the input from left to right until the first ·> is encountered.
ii) Scan backwards over ≐ until <· is encountered.
iii) The handle is the string between <· and ·>.
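The three scanning steps above can be tried out with a few lines of code. What follows is a minimal sketch in Python, assuming the usual precedence relations for id, +, * and $; the dictionary PREC and the function find_handles are names made up for this illustration. Only terminals are kept on the stack, so each reported handle is the terminal part of the string that gets reduced.

```python
# Assumed precedence relations: PREC[a][b] is the relation between a
# terminal a on the stack and the next input terminal b.
# "<" stands for <. (yields precedence), ">" for .> (takes precedence).
PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}


def find_handles(tokens):
    """Return the handles in the order in which they are reduced."""
    stack = ["$"]
    tokens = tokens + ["$"]
    pos = 0
    handles = []
    while True:
        a, b = stack[-1], tokens[pos]
        if a == "$" and b == "$":
            return handles                    # whole input reduced
        relation = PREC[a].get(b)
        if relation in ("<", "="):
            stack.append(b)                   # shift b onto the stack
            pos += 1
        elif relation == ">":
            # Pop back to the nearest <. ; the popped part is the handle.
            handle = []
            while True:
                popped = stack.pop()
                handle.insert(0, popped)
                if PREC[stack[-1]].get(popped) == "<":
                    break
            handles.append(" ".join(handle))
        else:
            raise SyntaxError(f"no precedence relation between {a!r} and {b!r}")


print(find_handles("id + id * id".split()))
```

For id + id * id the three id handles are reported first, then * and finally +, which agrees with the rule that the operator enclosed between <· and ·> is reduced first. Because the sketch tracks terminals only, it does not check where the nonterminals stand; it illustrates handle detection, it is not a complete parser.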
The parsing can be done as follows.

$ <· id ·> + <· id ·> * <· id ·> $     Handle id is obtained between <· and ·>. Reduce this by E → id.
E + <· id ·> * <· id ·> $              Handle id is obtained between <· and ·>. Reduce this by E → id.
E + E * <· id ·> $                     Handle id is obtained between <· and ·>. Reduce this by E → id.
E + E * E                              Remove all the non-terminals.
+ *                                    Insert $ at the beginning and at the end. Also insert the precedence operators.
$ <· + <· * ·> $                       The * operator is surrounded by <· and ·>. This indicates that * becomes the handle. Hence we have to reduce the E * E operation first.
$ <· + ·> $                            Now + becomes the handle. Hence we evaluate E + E.
$ $                                    Parsing is done.

Advantage of Operator Precedence Parsing
1. This type of parsing is simple to implement.
Disadvantages of Operator Precedence Parsing
1. An operator like minus has two different precedences (unary and binary). Hence it is hard to handle tokens like the minus sign.
2. This kind of parsing is applicable to only a small class of grammars.

Difference between Operator Precedence Parser and Simple Precedence Parser

Sr. No.   Operator Precedence Parser                          Simple Precedence Parser
1.        The operators <·, ·>, ≐ are involved to show        The operators are not used to show the
          the relationship among the symbols.                 precedence.
2.        This is not an efficient parser.
3.        The precedence is defined only for the              The precedence is defined for both terminal
          terminal symbols.                                   and non terminal symbols.
4.        This type of parser is simple to implement.         This type of parser is complex to implement.
5.        This type of parser is applicable to a small        This type of parser is comparatively
          class of grammars.                                  applicable to ...
6.        Operators like minus, unary or binary, have         A unique precedence relation can be applied
          two different precedences. Hence it is              to the tokens.
          difficult to handle such tokens.

Operator Precedence Parsing Algorithm
1. Set pointer i to the first symbol of the string w. The string will be represented as w$, with $ marking the end of the input.
2. If $ is on the top of the stack and $ is the symbol pointed to by i then return.
3.
If a is on the top of the stack and the symbol b is read by pointer i then
a) if a <· b or a ≐ b then
   push b on to the stack and advance the pointer i to the next input symbol.
b) else if a ·> b then
   repeat
       pop the stack
   until (top terminal symbol of the stack <· recently popped terminal symbol).
   Here popping the symbol means reducing the terminal symbol by the equivalent non terminal.
c) else error( ).

Example : Construct an operator precedence parser for the grammar,
S → iEtS | iEtSeS | a
E → b | c | d
where a, b, c, d, e, i, t are terminals.

Solution : Let
S → iEtS | iEtSeS | a
E → b | c | d
be the given grammar. Here,
i stands for if
E stands for expression
t stands for then
e stands for else
S stands for statement
The symbols a, b, c, d are the terminal symbols.
* The precedence is if > then > else, i.e. i > t > e.
* Similarly the $ symbol will have the least precedence.
* The terminal symbols a, b, c and d have the highest precedence, over i, t and e.
* The nonterminal symbols have less precedence than a, b, c, d, i, e, t.
Hence the precedence table can be designed as follows -
(Precedence table over the terminals a, b, c, d, i, t, e and $.)

Simulation
Consider a valid string ibtaea for simulation.
(Simulation table with columns Stack, Input scanned and Relation; it starts with $ and input ibtaea$, includes the reductions by E → b and S → a, and ends with accept.)

Precedence Functions
During operator precedence parsing, the table of precedence relations is not stored. Instead of it, the operator precedence parser uses precedence functions that map terminal symbols to integers, so that the precedence relations between the symbols are implemented by numerical comparison.
For example
Consider two functions f and g. For these functions the precedence relations can be shown with some integers : f(a) < g(b) whenever a <· b, f(a) = g(b) whenever a ≐ b, and f(a) > g(b) whenever a ·> b.

Algorithm for constructing precedence functions
Input : Precedence table.
Output : Precedence functions.
1. Create symbols f_a and g_a for each grammar terminal a and for $.
2. Partition the symbols in groups so that f_a and g_b are in the same group if a ≐ b.
3. Create a directed graph whose nodes are the groups, by adding edges in the following manner :
a) If a <· b then add an edge from the group of g_b to the group of f_a.
b) If a ·> b then add an edge from the group of f_a to the group of g_b.
4. If the constructed graph contains no cycle then there exist precedence functions : f(a) and g(a) are the lengths of the longest paths beginning at the groups of f_a and g_a. If the graph has a cycle then no precedence functions exist.
For the precedence relation table of id, +, * and $ we get

         +       *       id      $
f        2       4       4       0
g        1       3       5       0

Fig. 4.4.2 Precedence graph

4.5 Shift Reduce Parsing
A shift reduce parser attempts to construct the parse tree from leaves to root. Thus it works on the same principle as the bottom up parser. A shift reduce parser requires the following data structures.
1. The input buffer storing the input string.
2. A stack for storing and accessing the L.H.S. and R.H.S. of rules.
The initial configuration of the shift reduce parser is as shown below.

Stack       Input buffer
$S          w$

Fig. 4.5.1 Initial configuration
The parser performs the following basic operations.
1. Shift : Moving symbols from the input buffer onto the stack is called the shift action.
2. Reduce : If the handle appears on the top of the stack then it is reduced by the appropriate rule. That means the R.H.S. of the rule is popped off and the L.H.S. is pushed in. This action is called the reduce action.
3. Accept : If the stack contains only the start symbol and the input buffer is empty at the same time then the action is called accept. When the accept state is obtained in the process of parsing it means that successful parsing is done.
4. Error : A situation in which the parser can neither shift nor reduce the symbols, and cannot even perform the accept action, is called an error.
Let us take some examples to learn the shift-reduce parser.

Example : Consider the grammar
E → E - E
E → E * E
E → id
Perform shift-reduce parsing of the input string "id1 - id2 * id3".
Solution :
Stack       | Input buffer   | Parsing action
$           | id1-id2*id3$   | Shift
$id1        | -id2*id3$      | Reduce by E → id
$E          | -id2*id3$      | Shift
$E-         | id2*id3$       | Shift
$E-id2      | *id3$          | Reduce by E → id
$E-E        | *id3$          | Shift
$E-E*       | id3$           | Shift
$E-E*id3    | $              | Reduce by E → id
$E-E*E      | $              | Reduce by E → E * E
$E-E        | $              | Reduce by E → E - E
$E          | $              | Accept
Here we have followed two rules.
1. If the incoming operator has higher priority than the in-stack operator then perform shift.
2. If the incoming operator has the same or lower priority than the in-stack operator then perform reduce.

Example : Consider the following grammar
S → TL;
T → int | float
L → L, id | id
Parse the input string int id, id; using a shift-reduce parser.
Solution :
Stack     | Input buffer   | Parsing action
$         | int id, id;$   | Shift
$int      | id, id;$       | Reduce by T → int
$T        | id, id;$       | Shift
$T id     | , id;$         | Reduce by L → id
$TL       | , id;$         | Shift
$TL,      | id;$           | Shift
$TL, id   | ;$             | Reduce by L → L, id
$TL       | ;$             | Shift
$TL;      | $              | Reduce by S → TL;
$S        | $              | Accept

Example : Consider the following grammar
S → (L) | a
L → L, S | S
Parse the input string (a, (a, a)) using a shift-reduce parser.
Solution :
Stack      | Input buffer   | Parsing action
$          | (a,(a,a))$     | Shift
$(         | a,(a,a))$      | Shift
$(a        | ,(a,a))$       | Reduce S → a
$(S        | ,(a,a))$       | Reduce L → S
$(L        | ,(a,a))$       | Shift
$(L,       | (a,a))$        | Shift
$(L,(      | a,a))$         | Shift
$(L,(a     | ,a))$          | Reduce S → a
$(L,(S     | ,a))$          | Reduce L → S
$(L,(L     | ,a))$          | Shift
$(L,(L,    | a))$           | Shift
$(L,(L,a   | ))$            | Reduce S → a
$(L,(L,S   | ))$            | Reduce L → L, S
$(L,(L     | ))$            | Shift
$(L,(L)    | )$             | Reduce S → (L)
$(L,S      | )$             | Reduce L → L, S
$(L        | )$             | Shift
$(L)       | $              | Reduce S → (L)
$S         | $              | Accept

Example : Design a shift reduce parser for the following grammar :
S → 0S0 | 1S1 | 2
Solution : To design the shift-reduce parser we will consider the input "10201".
Stack   | Input buffer | Parsing action
$       | 10201$       | Shift
$1      | 0201$        | Shift
$10     | 201$         | Shift
$102    | 01$          | Reduce S → 2
$10S    | 01$          | Shift
$10S0   | 1$           | Reduce S → 0S0
$1S     | 1$           | Shift
$1S1    | $            | Reduce S → 1S1
$S      | $            | Accept

4.6 Difference between LR and LL Parsers
1. LR parsers are bottom up parsers, whereas LL parsers are top down parsers.
2. An LR parser is complex to implement, whereas an LL parser is simple to implement.
3. For LL(1) the first L means the input is scanned from left to right. The second L means it uses leftmost derivation for the input string.
The number 1 indicates that one lookahead symbol is used to predict the parsing process. For LR(1) the first L means the input is scanned from left to right. The second R means it uses rightmost derivation in reverse for the input string. The number 1 again indicates that one lookahead symbol is used to predict the parsing process.
4. LR parsers are efficient parsers, whereas LL parsers are less efficient.
5. LR parsing is applied to a large class of programming languages, whereas LL parsing is applied to a small class of languages.

4.7 Simple LR Parsers
We will start with the simplest form of LR parsing, called the SLR parser. It is the weakest of the three methods but it is the easiest to implement. The parsing can be done as follows.
[Fig. 4.7.1 Working of SLR(1) : context free grammar → construction of canonical set of items → construction of SLR parsing table → parsing of input string → output]

Definition of LR(0) items and related terms -
1) The LR(0) item for grammar G is a production rule in which the symbol • is inserted at some position in the R.H.S. of the rule. For example, a rule S → ABC gives the items
S → •ABC
S → A•BC
S → AB•C
S → ABC•
The production S → ε generates only one item S → •.
2) Augmented grammar : If a grammar G has start symbol S then the augmented grammar is a new grammar G' in which S' is a new start symbol such that S' → S. The purpose of this grammar is to indicate the acceptance of input. That is, when the parser is about to reduce S' → S it reaches the acceptance state.
A grammar for which an SLR parser can be constructed is called an SLR grammar.
3) Kernel items : The collection of the item S' → •S and all the items whose dots are not at the leftmost end of the R.H.S. of the rule.
Non-kernel items : The collection of all the items in which • is at the left end of the R.H.S. of the rule.
4) Functions closure and goto : These are two important functions required to create the collection of canonical sets of items.
5) Viable prefix : It is the set of prefixes in the right sentential form of production A → α.
This set can appear on the stack during shift/reduce action.

Closure operation -
For a context free grammar G, if I is the set of items then the function closure(I) can be constructed using the following rules.
1. Consider I as a set of canonical items; initially every item in I is added to closure(I).
2. If A → α•Bβ is in closure(I) and there is a rule for B such as B → γ, then closure(I) contains
A → α•Bβ
B → •γ
This rule has to be applied until no more new items can be added to closure(I).
The meaning of the item A → α•Bβ is that during derivation of the input string, at some point we may require strings derivable from Bβ as input. A non-terminal immediately to the right of • indicates that it has to be expanded shortly.

Goto operation -
The function goto can be defined as follows. If there is a production A → α•Bβ then goto(A → α•Bβ, B) = A → αB•β. That means simply shifting • one position ahead over the grammar symbol (which may be a terminal or a non-terminal). If the item A → α•Bβ is in I then the same goto function can be written as goto(I, B).

Example : Consider the grammar
X → Xb | a
Compute closure(I) and goto(I).
Solution : Let I : X → •Xb
closure(I) = X → •Xb
             X → •a
The goto function can be computed as
goto(I, X) = X → X•b
Similarly goto(I, a) gives X → a•

Example : Consider the grammar
S → AS | b
A → SA | a
Compute closure(I) and goto(I).
Solution : We will first write the grammar using the dot operator.
I0 : S → •AS
     S → •b
     A → •SA
     A → •a
This is closure(I); we call this state I0. Now we apply goto on I0.
goto(I0, A) : S → A•S, S → •AS, S → •b, A → •SA, A → •a
goto(I0, b) : S → b•
goto(I0, S) : A → S•A, A → •SA, A → •a, S → •AS, S → •b
goto(I0, a) : A → a•

Example : Consider the following grammar :
S → Aa | bAc | Bc | bBa
A → d
B → d
Compute closure(I) and goto(I).
Solution : First we will write the grammar using the dot operator.
I0 : S → •Aa
     S → •bAc
     S → •Bc
     S → •bBa
     A → •d
     B → •d
We will apply goto on each symbol from state I0.
Each goto(I0, X) will generate a new state.
I1 : goto(I0, A) : S → A•a
I2 : goto(I0, b) : S → b•Ac, S → b•Ba, A → •d, B → •d
I3 : goto(I0, B) : S → B•c
I4 : goto(I0, d) : A → d•, B → d•

Construction of canonical collection of set of items -
1. For the grammar G initially add S' → •S to the set of items C.
2. For each set of items Ii in C and for each grammar symbol X (which may be a terminal or a non-terminal) add closure(goto(Ii, X)). This process should be repeated by applying goto(Ii, X) for each X in Ii such that goto(Ii, X) is not empty and not in C. The sets of items have to be constructed until no more sets of items can be added to C.

Now we will consider one grammar and construct the sets of items by applying the closure and goto functions.
Example :
E → E + T
E → T
T → T * F
T → F
F → (E)
F → id
In this grammar we will add the augmented production E' → •E to I and then apply closure(I).
I0 : E' → •E
     E → •E + T
     E → •T
     T → •T * F
     T → •F
     F → •(E)
     F → •id
The set I0 is constructed starting from the item E' → •E. Immediately to the right of • is E. Hence we have applied closure(I0) and thereby added the E-productions with • at the left end of the rule. That means we have added E → •E + T and E → •T to I0. But the item E → •T which we have added contains the non-terminal T immediately to the right of •. So we have to add the T-productions to I0 : T → •T * F and T → •F. In the T-productions, after • come T and F respectively. Since we have already added the T-productions we will not add those again, but we will add all the F-productions with dots : F → •(E) and F → •id. Now we can see that after the dot, ( and id are coming in these two productions. The symbols ( and id are terminal symbols and do not derive any rule. Hence our closure function terminates here. Since there is no further rule we stop creating I0.

Now apply goto(I0, E). Shifting the dot one position to the right,
E' → •E        becomes   E' → E•
E → •E + T     becomes   E → E• + T
Thus I1 becomes,
goto(I0, E)
I1 : E' → E•
     E → E• + T
Since in I1 there is no non-terminal after the dot we cannot apply closure(I1).
By applying goto on T of I0,
goto(I0, T)
I2 : E → T•
     T → T• * F
Since in I2 there is no non-terminal after the dot we cannot apply closure(I2).
By applying goto on F of I0,
goto(I0, F)
I3 : T → F•
Since after the dot in I3 there is nothing, we cannot apply closure(I3).
By applying goto on ( of I0 we get F → (•E). After the dot E comes, hence we will apply closure on E, then on T, then on F.
goto(I0, ()
I4 : F → (•E)
     E → •E + T
     E → •T
     T → •T * F
     T → •F
     F → •(E)
     F → •id
By applying goto on id of I0,
goto(I0, id)
I5 : F → id•
Since in I5 there is no non-terminal to the right of the dot we cannot apply the closure function here. Thus we have completed applying gotos on I0.

We will now consider I1 for applying goto. In I1 there are two productions, E' → E• and E → E• + T. There is no point in applying goto on E' → E•, hence we will consider E → E• + T for application of goto.
goto(I1, +)
I6 : E → E + •T
     T → •T * F
     T → •F
     F → •(E)
     F → •id
There is no other rule in I1 for applying goto, so we will move to I2. In I2 there are two productions, E → T• and T → T• * F. We will apply goto on *.
goto(I2, *)
I7 : T → T * •F
     F → •(E)
     F → •id
The goto cannot be applied on I3. Apply goto on E in I4. In I4 there are two productions having E after the dot (F → (•E) and E → •E + T). Hence we will apply goto on both of these productions. I8 becomes,
goto(I4, E)
I8 : F → (E•)
     E → E• + T
If we apply goto(I4, T) we get E → T• and T → T• * F, which is I2 only. Hence we will not apply goto on T. Similarly we will not apply goto on F, ( and id, as we would get the states I3, I4 and I5 again. These gotos are skipped to avoid repetition. There is no point in applying goto on I5. Hence we now move ahead by applying goto on I6 for T.
goto(I6, T)
I9 : E → E + T•
     T → T• * F
Then,
goto(I7, F)
I10 : T → T * F•
Then,
goto(I8, ))
I11 : F → (E)•
Applying goto on I9, I10, I11 is of no use.
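The closure and goto steps traced by hand for the expression grammar can be cross-checked with a short program. The following is only a minimal sketch in Python, not part of the original text: the names GRAMMAR, closure, goto, canonical_collection and show are illustrative assumptions, items are stored as (production index, dot position) pairs, and the states come out in the order the program discovers them, which need not match the I0 to I11 numbering used in the text.

```python
# Sketch (assumed helper names): LR(0) item sets for the expression grammar.
# "E'" is the augmented start symbol; production 0 is E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add B -> .gamma for every item that has the dot just before B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else frozenset()


def canonical_collection():
    """Keep applying goto until no new set of items appears."""
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    states = [closure({(0, 0)})]
    for state in states:  # the list grows while it is being scanned
        for sym in symbols:
            nxt = goto(state, sym)
            if nxt and nxt not in states:
                states.append(nxt)
    return states


def show(item):
    """Render an item with '.' standing for the dot."""
    lhs, rhs = GRAMMAR[item[0]]
    parts = list(rhs[:item[1]]) + ["."] + list(rhs[item[1]:])
    return lhs + " -> " + " ".join(parts)


if __name__ == "__main__":
    for number, state in enumerate(canonical_collection()):
        print("state", number, sorted(show(item) for item in state))
```

Running the script prints twelve sets of items, the same count as I0 to I11; since the discovery order depends on the order in which symbols are tried, the states should be compared by their contents rather than by their numbers.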
Thus there is now no item that can be added to the set of items. The collection of sets of items is from I0 to I11.

Construction of SLR parsing table -
As we have seen in the structure of the SLR parser, there are two parts of the SLR parsing table, and those are action and goto. By considering the basic parsing actions such as shift, reduce, accept and error we will fill up the action table. The goto table can be filled up using the goto function. Let us see the algorithm -
Input : An augmented grammar G'.
Output : SLR parsing table.
Algorithm :
1. Initially construct the set of items C = {I0, I1, I2, ... In} where C is a collection of sets of LR(0) items for the input grammar G'.
2. The parsing actions are based on each set Ii. The actions are as given below.
   a. If A → α•aβ is in Ii and goto(Ii, a) = Ij then set action[i, a] as "shift j". Note that a must be a terminal symbol.
   b. If there is a rule A → α• in Ii then set action[i, a] to "reduce A → α" for all symbols a where a ∈ FOLLOW(A). Note that A must not be the augmented start symbol S'.
   c. If S' → S• is in Ii then the entry in the action table is action[i, $] = "accept".
3. The goto part of the SLR table can be filled as follows : the goto transitions for state i are considered for non-terminals only. If goto(Ii, A) = Ij then goto[Ii, A] = j.
4. All the entries not defined by rules 2 and 3 are considered to be "error".

Example : Consider the grammar
E → E + T | T
T → T * F | F
F → (E) | id
We will first construct a collection of canonical sets of items for the above grammar. The set of items generated by this method are also called SLR(0) items. As there is no lookahead symbol in this set of items, zero is put in the bracket.
I0 : E' → •E, E → •E + T, E → •T, T → •T * F, T → •F, F → •(E), F → •id
I1 : goto(I0, E) : E' → E•, E → E• + T
I2 : goto(I0, T) : E → T•, T → T• * F
I3 : goto(I0, F) : T → F•
I4 : goto(I0, () : F → (•E), E → •E + T, E → •T, T → •T * F, T → •F, F → •(E), F → •id
I5 : goto(I0, id) : F → id•
I6 : goto(I1, +) : E → E + •T, T → •T * F, T → •F, F → •(E), F → •id
I7 : goto(I2, *) : T → T * •F, F → •(E), F → •id
I8 : goto(I4, E) : F → (E•), E → E• + T
I9 : goto(I6, T) : E → E + T•, T → T• * F
I10 : goto(I7, F) : T → T * F•
I11 : goto(I8, )) : F → (E)•
We can design a DFA for the above set of items as follows.
[Figure : DFA whose states are the item sets I0 to I11 and whose edges are the goto transitions on grammar symbols]
In the given DFA every state is a final state. The state I0 is the initial state. Note that the DFA recognizes viable prefixes. For example, for the items
I1 : goto(I0, E)
     E' → E•
     E → E• + T
I6 : goto(I1, +)
     E → E + •T
I9 : goto(I6, T)
     E → E + T•
the viable prefixes E, E+ and E+T are recognized. Continuing in this fashion, the DFA can be constructed for the whole set of items. Thus the DFA helps in recognizing valid viable prefixes that can appear on the top of the stack.

Now we will also prepare a FOLLOW set for all the non-terminals, because we require the FOLLOW sets according to rule 2b of the parsing table algorithm.
[Note for readers : Below is a manual method of obtaining FOLLOW. We have already discussed the method of obtaining FOLLOW by using rules. This manual method can be used for getting FOLLOW quickly.]

FOLLOW(E') : As E' is the start symbol, $ will be placed.
FOLLOW(E') = {$}

FOLLOW(E) :
E' → E, that means E' = E = start symbol. ∴ we will add $.
E → E + T, the + is following E. ∴ we will add +.
F → (E), the ) is following E. ∴ we will add ).
∴ FOLLOW(E) = {+, ), $}

FOLLOW(T) :
As E' → E and E → T, E' = E = T = start symbol. ∴ we will add $.
E → E + T gives E → T + T (∵ E → T). The + is following T, hence we will add +.
T → T * F : as * is following T we will add *.
F → (E) gives F → (T) (∵ E → T). As ) is following T we will add ).
∴ FOLLOW(T) = {+, *, ), $}

FOLLOW(F) :
As E' → E, E → T and T → F, E' = E = T = F = start symbol. ∴ we will add $.
E → E + T gives E → T + T (∵ E → T) and then E → F + T (∵ T → F). The + is following F, hence we will add +.
T → T * F gives T → F * F (∵ T → F). As * is following F we will add *.
F → (E) gives F → (T) (∵ E → T) and then F → (F) (∵ T → F). As ) is following F we will add ).
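∴ FOLLOW(F) = {+, *, ), $}

Building of parsing table
Now from the canonical collection of sets of items, consider I0.
I0 : E' → •E
     E → •E + T
     E → •T
     T → •T * F
     T → •F
     F → •(E)
     F → •id
Consider F → •(E). Matching it with A → α•aβ gives A = F, α = ε, a = (, β = E).
goto(I0, () = I4
∴ action[0, (] = shift 4
Similarly for F → •id, goto(I0, id) = I5, so the entry in the action table is action[0, id] = shift 5.
No other item in I0 gives any action. In the same way we find the actions for I0 to I11.

State | id  +   *   (   )   $  | E  T  F
0     | s5  -   -   s4  -   -  | 1  2  3
1     | -   s6  -   -   -   -  | -  -  -
2     | -   -   s7  -   -   -  | -  -  -
3     | -   -   -   -   -   -  | -  -  -
4     | s5  -   -   s4  -   -  | 8  2  3
5     | -   -   -   -   -   -  | -  -  -
6     | s5  -   -   s4  -   -  | -  9  3
7     | s5  -   -   s4  -   -  | -  -  10
8     | -   s6  -   -   s11 -  | -  -  -
9     | -   -   s7  -   -   -  | -  -  -
10    | -   -   -   -   -   -  | -  -  -
11    | -   -   -   -   -   -  | -  -  -

Thus the SLR parsing table is filled up with the shift actions. Now we will fill it up with the reduce and accept actions. According to rule 2.c of the parsing table algorithm, there is a production E' → E• in I1. Hence we will add the action "accept" in action[1, $].
Now in state I2 we have E → T•. Matching it with A → α• by rule 2.b gives A = E, α = T. As FOLLOW(E) = {+, ), $}, we set action[2, +] = action[2, )] = action[2, $] = r2, i.e. reduce by rule 2, E → T.
[...]
b) If action[s, a] = reduce A → β then pop 2*|β| symbols; if i is now on the top of the stack then push A, then push goto[i, A] on the top of the stack.
c) If action[s, a] = accept then halt the parsing process. It indicates successful parsing.

Let us take one valid string for the grammar
1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → (E)
6) F → id
Input string : id*id+id
We will consider two data structures while taking the parsing actions, and those are the stack and the input buffer.

Stack          | Input buffer | Action table    | Goto table   | Parsing action
$0             | id*id+id$    | [0, id] = s5    |              | Shift
$0id5          | *id+id$      | [5, *] = r6     | [0, F] = 3   | Reduce by F → id
$0F3           | *id+id$      | [3, *] = r4     | [0, T] = 2   | Reduce by T → F
$0T2           | *id+id$      | [2, *] = s7     |              | Shift
$0T2*7         | id+id$       | [7, id] = s5    |              | Shift
$0T2*7id5      | +id$         | [5, +] = r6     | [7, F] = 10  | Reduce by F → id
$0T2*7F10      | +id$         | [10, +] = r3    | [0, T] = 2   | Reduce by T → T * F
$0T2           | +id$         | [2, +] = r2     | [0, E] = 1   | Reduce by E → T
$0E1           | +id$         | [1, +] = s6     |              | Shift
$0E1+6         | id$          | [6, id] = s5    |              | Shift
$0E1+6id5      | $            | [5, $] = r6     | [6, F] = 3   | Reduce by F → id
$0E1+6F3       | $            | [3, $] = r4     | [6, T] = 9   | Reduce by T → F
$0E1+6T9       | $            | [9, $] = r1     | [0, E] = 1   | Reduce by E → E + T
$0E1           | $            | [1, $] = accept |              | Accept

In the above table, in the first row we get action[0, id] = s5. That means shift id from the input buffer onto the stack and then push the state number 5. In the second row we get action[5, *] as r6, which means reduce by rule 6, F → id. After popping id and 5, state 0 is on the top of the stack; referring to goto[0, F] we get state number 3, hence F and then 3 are pushed onto the stack. Note that for every reduce action a goto is performed. This process of parsing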
is continued in this fashion, and finally when we get action[1, $] = accept we halt, having successfully parsed the input string.

Example : Consider the following grammar
E → E + T | T
T → TF | F
F → F* | a | b
Construct the SLR parsing table for this grammar. Also parse the input a*b+a.
Solution : Let us number the production rules in the grammar.
1) E → E + T
2) E → T
3) T → TF
4) T → F
5) F → F*
6) F → a
7) F → b
Now we will build the canonical set of SLR(0) items. We will first introduce an augmented grammar E' → •E and then the initial set of items I0 will be generated as follows.
I0 : E' → •E
     E → •E + T      (as after • the symbol E appears, we add the rules of E)
     E → •T
     T → •TF         (after • the T appears, so add the rules of T)
     T → •F
     F → •F*         (as after • the F appears, add the rules of F)
     F → •a
     F → •b
Now we will use the goto function. From state I0, goto on E, T, F, a and b will be applied step by step. Each goto transition will generate a new state Ii.
I1 : goto(I0, E)
     E' → E•
     E → E• + T
I2 : goto(I0, T)
     E → T•
     T → T•F
     F → •F*
     F → •a
     F → •b
I3 : goto(I0, F)
     T → F•
     F → F•*
I4 : goto(I0, a)
     F → a•
I5 : goto(I0, b)
     F → b•
Now we will start applying goto transitions on state I1. From state I1 it is possible to apply a goto transition only on +. Hence
I6 : goto(I1, +)
     E → E + •T
     T → •TF         (as after • the T comes we add the T rules; the same is true for F)
     T → •F
     F → •F*
     F → •a
     F → •b
The goto transitions will now be applied on state I2. We will choose F for the goto transition because there is no point in applying goto on T.
I7 : goto(I2, F)
     T → TF•
     F → F•*
If we apply goto on a or b then we will get F → a• or F → b•, which are the states I4 and I5 respectively.
Hence we will not consider I4 and I5 again. Now move to state I3; from I3 the goto transition is on *. Hence
I8 : goto(I3, *)
     F → F*•
As there is no point in applying goto on states I4 and I5, we will choose state I6 for the goto transition.
I9 : goto(I6, T)
     E → E + T•
     T → T•F
     F → •F*
     F → •a
     F → •b
Now we will first obtain FOLLOW of E, T and F, as the FOLLOW computation is required when the SLR parsing table is built.
FOLLOW(E) = {+, $}
FOLLOW(T) = {+, a, b, $}
FOLLOW(F) = {+, *, a, b, $}
By rules 2 and 4, E → T and T → F, so we can state E = T = F. But E is the start symbol, so by rules 2 and 4, T and F can also act as the start symbol. ∴ we have added $ to FOLLOW(E), FOLLOW(T) and FOLLOW(F).
The SLR parsing table can be constructed as follows.

State | +   *   a   b   $      | E  T  F
0     | -   -   s4  s5  -      | 1  2  3
1     | s6  -   -   -   accept | -  -  -
2     | r2  -   s4  s5  r2     | -  -  7
3     | r4  s8  r4  r4  r4     | -  -  -
4     | r6  r6  r6  r6  r6     | -  -  -
5     | r7  r7  r7  r7  r7     | -  -  -
6     | -   -   s4  s5  -      | -  9  3
7     | r3  s8  r3  r3  r3     | -  -  -
8     | r5  r5  r5  r5  r5     | -  -  -
9     | r1  -   s4  s5  r1     | -  -  7

Now we will parse the input a*b+a using the above parse table.

Stack        | Input buffer | Action
$0           | a*b+a$       | Shift
$0a4         | *b+a$        | Reduce by F → a
$0F3         | *b+a$        | Shift
$0F3*8       | b+a$         | Reduce by F → F*
$0F3         | b+a$         | Reduce by T → F
$0T2         | b+a$         | Shift
$0T2b5       | +a$          | Reduce by F → b
$0T2F7       | +a$          | Reduce by T → TF
$0T2         | +a$          | Reduce by E → T
$0E1         | +a$          | Shift
$0E1+6       | a$           | Shift
$0E1+6a4     | $            | Reduce by F → a
$0E1+6F3     | $            | Reduce by T → F
$0E1+6T9     | $            | Reduce by E → E + T
$0E1         | $            | Accept
Thus the input string is parsed.

Example : Construct the LR(0) item sets and draw the goto graph for the following grammar
S → SS | a | ε
Indicate the conflicts (if any) in the various states of the SLR parser.
Solution : We will number the production rules in the grammar.
1) S → SS
2) S → a
3) S → ε
Let us construct the canonical set of items for the augmented grammar -
I0 : S' → •S
     S → •SS
     S → •a
     S → •
I1 : goto(I0, S)
     S' → S•
     S → S•S
     S → •SS
     S → •a
     S → •
I2 : goto(I0, a)
     S → a•
I3 : goto(I1, S)
     S → SS•
     S → S•S
     S → •SS
     S → •a
     S → •
The goto graph will be
[Figure : goto graph with edges I0 -S→ I1, I0 -a→ I2, I1 -S→ I3, I1 -a→ I2, I3 -S→ I3, I3 -a→ I2]
The construction of the SLR(1) table is as follows. In state I3 there is an item S → •a, and goto(I3, a) gives state I2. Hence M[3, a] = s2. In I3 there is also a production S → SS•, which matches A → α•. Rule S → SS is rule number 1, and FOLLOW(S) = {a, $}. ∴ M[3, a] = M[3, $] = r1.

State | a      $      | S
0     | s2     -      | 1
1     | s2     accept | 3
2     | r2     r2     | -
3     | s2/r1  r1     | 3

(The item S → • additionally calls for a reduce by rule 3 on a and $ in states I0, I1 and I3, which adds further conflicting entries.)
In M[3, a] both s2 and r1 appear, so a conflict occurs.
This is a shift-reduce conflict. Hence the conflict occurs in state I3.

Example : Construct the LR(0) parsing table for the following grammar
S → cA | ccB
A → cA | a
B → ccB | b
Solution : We will first number the grammar rules
1) S → cA
2) S → ccB
3) A → cA
4) A → a
5) B → ccB
6) B → b
Now we will construct the canonical set of items. (The state numbers in the scan are only partly legible; the numbering below follows the order in which the states are generated.)
I0 : S' → •S, S → •cA, S → •ccB
I1 : goto(I0, S) : S' → S•
I2 : goto(I0, c) : S → c•A, S → c•cB, A → •cA, A → •a
I3 : goto(I2, a) : A → a•
I4 : goto(I2, A) : S → cA•
I5 : goto(I2, c) : S → cc•B, A → c•A, B → •ccB, B → •b, A → •cA, A → •a
I6 : goto(I5, b) : B → b•
I7 : goto(I5, B) : S → ccB•
I8 : goto(I5, A) : A → cA•
I9 : goto(I5, c) : B → c•cB, A → c•A, A → •cA, A → •a
I10 : goto(I9, c) : B → cc•B, A → c•A, B → •ccB, B → •b, A → •cA, A → •a
I11 : goto(I10, B) : B → ccB•
FOLLOW(S) = {$}
FOLLOW(A) = {$}
FOLLOW(B) = {$}
The SLR parsing table can be constructed as follows -

State | a   b   c    $      | S  A  B
0     | -   -   s2   -      | 1  -  -
1     | -   -   -    accept | -  -  -
2     | s3  -   s5   -      | -  4  -
3     | -   -   -    r4     | -  -  -
4     | -   -   -    r1     | -  -  -
5     | s3  s6  s9   -      | -  8  7
6     | -   -   -    r6     | -  -  -
7     | -   -   -    r2     | -  -  -
8     | -   -   -    r3     | -  -  -
9     | s3  -   s10  -      | -  8  -
10    | s3  s6  s9   -      | -  8  11
11    | -   -   -    r5     | -  -  -

Review Questions
Q1 What is an operator grammar ? Give an example.
Ans. : Refer section 4.4.
Q2 Write an operator precedence parsing algorithm.
Ans. : Refer section 4.4.2.
Q3 What are the advantages and disadvantages of operator precedence parser ?
Ans. : Refer section 4.4.
Q4 Construct an operator precedence parser for the grammar S → iEtS | iEtSeS | a, E → b | c | d, where a, b, c, d, e, i, t are terminals.
Ans. : Refer example 4.4.1.
Q5 List out the rules for constructing the simple precedence relation for a CFG.
Ans. : Refer section 4.4.
Q6 What is an SLR grammar ? Or : Define LR(0) grammar. Or : What is an LR(0) grammar ?
Ans. : Refer section 4.7.
Q7 Construct the SLR parsing table for the following grammar, E → E + T | T, T → TF | F, F → F* | a | b.
Ans. : Refer section 4.7.5.
Q8 Distinguish between top-down and bottom-up parsing.
Ans. :
Sr. No. | Top down parser | Bottom up parser
1. | Parse tree can be built from root to leaves. | Parse tree is built from leaves to root.
2. | This is simple to implement. | This is complex to implement.
3. | This is a less efficient parsing technique. Various problems that occur during the top down technique are ambiguity and left recursion. | When the bottom up parser handles an ambiguous grammar, conflicts occur in the parse table.
4. | It is applicable to a small class of languages. | It is applicable to a broad class of languages.
5. | Various parsing techniques are 1) Recursive descent parser 2) Predictive parser. | Various parsing techniques are 1) Shift reduce 2) Operator precedence 3) LR parser.

More Powerful LR Parsers
Contents
[The contents list of this chapter is only partly legible. Recoverable entries : an entry marked May-09, Set-4, Marks 8; "... of LR Parsers"; an entry on ambiguous grammars marked May-09, Set-3, Marks 8; "... Recovery in LR Parsing"; "... Parser Generator".]

5.1 Construction of CLR(1)
The canonical set of items is the parsing technique in which a lookahead symbol is generated while constructing the set of items. Hence the collection of sets of items is referred to as LR(1). The value 1 in the bracket indicates that there is one lookahead symbol in the set of items.
We follow the same steps as discussed in the SLR parsing technique, and those are
1. Construction of the canonical set of items along with the lookahead.
2. Building the canonical LR parsing table.
3. Parsing the input string using the canonical LR parsing table.

Construction of canonical set of items along with the lookahead
1. For the grammar G initially add S' → •S to the set of items C.
2. For each set of items Ii in C and for each grammar symbol X (which may be a terminal or a non-terminal) add closure(goto(Ii, X)). This process should be repeated by applying goto(Ii, X) for each X in Ii such that goto(Ii, X) is not empty and not in C. The sets of items have to be constructed until no more sets of items can be added to C.
3.
The closure function can be computed as follows. For each item [A → α•Xβ, a], each rule X → γ and each b ∈ FIRST(βa) such that [X → •γ, b] is not in I, add [X → •γ, b] to I.
4. Similarly the goto function can be computed as follows : for each item [A → α•Xβ, a] in I such that [A → αX•β, a] is not in the goto items, add [A → αX•β, a] to the goto items.
This process is repeated until no more sets of items can be added to the collection C.

Example :
S → CC
C → aC | d
Construct the LR(1) set of items for the above grammar.
Solution : We will initially add S' → •S, $ as the first item in I0. Now match S' → •S, $ with [A → α•Xβ, a]. Hence
A = S', α = ε, X = S, β = ε, a = $
If there is a production X → γ then add X → •γ, b with b ∈ FIRST(βa).
S → •CC, b ∈ FIRST(βa)
         b ∈ FIRST(ε$), and as ε$ = $,
         b ∈ FIRST($)
         b = {$}
∴ S → •CC, $ will be added to I0.
Now S → •CC, $ is in I0 and we will match it with A → α•Xβ, a.
A = S, α = ε, X = C, β = C, a = $
If there is a production X → γ then add X → •γ, b.
C → •aC and C → •d, b ∈ FIRST(βa)
                    b ∈ FIRST(C$)
                    b ∈ FIRST(C), and as FIRST(C) = {a, d},
                    b = {a, d}
∴ C → •aC, a or d will be added to I0. Similarly C → •d, a/d will be added to I0.
Hence
I0 : S' → •S, $
     S → •CC, $
     C → •aC, a/d
     C → •d, a/d
Here a/d is used to denote a or d. That means the production C → •aC is carried with lookahead a and with lookahead d.

Now apply goto on I0. Applying goto on S gives
goto(I0, S)
I1 : S' → S•, $
Now apply goto on C in I0. S → C•C, $ is added to I2. As C comes after the dot, we will add the rules of C.
A = S, α = C, X = C, β = ε, a = $
X → •γ, b where b ∈ FIRST(βa) :
C → •aC, b ∈ FIRST(ε$)
C → •d,  b ∈ FIRST($)
Hence C → •aC, $ and C → •d, $ will be added to I2.
I2 : goto(I0, C)
     S → C•C, $
     C → •aC, $
     C → •d, $
Now we will apply goto on a of I0 for the rule C → •aC, a/d. It becomes C → a•C, a/d, which will be added to I3.
For C → a•C, a/d we have
A = C, α = a, X = C, β = ε, a = a/d
Hence for X → •γ, b :
C → •aC, b ∈ FIRST(βa)
C → •d,  b ∈ FIRST(ε a) or FIRST(ε d)
         b = a/d
Hence
I3 : goto(I0, a)
     C → a•C, a/d
     C → •aC, a/d
     C → •d, a/d
Now if we apply goto on d of I0 in the rule C → •d, a/d then we get C → d•, a/d.
I4 : goto(I0, d)
     C → d•, a/d
Applying gotos on I0 is now over. We will move to I1, but there we get no new production. We will apply goto on C in I2.
I5 : goto(I2, C)
     S → CC•, $
There is no closure possible in this state.
We will apply goto on a from I2 on the rule C → •aC, $ and we get the state I6. For C → a•C, $ :
A = C, α = a, X = C, β = ε, a = $
X → •γ : C → •aC and C → •d
b ∈ FIRST(βa)
b ∈ FIRST(ε$)
b = $
I6 : goto(I2, a)
     C → a•C, $
     C → •aC, $
     C → •d, $
Note one thing : I3 and I6 are different because the second component in I3 and I6 is different.
Apply goto on d of I2 for the rule C → •d, $.
I7 : goto(I2, d)
     C → d•, $
Now if we apply goto on a and d of I3 we will get I3 and I4 respectively, and there is no point in repeating the states. So we will apply goto on C of I3.
I8 : goto(I3, C)
     C → aC•, a/d
For I4 and I5 there is no point in applying gotos. Applying gotos on a and d of I6 gives I6 and I7 respectively. Hence we will apply goto on I6 for C, for the rule C → a•C, $.
I9 : goto(I6, C)
     C → aC•, $
For the remaining states I7, I8 and I9 we cannot apply goto. Hence the construction of the set of LR(1) items is completed. Thus the set of LR(1) items consists of the states I0 to I9.
I0 : S' → •S, $; S → •CC, $; C → •aC, a/d; C → •d, a/d
I1 : goto(I0, S) : S' → S•, $
I2 : goto(I0, C) : S → C•C, $; C → •aC, $; C → •d, $
I3 : goto(I0, a) : C → a•C, a/d; C → •aC, a/d; C → •d, a/d
I4 : goto(I0, d) : C → d•, a/d
I5 : goto(I2, C) : S → CC•, $
I6 : goto(I2, a) : C → a•C, $; C → •aC, $; C → •d, $
I7 : goto(I2, d) : C → d•, $
I8 : goto(I3, C) : C → aC•, a/d
I9 : goto(I6, C) : C → aC•, $
Construction of canonical LR parsing table
To construct the canonical LR parsing table, first of all we will see the algorithm and then we will learn how to apply that algorithm to some example. The parsing table is similar to the SLR parsing table, comprised of action and goto parts.
Input : An augmented grammar G'.
Output : The canonical LR parsing table.
Algorithm :
1. Initially construct the set of items C = {I0, I1, I2, ... In} where C is a collection of sets of LR(1) items for the input grammar G'.
2. The parsing actions are based on each set Ii. The actions are as given below.
   a) If [A → α•aβ, b] is in Ii and goto(Ii, a) = Ij then create an entry in the action table action[i, a] = shift j.
   b) If there is a production [A → α•, a] in Ii then in the action table action[i, a] = reduce by A → α. Here A should not be S'.
   c) If there is a production S' → S•, $ in Ii then action[i, $] = accept.
3. The goto part of the LR table can be filled as follows : the goto transitions for state i are considered for non-terminals only. If goto(Ii, A) = Ij then goto[Ii, A] = j.
4. All the entries not defined by rules 2 and 3 are considered to be "error".

Example : Construct the LR(1) parsing table for the following grammar,
1) S → CC
2) C → aC
3) C → d
Solution : First we will construct the set of LR(1) items.
I0 : S' → •S, $; S → •CC, $; C → •aC, a/d; C → •d, a/d
I1 : goto(I0, S) : S' → S•, $
I2 : goto(I0, C) : S → C•C, $; C → •aC, $; C → •d, $
I3 : goto(I0, a) : C → a•C, a/d; C → •aC, a/d; C → •d, a/d
I4 : goto(I0, d) : C → d•, a/d
I5 : goto(I2, C) : S → CC•, $
I6 : goto(I2, a) : C → a•C, $; C → •aC, $; C → •d, $
I7 : goto(I2, d) : C → d•, $
I8 : goto(I3, C) : C → aC•, a/d
I9 : goto(I6, C) : C → aC•, $
The DFA for the set of items can be drawn as follows.
[Figure : DFA over the LR(1) item sets I0 to I9]
Now consider I0, in which there is an item matching [A → α•aβ, b], namely C → •aC, a/d. If the goto is applied on a then we get the state I3. Hence we will create the entry action[0, a] = shift 3. Similarly, in I0,
C → •d, a/d
A = C, α = ε, a = d, β = ε, b = a/d
goto(I0, d) = I4, hence action[0, d] = shift 4.
For state I4,
C → d•, a/d
matches [A → α•, a] with A = C, α = d, a = a/d.
action[4, a] = reduce by C → d, i.e. rule 3
action[4, d] = reduce by C → d, i.e. rule 3
S' → S•, $ is in I1. So we will create action[1, $] = accept.
The goto table can be filled by using the goto functions. For instance goto(I0, S) = I1, hence goto[0, S] = 1. Continuing in this fashion we can fill up the LR(1) parsing table as follows.

State | a   d   $      | S  C
0     | s3  s4  -      | 1  2
1     | -   -   accept | -  -
2     | s6  s7  -      | -  5
3     | s3  s4  -      | -  8
4     | r3  r3  -      | -  -
5     | -   -   r1     | -  -
6     | s6  s7  -      | -  9
7     | -   -   r3     | -  -
8     | r2  r2  -      | -  -
9     | -   -   r2     | -  -

The remaining blank entries in the table are considered as syntactical errors.

Parsing the input using LR(1) parsing table
Using the above parsing table we can parse the input string "aadd" as

Stack       | Input buffer | Action table         | Goto table       | Parsing action
$0          | aadd$        | action[0, a] = s3    |                  | Shift
$0a3        | add$         | action[3, a] = s3    |                  | Shift
$0a3a3      | dd$          | action[3, d] = s4    |                  | Shift
$0a3a3d4    | d$           | action[4, d] = r3    | goto[3, C] = 8   | Reduce by C → d
$0a3a3C8    | d$           | action[8, d] = r2    | goto[3, C] = 8   | Reduce by C → aC
$0a3C8      | d$           | action[8, d] = r2    | goto[0, C] = 2   | Reduce by C → aC
$0C2        | d$           | action[2, d] = s7    |                  | Shift
$0C2d7      | $            | action[7, $] = r3    | goto[2, C] = 5   | Reduce by C → d
$0C2C5      | $            | action[5, $] = r1    | goto[0, S] = 1   | Reduce by S → CC
$0S1        | $            | accept               |                  | Accept

Thus the given input string is successfully parsed using the LR parser, or canonical LR parser.

Example : Construct a DFA whose states are the canonical collection of LR(1) items for the following augmented grammar,
S → A
A → BA | ε
B → aB | b
Solution : Initially we will start with S → •A, $. We will add the rules A → •BA and A → • to I0. Now let us map the rule [A → α•Xβ, a] with S → •A, $ such that
A = S, α = ε, X = A, β = ε, a = $.
Then the second component of A → •BA and A → • will be
= FIRST(βa) = FIRST(ε$) = FIRST($) = {$}
S → •A, $
A → •BA, $
A → •, $
As there is a rule A → •BA we must add the rules B → •aB and B → •b, and to compute the second component of B's rules we will use A
→ •BA, $ for mapping with [A → α•Xβ, a], where α = ε, X = B, β = A, a = $.
The second component of B → •aB and B → •b will be
= FIRST(βa)
= FIRST(A$)
= {a, b, $}
because FIRST(A) = {a, b, ε}, and since A can derive ε the $ that follows A must be included as well.
∴ B → •aB, a/b/$
  B → •b, a/b/$
Hence I0 will be
I0 : S → •A, $
     A → •BA, $
     A → •, $
     B → •aB, a/b/$
     B → •b, a/b/$
I1 : goto(I0, A)
     S → A•, $
I2 : goto(I0, B)
     A → B•A, $
     A → •BA, $
     A → •, $
     B → •aB, a/b/$
     B → •b, a/b/$
I3 : goto(I0, a)
     B → a•B, a/b/$
     B → •aB, a/b/$
     B → •b, a/b/$
I4 : goto(I0, b)
     B → b•, a/b/$
I5 : goto(I2, A)
     A → BA•, $
I6 : goto(I3, B)
     B → aB•, a/b/$
The DFA can be constructed as follows -
[Fig. 5.1.2 DFA]

LALR Parsing
In this type of parser the lookahead symbol is generated for each set of items. The tables obtained by this method are smaller in size than those of the LR(k) parser. In fact the states of SLR and LALR parsing are always the same. Most programming languages use LALR parsers.
We follow the same steps as discussed in the SLR and canonical LR parsing techniques, and those are
1. Construction of the canonical set of items along with the lookahead.
2. Building the LALR parsing table.
3. Parsing the input string using the LALR parsing table.

Construction of set of LR(1) items along with the lookahead
The construction of LR(1) items is the same as discussed in section 5.1. The only difference is this : in the construction of LR(1) items for the LR parser, we kept two states separate if their second components were different, but in this case we will merge the two states by merging the first and second components from both the states. For example, in section 5.1 we got I3 and I6 because of different second components, but for the LALR parser we will consider these two states as the same by merging them, i.e.
I3 + I6 = I36
Hence
I36 : goto(I0, a)
      C → a•C, a/d/$
      C → •aC, a/d/$
      C → •d, a/d/$
Let us take one example to understand the construction of LR(1) items for the LALR parser.
Example :
S → CC
C → aC
C → d
Construct the set of LR(1) items for the LALR parser.
Solution : First we will construct the set of LR(1) items.
I0 : S' → •S, $; S → •CC, $; C → •aC, a/d; C → •d, a/d
I1 : goto(I0, S) : S' → S•, $
I2 : goto(I0, C) : S → C•C, $; C → •aC, $; C → •d, $
I3 : goto(I0, a) : C → a•C, a/d; C → •aC, a/d; C → •d, a/d
I4 : goto(I0, d) : C → d•, a/d
I5 : goto(I2, C) : S → CC•, $
I6 : goto(I2, a) : C → a•C, $; C → •aC, $; C → •d, $
I7 : goto(I2, d) : C → d•, $
I8 : goto(I3, C) : C → aC•, a/d
I9 : goto(I6, C) : C → aC•, $
We will merge states 3 and 6, then 4 and 7, and 8 and 9.
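The merging of states that differ only in their second components can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original text : the names LR1_STATES, items and merge_by_core are assumptions, the ten LR(1) states for S → CC, C → aC | d are typed in by hand as (item, lookahead) pairs, and a merged state is named by joining the old state numbers in the same way as I36 is formed from I3 and I6.

```python
from collections import defaultdict


def items(*pairs):
    """Expand ("C -> .aC", "a/d") into one (item, lookahead) pair per lookahead."""
    return {(core, la) for core, las in pairs for la in las.split("/")}


# LR(1) states for S -> CC, C -> aC | d; '.' stands for the dot.
LR1_STATES = {
    0: items(("S' -> .S", "$"), ("S -> .CC", "$"),
             ("C -> .aC", "a/d"), ("C -> .d", "a/d")),
    1: items(("S' -> S.", "$")),
    2: items(("S -> C.C", "$"), ("C -> .aC", "$"), ("C -> .d", "$")),
    3: items(("C -> a.C", "a/d"), ("C -> .aC", "a/d"), ("C -> .d", "a/d")),
    4: items(("C -> d.", "a/d")),
    5: items(("S -> CC.", "$")),
    6: items(("C -> a.C", "$"), ("C -> .aC", "$"), ("C -> .d", "$")),
    7: items(("C -> d.", "$")),
    8: items(("C -> aC.", "a/d")),
    9: items(("C -> aC.", "$")),
}


def merge_by_core(states):
    """Union all states whose first components (the dotted rules) coincide."""
    groups = defaultdict(list)
    for number, state in states.items():
        core = frozenset(rule for rule, _ in state)
        groups[core].append(number)
    merged = {}
    for numbers in groups.values():
        # Joining the digits is only unambiguous for single-digit state numbers.
        name = "".join(str(n) for n in sorted(numbers))
        merged[name] = set().union(*(states[n] for n in numbers))
    return merged


if __name__ == "__main__":
    for name, state in sorted(merge_by_core(LR1_STATES).items()):
        print("I" + name, sorted(state))
```

Grouping by the frozen set of dotted rules is what makes two states count as the same; the lookaheads are simply unioned, which is why C → a•C ends up with the second component a/d/$ in the merged state.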
pees Scanned with CamScanner it tt 4g 90) eos $908 Breas gg GO ee ) Cor he C7 Cones C106 bg Oly) 9900, 2G Ct he BO C706 Os Sep G0 Vey 2) C189 BOE He have cergpa two sates by 2nd J, and made the second component as a or d ors The production rule will remain a it is. Similarly in I, and I;. The set of items consist states {Vay bay bey bas, Sez be bah Construction of LALR parsing table - ‘the alyorithen Sor construction of LALR parsing table is as given below. Step 1: Construct the S211) set of items, Step 2: Serge the two states I; and i i ir cere Ce Enc nis ‘With dots) are makching, and create 2 new state replacing one of the of state such as ty = It; Stop 2: ‘The parsing actions are based on each item J, The actions are as given bela aj IEA aa f, bl is in I; and gotoll, a) = I, then create an entry i action table action{S, a = shift j, bj If there is a production (A>, a] in J, then in the action table acim Sj, a)» reduce by A-vu, Here A. should not be S’. GS thete is 8 production 5+ «,§ in J, then action{i, $] = accept Btop A: The who part of the LY table can be filled tions for $8 f as : The goto transitions for 1 cmsidered for nemterminals only. If gotoll, A) wt then gotoll, Al=F ts a TECHYWAL PUDLICATIONS™- An up Wrst Sr trom Scanned with CamScanner ction conflict then dered to be “error”, Coa th . and grammar is not LALR(I), an pl algorith More Powertul LR Parsers *m fails to produce LALR Parser © entries not defined by rule 3 and 4 are Construct the parsing table for LAIR) paneer geistion + Pest the set LR(2) Hems can be constructed as follows with merged states. ly SH+S,$ S400. Cea, ald Coedaid Ty: goto (IS) SHSe$ Iz: goto (lp, C) SoCeCS C+e2C,$ Coed,$ 13g goto (Ip, a) Coaecaus Coeac, aas Coed, alas Taz goto (Ip, d) Code, as Ig: goto (Ip, C) S3CCe$ gg: goto (15, C) Co aCe aids Now consider state Ip there is a match with the rule [A— ot +a, b] and goto(I, a) = I, C-42C, a/d/$ and if the goto is applied on a’ then we get the state Is. 
Hence we will create entry action[0, a] = shift 36. Similarly, in I0
C → •d, a/d
matches [A → α•aβ, b] with A = C, α = ε, a = d, β = ε, b = a/d, and goto(I0, d) = I47.
Hence action[0, d] = shift 47.
For state I47
C → d•, a/d/$
matches [A → α•, a] with A = C, α = d, a = a/d/$. Hence
action[47, a] = reduce by C → d i.e. rule 3
action[47, d] = reduce by C → d i.e. rule 3
action[47, $] = reduce by C → d i.e. rule 3
S' → S•, $ is in I1. So we will create action[1, $] = accept.
The goto table can be filled by using the goto functions. For instance goto(I0, S) = I1. Hence goto[0, S] = 1. Continuing in this fashion we can fill up the LALR(1) parsing table as follows.

State | Action: a | d | $ | Goto: S | C
0  | s36 | s47 |     | 1 | 2
1  |     |     | ACCEPT |  |
2  | s36 | s47 |     |   | 5
36 | s36 | s47 |     |   | 89
47 | r3  | r3  | r3  |   |
5  |     |     | r1  |   |
89 | r2  | r2  | r2  |   |

Any string belonging to the given grammar can be parsed using the LALR parser. The blank entries are supposed to be syntactical errors.

Parsing the input string using LALR parser
The strings described by the regular expression a*da*d belong to grammar G. We will consider the input string "aadd" for parsing by using the LALR parsing table.

Stack | Input buffer | Action table | Goto table | Parsing action
$0 | aadd$ | action[0, a] = s36 | | Shift
$0a36 | add$ | action[36, a] = s36 | | Shift
$0a36a36 | dd$ | action[36, d] = s47 | | Shift
$0a36a36d47 | d$ | action[47, d] = r3 | [36, C] = 89 | Reduce by C → d
$0a36a36C89 | d$ | action[89, d] = r2 | [36, C] = 89 | Reduce by C → aC
$0a36C89 | d$ | action[89, d] = r2 | [0, C] = 2 | Reduce by C → aC
$0C2 | d$ | action[2, d] = s47 | | Shift
$0C2d47 | $ | action[47, $] = r3 | [2, C] = 5 | Reduce by C → d
$0C2C5 | $ | action[5, $] = r1 | [0, S] = 1 | Reduce by S → CC
$0S1 | $ | action[1, $] = accept | | Accept

Thus the LALR and LR parsers will mimic one another on the same input.

Example : Construct LALR parsing table for the following grammar :
S → Aa
S → bAc
S → dc
S → bda
A → d
Parse the input string bdc using the table generated by you.
Solution : Let us first number the production rules as below.
1) S → Aa
2) S → bAc
3) S → dc
4) S → bda
5) A → d
Now we will construct canonical set of LR(1) items for the above grammar.
I0 : S' → •S, $
     S → •Aa, $
     S → •bAc, $
     S → •dc, $
     S → •bda, $
     A → •d, a
In the above set of items we start from S' → •S, $. After • the S comes, hence we will add the rules deriving S. The second component is $. Now we have got S → •Aa, $ resembling [A → α•Xβ, a]. If we map X to A then the second component of X → •γ is FIRST(βa). Our rule is
A → •d, and its second component is FIRST(βa) = FIRST(a$) = a. Hence A → •d, a will be added in I0.
I1 : goto (I0, S)
     S' → S•, $
We will carry the second component as it is.
I2 : goto (I0, A)
     S → A•a, $
I3 : goto (I0, b)
     S → b•Ac, $
     S → b•da, $
     A → •d, c
For S → b•Ac, $ we have FIRST(βa) = FIRST(c$) = c. Hence the second component of A → •d is c.
I4 : goto (I0, d)
     S → d•c, $
     A → d•, a
I5 : goto (I2, a)
     S → Aa•, $
I6 : goto (I3, A)
     S → bA•c, $
I7 : goto (I3, d)
     S → bd•a, $
     A → d•, c
I8 : goto (I4, c)
     S → dc•, $
I9 : goto (I6, c)
     S → bAc•, $
I10 : goto (I7, a)
     S → bda•, $
In the above set of canonical items no two states have the same production rules with dots. Hence we cannot merge any states. The same set of items will be considered for the LALR parsing table.
We construct the LALR parsing table using the following rules.
1) If [A → α•aβ, b] is in Ii and goto(Ii, a) = Ij then action[i, a] = shift j.
2) If there is a production [A → α•, a] in some state Ii then action[i, a] = reduce by A → α.
3) If there is a production [S' → S•, $] in Ii then action[i, $] = accept.

State | Action: a | b | c | d | $ | Goto: S | A
0  |     | s3 |    | s4 |    | 1 | 2
1  |     |    |    |    | ACCEPT |  |
2  | s5  |    |    |    |    |   |
3  |     |    |    | s7 |    |   | 6
4  | r5  |    | s8 |    |    |   |
5  |     |    |    |    | r1 |   |
6  |     |    | s9 |    |    |   |
7  | s10 |    | r5 |    |    |   |
8  |     |    |    |    | r3 |   |
9  |     |    |    |    | r2 |   |
10 |     |    |    |    | r4 |   |

Consider the input "bdc" for parsing with the help of the above LALR parsing table.

Stack | Input buffer | Action
$0 | bdc$ | Shift 3
$0b3 | dc$ | Shift 7
$0b3d7 | c$ | Reduce by A → d
$0b3A6 | c$ | Shift 9
$0b3A6c9 | $ | Reduce by S → bAc
$0S1 | $ | Accept

Thus the input string gets parsed completely.

Example : Show that the following grammar
S → Aa | bAc | Bc | bBa
A → d
B → d
is LR(1) but not LALR(1).
Solution : We will number the production rules in the given grammar.
1) S → Aa
2) S → bAc
3) S → Bc
4) S → bBa
5) A → d
6) B → d
Now we will first construct the canonical set of LR(1) items. For the augmented grammar the second component is $. The item [S' → •S, $] matches [A → α•Xβ, a] where α is ε, X is S, β is ε and a is $. Hence we add all the production rules deriving S. Their second component will be $, because FIRST(βa) = FIRST($) = $. Hence we will get
S' → •S, $
S → •Aa, $
S → •bAc, $
S → •Bc, $
S → •bBa, $
Now we have got one rule [S → •Aa, $] which is matching with [A → α•Xβ, a] and [X → γ].
Hence we will add the closure on A. Hence [A → •d] will be added in the list. Now the second component of A → •d will be decided. [S → •Aa, $] matches [A → α•Xβ, a] having X as A, β as a and a as $. Then
FIRST(βa) = FIRST(a$) = FIRST(a) = a.
Hence the rule [A → •d, a] will be added.
Now consider S → •Bc, $. It suggests applying closure on B (as B comes immediately after the dot). This rule matches [A → α•Xβ, a] and [X → γ]. Hence we will add the rule deriving B, i.e. B → •d will be added in the above list. S → •Bc, $ can be mapped with [A → α•Xβ, a] as
α = ε, X = B, β = c, a = $
Since FIRST(βa) = FIRST(c$) = FIRST(c) = c, the rule [B → •d, c] will be added. Hence finally I0 will be -
I0 : S' → •S, $
     S → •Aa, $
     S → •bAc, $
     S → •Bc, $
     S → •bBa, $
     A → •d, a
     B → •d, c
Continue with applying goto on each symbol.
I1 : goto (I0, S)
     S' → S•, $
There is no chance of applying closure or goto in this state. Hence it will have only one rule.
I2 : goto (I0, A)
     S → A•a, $
The goto is applied on A. As a terminal symbol comes after the dot we cannot apply any rule further. The second component is carried as it is.
I3 : goto (I0, b)
     S → b•Ac, $
     S → b•Ba, $
     A → •d, c
     B → •d, a
After the dot A comes, hence the rule A → •d will be added. As [S → b•Ac, $] matches [A → α•Xβ, a] and [X → γ], FIRST(βa) = FIRST(c$) = FIRST(c) = c. The second component of A → •d is c. In the same way [S → b•Ba, $] gives B → •d with second component FIRST(a$) = a.
I4 : goto (I0, B)
     S → B•c, $
The goto on B is applied and the second component is carried as it is.
I5 : goto (I0, d)
     A → d•, a
     B → d•, c
The goto on d in state I0 is applied with the corresponding second components.
I6 : goto (I2, a)
     S → Aa•, $
I7 : goto (I3, A)
     S → bA•c, $
I8 : goto (I3, B)
     S → bB•a, $
I9 : goto (I3, d)
     A → d•, c
     B → d•, a
I10 : goto (I4, c)
     S → Bc•, $
I11 : goto (I7, c)
     S → bAc•, $
I12 : goto (I8, a)
     S → bBa•, $
Now using the above set of LR(1) items we will construct the LR(1) parsing table as follows.

State | Action: a | b | c | d | $ | Goto: S | A | B
0  |     | s3 |     | s5 |    | 1 | 2 | 4
1  |     |    |     |    | ACCEPT |  |  |
2  | s6  |    |     |    |    |   |   |
3  |     |    |     | s9 |    |   | 7 | 8
4  |     |    | s10 |    |    |   |   |
5  | r5  |    | r6  |    |    |   |   |
6  |     |    |     |    | r1 |   |   |
7  |     |    | s11 |    |    |   |   |
8  | s12 |    |     |    |    |   |   |
9  | r6  |    | r5  |    |    |   |   |
10 |     |    |     |    | r3 |   |   |
11 |     |    |     |    | r2 |   |   |
12 |     |    |     |    | r4 |   |   |
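The closure and goto steps used for this grammar can also be tried out mechanically. The following is only a minimal Python sketch, assuming the grammar has no ε-productions (so FIRST can be computed very simply); the names GRAMMAR, first_of, closure, goto, canonical_collection, merge_same_cores and reduce_conflicts are chosen for this illustration and are not taken from any parser generator. An item is stored as (production index, dot position, lookahead), and only reduce/reduce conflicts are checked.

```python
from collections import defaultdict

# Augmented grammar of the example; index 0 is S' -> S.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("A", "a")),
    ("S", ("b", "A", "c")),
    ("S", ("B", "c")),
    ("S", ("b", "B", "a")),
    ("A", ("d",)),
    ("B", ("d",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of(symbols):
    """FIRST of a non-empty symbol string (assumes no epsilon-productions)."""
    head = symbols[0]
    if head not in NONTERMINALS:
        return {head}
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == head and rhs[0] != head:
            result |= first_of(rhs)
    return result


def closure(items):
    """Add [X -> .gamma, b] for every [A -> alpha.X beta, a], b in FIRST(beta a)."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        lookaheads = first_of(rhs[dot + 1:] + (look,))
        for index, (lhs, _) in enumerate(GRAMMAR):
            if lhs != rhs[dot]:
                continue
            for b in lookaheads:
                if (index, 0, b) not in items:
                    items.add((index, 0, b))
                    work.append((index, 0, b))
    return frozenset(items)


def goto(state, symbol):
    """Move the dot over `symbol` and close the result."""
    moved = {(p, d + 1, a) for p, d, a in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    start = closure({(0, 0, "$")})
    states, work = [start], [start]
    while work:
        state = work.pop()
        symbols = {GRAMMAR[p][1][d] for p, d, _ in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
                work.append(target)
    return states


def merge_same_cores(states):
    """LALR step: union the states whose items agree when lookaheads are ignored."""
    merged = defaultdict(set)
    for state in states:
        core = frozenset((p, d) for p, d, _ in state)
        merged[core] |= state
    return [frozenset(items) for items in merged.values()]


def reduce_conflicts(states):
    """List (lookahead, productions) pairs where two reductions compete."""
    found = []
    for state in states:
        by_lookahead = defaultdict(set)
        for p, d, a in state:
            if d == len(GRAMMAR[p][1]):
                by_lookahead[a].add(p)
        found += [(a, prods) for a, prods in by_lookahead.items() if len(prods) > 1]
    return found


lr1_states = canonical_collection()
lalr_states = merge_same_cores(lr1_states)
print(len(lr1_states), "LR(1) states, conflicts:", reduce_conflicts(lr1_states))
print(len(lalr_states), "merged states, conflicts:", reduce_conflicts(lalr_states))
```

For this grammar the sketch should find 13 LR(1) states without any reduce/reduce conflict, and 12 states after merging, with A → d and B → d competing on the lookaheads a and c in the merged state.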
Now we can parse the string "bda" using the above constructed LR(1) parsing table as :

Stack | Input buffer | Action
$0 | bda$ | Shift 3
$0b3 | da$ | Shift 9
$0b3d9 | a$ | Reduce by B → d
$0b3B8 | a$ | Shift 12
$0b3B8a12 | $ | Reduce by S → bBa
$0S1 | $ | Accept

Now we will construct a set of LALR(1) items. In this construction we will simply merge the states deriving the same production rules which differ in their second components only. In the above set of LR(1) items states I5 and I9 are such states, i.e.
I5 : goto (I0, d)
     A → d•, a
     B → d•, c
I9 : goto (I3, d)
     A → d•, c
     B → d•, a
We will form only one state by merging states 5 and 9 as
I59 : goto (I0, d)
     A → d•, a/c
     B → d•, a/c
Hence the LALR(1) set of items is given as :
I0 : S' → •S, $
     S → •Aa, $
     S → •bAc, $
     S → •Bc, $
     S → •bBa, $
     A → •d, a
     B → •d, c
I1 : goto (I0, S)
     S' → S•, $
I2 : goto (I0, A)
     S → A•a, $
I3 : goto (I0, b)
     S → b•Ac, $
     S → b•Ba, $
     A → •d, c
     B → •d, a
I4 : goto (I0, B)
     S → B•c, $
I59 : goto (I0, d)
     A → d•, a/c
     B → d•, a/c
I6 : goto (I2, a)
     S → Aa•, $
I7 : goto (I3, A)
     S → bA•c, $
I8 : goto (I3, B)
     S → bB•a, $
I10 : goto (I4, c)
     S → Bc•, $
I11 : goto (I7, c)
     S → bAc•, $
I12 : goto (I8, a)
     S → bBa•, $
The LALR parsing table will be -

State | Action: a | b | c | d | $ | Goto: S | A | B
0  |       | s3 |       | s59 |    | 1 | 2 | 4
1  |       |    |       |     | ACCEPT |  |  |
2  | s6    |    |       |     |    |   |   |
3  |       |    |       | s59 |    |   | 7 | 8
4  |       |    | s10   |     |    |   |   |
59 | r5/r6 |    | r5/r6 |     |    |   |   |
6  |       |    |       |     | r1 |   |   |
7  |       |    | s11   |     |    |   |   |
8  | s12   |    |       |     |    |   |   |
10 |       |    |       |     | r3 |   |   |
11 |       |    |       |     | r2 |   |   |
12 |       |    |       |     | r4 |   |   |

The parsing table shows multiple entries in Action[59, a] and Action[59, c]. This is called a reduce/reduce conflict. Because of this conflict we cannot parse the input. Thus it is shown that the given grammar is LR(1) but not LALR(1).

5.3 Comparison of LR Parsers
It is time to compare the SLR, LALR and LR parsers on common factors such as size, class of CFG, efficiency and cost in terms of time and space.

Sr. No. | SLR parser | LALR parser | Canonical LR parser
1. | SLR parser is smallest in size. | The LALR and SLR have the same size. | LR parser or canonical LR parser is largest in size.
2. | It is the easiest method, based on the FOLLOW function. | This method is applicable to a wider class than SLR. | This method is more powerful than SLR and LALR.
3. | This method exposes less syntactic features than that of LR parsers. | Most of the syntactic features of a language are expressed in LALR. | This method expresses more syntactic features than SLR and LALR.
4. | Error detection is not immediate in SLR. | Error detection is not immediate in LALR. | Immediate error detection is done by LR parser.
5. | It requires less time and space complexity. | The time and space complexity is more in LALR but efficient methods exist for constructing LALR parsers. | The time and space complexity is more for canonical LR parser.

Graphical representation for the class of LR family is as given below.
Fig. 5.3.1 Classification of grammars (nested classes : SLR inside LALR inside canonical LR)

5.4 Dangling Else Ambiguity
If an ambiguous grammar is used with these parsing methods then conflicts occur and thereby we cannot parse the input string.
- If the two entries that appear in the parsing table M[A, a] are both for reduce actions then a reduce-reduce conflict occurs.
- If one entry is for a shift action and another for a reduce action in M[A, a] then a shift-reduce conflict occurs.

Using dangling else ambiguity
Consider the grammar
Statement → if expression then Statement else Statement
Statement → if expression then Statement
Statement → all other productions
With i = if expression then, e = else, S = Statement and a = all other productions, we will have this grammar in this manner -
S' → S
S → iSeS | iS | a
Now we will build the LR(0) set of items as
I0 : S' → •S
     S → •iSeS
     S → •iS
     S → •a
I1 : goto (I0, S)
     S' → S•
I2 : goto (I0, i)
     S → i•SeS
     S → i•S
     S → •iSeS
     S → •iS
     S → •a
I3 : goto (I0, a)
     S → a•
I4 : goto (I2, S)
     S → iS•eS
     S → iS•
I5 : goto (I4, e)
     S → iSe•S
     S → •iSeS
     S → •iS
     S → •a
I6 : goto (I5, S)
     S → iSeS•
FOLLOW(S) = {e, $}. Now we will build the SLR parse table for the above obtained set of items, with the rules numbered 1) S → iSeS 2) S → iS 3) S → a.

State | Action: i | e | a | $ | Goto: S
0 | s2 |       | s3 |    | 1
1 |    |       |    | ACCEPT |
2 | s2 |       | s3 |    | 4
3 |    | r3    |    | r3 |
4 |    | s5/r2 |    | r2 |
5 | s2 |       | s3 |    | 6
6 |    | r1    |    | r1 |

As we can see in the above table, at action[4, e] there is a shift/reduce conflict. We will now try to resolve it.
Consider the input "iiaea$" for processing.

Stack | Input | Action with conflict resolution
$0 | iiaea$ | Shift
$0i2 | iaea$ | Shift
$0i2i2 | aea$ | Shift
$0i2i2a3 | ea$ | Reduce S → a
$0i2i2S4 | ea$ | From the conflict we have chosen reduce S → iS
$0i2S4 | ea$ | Reduce S → iS

After this reduction the stack holds $0S1 while ea$ is still unread, and action[1, e] is an error entry. That means the choice of r2 in action[4, e] is not valid. Hence we will try again by choosing the shift action.

Stack | Input | Action with conflict resolution
$0 | iiaea$ | Shift
$0i2 | iaea$ | Shift
$0i2i2 | aea$ | Shift
$0i2i2a3 | ea$ | Reduce S → a
$0i2i2S4 | ea$ | From the conflict we have chosen shift
$0i2i2S4e5 | a$ | Shift
$0i2i2S4e5a3 | $ | Reduce S → a
$0i2i2S4e5S6 | $ | Reduce S → iSeS
$0i2S4 | $ | Reduce S → iS
$0S1 | $ | Accept

Logically also we should favour the shift operation, as by shifting the else we associate it with the previous "if expression then statement". Therefore the shift/reduce conflict is resolved in favour of shift.
Thus by resolving the conflict we get the parsing table for the dangling else problem as -

State | Action: i | e | a | $ | Goto: S
0 | s2 |    | s3 |    | 1
1 |    |    |    | ACCEPT |
2 | s2 |    | s3 |    | 4
3 |    | r3 |    | r3 |
4 |    | s5 |    | r2 |
5 | s2 |    | s3 |    | 6
6 |    | r1 |    | r1 |

5.5 Error Recovery in LR Parsing
The LR parser is a table driven parsing method in which the blank entries are treated as errors. When we compile any program we first get the syntactical errors. These errors are usually denoted by user friendly error messages. To understand how to decode these error messages consider one example.
E → E + E
E → E * E
E → (E)
E → id
The parsing table for the above grammar (with the conflicts of this ambiguous grammar resolved by giving * higher precedence than + and making both operators left associative) will be -

State | Action: id | + | * | ( | ) | $ | Goto: E
0 | s3 |    |    | s2 |    |    | 1
1 |    | s4 | s5 |    |    | ACCEPT |
2 | s3 |    |    | s2 |    |    | 6
3 |    | r4 | r4 |    | r4 | r4 |
4 | s3 |    |    | s2 |    |    | 7
5 | s3 |    |    | s2 |    |    | 8
6 |    | s4 | s5 |    | s9 |    |
7 |    | r1 | s5 |    | r1 | r1 |
8 |    | r2 | r2 |    | r2 | r2 |
9 |    | r3 | r3 |    | r3 | r3 |

During error detection and recovery we replace the error entries of the states that only reduce by the particular reduction rules. This postpones error detection until one or more reductions are done, but the error will still be caught before any shift move takes place. This is how we recover from errors.
Consider the set of items generated for obtaining the error messages.
I0 : E' → •E
It means that there is no symbol before the dot, and this being the initial state the stack is empty. In such a case if + or * or $ comes in the input string then the operand is missing for these operators. In other words, first some operand should appear and then this operator should appear.
Stack | Input | Error could be
$ | + | Missing operand
$ | * | Missing operand
$ | $ | Missing operand
$ | ) | Unbalanced right parenthesis

I1 : goto (I0, E)
     E' → E•
If we ultimately reduce E → id then
id id = missing operator,
id ( = missing operator,
id ) = unbalanced right parenthesis.

Stack | Input | Error could be
id | id | Missing operator
id | ( | Missing operator
id | ) | Unbalanced right parenthesis

I6 : goto (I2, E)
     E → (E•)
If we eventually reduce E → id then the rule becomes E → (id•). That means after id we expect ), and if $ comes then the error will be missing right parenthesis. If after id again id or ( comes then it will be missing operator.

Stack | Input | Error could be
(id | id | Missing operator
(id | ( | Missing operator
(id | $ | Missing right parenthesis

From all these situations we conclude that the error messages are :
- E1 : These errors are in states I0, I2, I4 and I5. This indicates that an operand should appear before the operator. Hence the error message will be "missing operand".
- E2 : This error is in the ) column and comes from states I0, I1, I2, I4 and I5, indicating an unbalanced right parenthesis. Hence the error message will be "unbalanced right parenthesis".
- E3 : An operator is expected in this case of error, as it comes from state I1 or I6. The error message will be "missing operator".
- E4 : This error occurs at state 6 in the $ column. State 6 expects a ) parenthesis at the end of the expression. Hence the error message will be "missing right parenthesis".
Thus the modified table with appropriate error messages is as shown below.
State | Action: id | + | * | ( | ) | $ | Goto: E
0 | s3 | E1 | E1 | s2 | E2 | E1 | 1
1 | E3 | s4 | s5 | E3 | E2 | ACCEPT |
2 | s3 | E1 | E1 | s2 | E2 | E1 | 6
3 | r4 | r4 | r4 | r4 | r4 | r4 |
4 | s3 | E1 | E1 | s2 | E2 | E1 | 7
5 | s3 | E1 | E1 | s2 | E2 | E1 | 8
6 | E3 | s4 | s5 | E3 | s9 | E4 |
7 | r1 | r1 | s5 | r1 | r1 | r1 |
8 | r2 | r2 | r2 | r2 | r2 | r2 |
9 | r3 | r3 | r3 | r3 | r3 | r3 |

For the given input the LR parser will refer to the parsing table; if it detects an error entry then the respective error message will be reported. This is how we get error messages when we compile our program.

5.6 Automatic Parser Generator
We have discussed the manual method of construction of an LR parser. This involves a lot of work for parsing the input string. Hence there is a need for automation of this process in order to achieve efficiency in parsing the input. Certain automation tools for parser generation are available. YACC is one such automatic tool for generating the parser program. YACC stands for Yet Another Compiler Compiler, which is basically a utility available on UNIX. Basically YACC is an LALR parser generator. YACC can report conflicts or ambiguities (if at all) in the form of error messages. In an earlier chapter we have seen one such tool, LEX, for the lexical analyzer. LEX and YACC work together to analyse the program syntactically.
The typical YACC translator can be represented as shown in Fig. 5.6.1.
Fig. 5.6.1 YACC : Parser generator model (specification file x.y → YACC → y.tab.c ; y.tab.c → CC, the C compiler → a.out ; input string → executable program a.out → output)
First we write a YACC specification file; let us name it x.y. This file is given to the YACC compiler by the UNIX command -
yacc x.y
Then it will generate a parser program using your YACC specification file. This parser program has the standard name y.tab.c. This is basically a parser program in C generated automatically. You can also give the command with the -d option as
yacc -d x.y
By the -d option two files will get generated : one is y.tab.c and the other is y.tab.h. The y.tab.h file will store all the tokens, so you need not create y.tab.h explicitly. The generated y.tab.c program will then be compiled by the C compiler, which generates the executable a.out file. Then you can test your YACC program with the help of some valid and invalid strings.
Writing the YACC specification program is the most logical activity. This specification file contains the context free grammar, and using the production rules of the context free grammar the parsing of the input string can be done by y.tab.c. Let us learn how to write a YACC program.

YACC Specification
First of all we will see the structure of a YACC specification. The YACC specification file consists of three parts : declaration section, translation rule section and supporting C functions.
Fig. 5.6.2 Parts of YACC specification (Declaration section : ordinary C declarations ; Translation rule section : context free grammar ; Supporting C functions)
The specification file with these sections can be written as
%{
/* declaration section */
%}
%%
/* Translation rule section */
%%
/* Required C functions */
1. Declaration part : In this section ordinary C declarations can be put. Not only this, we can also declare grammar tokens in this section. The C declarations should be within %{ and %}; the token declarations (%token) are written after the closing %}.
2. Translation rule section : It consists of the production rules of the context free grammar with the corresponding actions. For instance
rule 1    action 1
rule 2    action 2
...
rule n    action n
If there is more than one alternative to a single rule then those alternatives should be separated by the | character. The actions are typical C statements. If the CFG is
LHS → alternative 1 | alternative 2 | ... | alternative n
then
LHS : alternative 1 {action 1}
    | alternative 2 {action 2}
    ...
    | alternative n {action n}
    ;
3. C functions section : This section consists of one main function in which the routine yyparse() will be called. It also consists of the required C functions.

Programming Example : Write a YACC program for implementing a desktop calculator.
Program : We will first create the LEX program named calci.l, then we will write the program for YACC as calci.y.
The extension of a LEX program is .l and that of a YACC program is .y

/*Program name : calci.l*/
%{
#include <stdlib.h>   /*for atof*/
#include "y.tab.h"    /*defines the tokens*/
%}
%%
    /*To recognize a valid number*/
([0-9]+|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)   {yylval.dval = atof(yytext); return NUMBER;}
    /*For log no | LOG no (log base 10)*/
log |
LOG   {return LOG;}
    /*For ln no (Natural log)*/
ln    {return nLOG;}
    /*For sin angle*/
sin |
SIN   {return SINE;}
    /*For cos angle*/
cos |
COS   {return COS;}
    /*For tan angle*/
tan |
TAN   {return TAN;}
    /*For memory*/
mem   {return MEM;}
[ \t] ;   /*Ignore white spaces*/
    /*End of input*/
\$    {return 0;}
    /*Catch the remaining and return a single character token to the parser*/
\n|.  return yytext[0];
%%

/*Program Name : calci.y*/
%{
#include <stdio.h>
#include <math.h>
int yylex(void);
int yyerror(char *error);
double memvar;
%}
/*To define possible symbol types*/
%union
{
    double dval;
}
/*Tokens used which are returned by lexer*/
%token <dval> NUMBER
%token MEM
%token LOG SINE nLOG COS TAN
/*Defining the precedence and associativity*/
%left '-' '+'    /*Lowest precedence*/
%left '*' '/'
%right '^'
%left LOG SINE nLOG COS TAN    /*Highest precedence*/
/*No associativity*/
%nonassoc UMINUS    /*Unary Minus*/
/*Sets the type for non-terminal*/
%type <dval> expression
%%
/*Start state*/
start : statement '\n'
      | start statement '\n'
      ;
/*For storing the answer (memory)*/
statement : MEM '=' expression   {memvar = $3;}
      | expression               {printf("Answer = %g\n", $1);}
      ;    /*For printing the answer*/
/*For binary arithmetic operators*/
expression : expression '+' expression   {$$ = $1 + $3;}
      | expression '-' expression        {$$ = $1 - $3;}
      | expression '*' expression        {$$ = $1 * $3;}
      | expression '/' expression
        {   /*To handle divide by zero case*/
            if ($3 == 0)
                yyerror("divide by zero");
            else
                $$ = $1 / $3;
        }
      | expression '^' expression        {$$ = pow($1, $3);}
      ;
/*For unary operators*/
expression : '-' expression %prec UMINUS   {$$ = -$2;}
      /*%prec UMINUS signifies that unary minus should have the highest precedence*/
      | '(' expression ')'   {$$ = $2;}
      | LOG expression       {$$ = log($2) / log(10);}
      | nLOG expression      {$$ = log($2);}
      /*Trigonometric functions*/
      | SINE expression      {$$ = sin($2 * 3.141592654 / 180);}
      | COS expression       {$$ = cos($2 * 3.141592654 / 180);}
      | TAN expression       {$$ = tan($2 * 3.141592654 / 180);}
      | NUMBER               {$$ = $1;}
      | MEM                  {$$ = memvar;}    /*Retrieving the memory contents*/
      ;
%%
int main(void)
{
    printf("Enter the expression: ");
    yyparse();
    return 0;
}
int yyerror(char *error)
{
    fprintf(stderr, "%s\n", error);
    return 0;
}

Compiling and running of LEX and YACC programs
The output of the program can be obtained by the following commands.
[root@localhost]# lex calci.l
[root@localhost]# yacc -d calci.y
[root@localhost]# cc y.tab.c lex.yy.c -ll -ly -lm
[root@localhost]# ./a.out
Enter the expression: 2+2
Answer = 4
mem = cos 45
sin 45 / mem
Answer = 1
ln 10
Answer = 2.30259
Note that if we use the -d option then y.tab.h gets created automatically and we need not create it explicitly.

How to run LEX and YACC programs together ?
lex calci.l - will create lex.yy.c
yacc -d calci.y - will create y.tab.c and y.tab.h
cc y.tab.c lex.yy.c -ll -ly -lm - will compile both lex.yy.c and y.tab.c by linking the lex, yacc and math library files
./a.out - will run the executable file of the program

Declaration part
%token is used to declare tokens in YACC.
In the above program the token NUMBER can be declared as
%token NUMBER
The precedence and associativity can also be declared in the above program. %left, %right and %nonassoc declare the operator as being left associative, right associative or non-associative respectively. The precedence is declared in increasing order.
The yylval used is a union with a member of type double. YACC can associate a data type with a token by using this yylval.
%token <dval> NUMBER
It means the token NUMBER has the data type double. Any number of data types can be declared in the union.
%left '-' '+'    /*Lowest precedence*/
%left '*' '/'
%right '^'
%left LOG SINE nLOG COS TAN    /*Highest precedence*/
/*No associativity*/
%nonassoc UMINUS    /*Unary Minus*/
The operators on the same line have equal precedence. For instance '+' and '-' have the same precedence.
A shift/reduce conflict can be resolved by YACC with the help of these precedence rules. If the input token has higher precedence than the production rule A → α then the shift action will be performed. If the precedence of the production rule A → α is greater than that of the input token then reduce by A → α will be performed. If both have the same precedence then YACC checks the associativity : for left associativity the reduce action will be performed and for right associativity the shift action will be performed.

Rule section
In the rule section : (colon) is used to separate the LHS and RHS of a production rule. The termination of each rule is given by ; (semicolon). $$ is used as the attribute value of the LHS of the rule. To represent the symbols on the RHS, $1, $2, ..., $n are used. If there is a rule E → E + E then E = $1, + is reserved for $2 and E = $3, and the attribute value is $1 + $3. Finally the answer is obtained as $$ = $1 + $3.

Subroutine section
In main the important function yyparse() is called. The program first invokes yyparse(), which in turn calls yylex() whenever it requires tokens. The routine yyerror() is used to print the error message when an error occurs in parsing of the input.
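The way precedence and associativity settle a shift/reduce conflict can be written down in a few lines. The following is only a small Python sketch of the decision rule, not YACC's actual code; the table DECLARATIONS mirrors part of the calculator's %left/%right/%nonassoc lines, and the names PRECEDENCE and resolve are hypothetical, chosen for this illustration.

```python
# Precedence levels in increasing order, as in the declaration section
# (a later line binds more tightly than an earlier one).
DECLARATIONS = [
    ("left", ["+", "-"]),
    ("left", ["*", "/"]),
    ("right", ["^"]),
    ("nonassoc", ["UMINUS"]),
]

PRECEDENCE = {}
for level, (assoc, tokens) in enumerate(DECLARATIONS, start=1):
    for token in tokens:
        PRECEDENCE[token] = (level, assoc)


def resolve(rule_token, lookahead):
    """Decide a shift/reduce conflict.

    rule_token : token that gives the production its precedence
                 (its last terminal, or the %prec token)
    lookahead  : the incoming input token
    """
    rule_level, _ = PRECEDENCE[rule_token]
    look_level, assoc = PRECEDENCE[lookahead]
    if look_level > rule_level:
        return "shift"      # the input token binds more tightly
    if look_level < rule_level:
        return "reduce"     # the production binds more tightly
    # Same level: associativity decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[assoc]


# With expression '+' expression on the stack and '*' as next token:
print(resolve("+", "*"))
# With expression '^' expression on the stack and '^' as next token:
print(resolve("^", "^"))
```

So for 2 + 3 * 4 the parser shifts * before reducing the addition, while for 2 - 3 - 4 it reduces the first subtraction, which gives the left-to-right grouping.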
Thus we have seen how an input string is parsed and syntactically checked.

Q.1 Construct LALR parsing table for the following grammar :
S → L = R
S → R
L → *R
L → id
R → L
Ans. : In the LALR method we have to develop the collection of items using LR(1) items. The LR(1) items are as follows.
I0 : S' → •S, $
     S → •L = R, $
     S → •R, $
     L → •*R, = | $
     L → •id, = | $
     R → •L, $
I1 : goto (I0, S)
     S' → S•, $
I2 : goto (I0, L)
     S → L• = R, $
     R → L•, $
I3 : goto (I0, R)
     S → R•, $
I4 : goto (I0, *)
     L → *•R, = | $
     R → •L, = | $
     L → •*R, = | $
     L → •id, = | $
I5 : goto (I0, id)
     L → id•, = | $
I6 : goto (I2, =)
     S → L = •R, $
     R → •L, $
     L → •*R, $
     L → •id, $
I7 : goto (I4, R)
     L → *R•, = | $
I8 : goto (I4, L)
     R → L•, = | $
I9 : goto (I6, R)
     S → L = R•, $
I10 : goto (I6, L)
     R → L•, $
I11 : goto (I6, *)
     L → *•R, $
     R → •L, $
     L → •*R, $
     L → •id, $
I12 : goto (I6, id)
     L → id•, $
I13 : goto (I11, R)
     L → *R•, $
From the above set of items we have got -
i) I4 and I11 give the same productions but the lookaheads are different. Hence we merge them to form I411.
ii) I5 = I12. Hence I512.
iii) I7 = I13. Hence I713.
iv) I8 = I10. Hence I810.
Therefore the set of items for LALR is
I0 : S' → •S, $
     S → •L = R, $
     S → •R, $
     L → •*R, = | $
     L → •id, = | $
     R → •L, $
I1 : S' → S•, $
I2 : S → L• = R, $
     R → L•, $
I3 : S → R•, $
I411 : L → *•R, = | $
     R → •L, = | $
     L → •*R, = | $
     L → •id, = | $
I512 : L → id•, = | $
I6 : S → L = •R, $
     R → •L, $
     L → •*R, $
     L → •id, $
I713 : L → *R•, = | $
I810 : R → L•, = | $
I9 : S → L = R•, $
The LALR parsing table can be constructed as follows -

State | Action: id | * | = | $ | Goto: S | L | R
0   | s512 | s411 |    |    | 1 | 2 | 3
1   |      |      |    | ACCEPT |  |  |
2   |      |      | s6 | reduce R → L |  |  |
3   |      |      |    | reduce S → R |  |  |
411 | s512 | s411 |    |    |   | 810 | 713
512 |      |      | reduce L → id | reduce L → id |  |  |
6   | s512 | s411 |    |    |   | 810 | 9
713 |      |      | reduce L → *R | reduce L → *R |  |  |
810 |      |      | reduce R → L | reduce R → L |  |  |
9   |      |      |    | reduce S → L = R |  |  |

Q.2 Construct LALR parsing table for the following grammar
S → CC, C → cC, C → d
Ans. : Refer example 5.2.2 assuming a = c.
Q.3 What are the common conflicts that can be encountered in shift-reduce parser ?
Ans. : The common conflicts that occur in shift-reduce parser are i) Shift-reduce ii) Reduce-reduce.
Example : Refer section 5.4.
Q.4 Explain canonical LR parsing.
Ans. : Refer section 5.1.

6 Semantic Analysis
Syllabus
Semantic analysis, SDT, Evaluation of semantic rules
Contents
6.1 Introduction
6.2
Syntax Directed Translation (SDT) ......... January-10, May-08, 09, Set-3, Aug./Sept.-08, Set-1, 3, Marks 16
6.3 Bottom-Up Evaluation of S-Attributed Definitions ......... Dec.-05, Set-2, 3, May-05, 09, Set-4, Marks 8
6.4 L-Attributed Definitions ......... January-10, Set-2, Aug./Sept.-06, 08, 07, Set-1, May-06, 07, Set-1, 3, 4, Marks 16
6.5 Bottom-Up Evaluation of Inherited Attributes
6.6 Recursive Evaluation

6.1 Introduction
As we know how an input string is syntactically checked by the syntax analyzer, we should now know how to analyze the input semantically. In semantic analysis we analyze the meaning of the program. Beyond syntax analysis, the programming constructs are analyzed. Hence extra-syntactic rules are imposed in this phase. The semantic analysis is carried out by scanning the input text and it is static in nature. The properties that cannot be captured by a context free grammar are captured by semantic rules. These properties could be name and scope analysis, type checking and type conversion.

Need of Semantic Analysis
The semantic analysis is done in order to obtain the precise meaning of a programming construct. This phase is also known as static semantic analysis. The need for semantic analysis is, firstly, to build a symbol table that keeps track of the names established in declarations. Secondly, the data type of each identifier is obtained and type checking of whole expressions and statements is done in order to follow the type rules of the language. Thirdly, it identifies the scope of identifiers.

6.2 Syntax Directed Translation (SDT)
While doing the static analysis of the language we use syntax-directed definitions. That means an augmented context free grammar is generated. In other words, a set of attributes is associated with each terminal and non-terminal symbol. The attribute can be a string, a number, a type, a memory location or anything else. The syntax-directed definition is a kind of abstract specification.
The conceptual view of syntax-directed translation can be as shown in Fig. 6.2.1.
Fig. 6.2.1 Syntax-directed translation (Input string → Parse tree → Dependency graph → Evaluation order for semantic rules)
Firstly we parse the input token stream and a syntax tree is generated. Then the tree is traversed for evaluating the semantic rules at the parse tree nodes.
The implementation need not follow all the steps given in Fig. 6.2.1. In a single pass implementation the semantic rules can be evaluated during parsing without explicitly constructing a parse tree or dependency graph. In such semantic evaluation, at the nodes of the syntax tree, the values of the attributes are defined for the given input string. Such a parse tree containing the values of the attributes at each node is called an annotated or decorated parse tree.

Definition : A syntax-directed definition is a generalization of a context free grammar in which each grammar production X → α has associated with it a set of semantic rules of the form a := f(b1, b2, ..., bk), where a is an attribute obtained from the function f.
Attribute : The attribute can be a string, a number, a type, a memory location or anything else.
Consider X → α to be a production of a context free grammar and a := f(b1, b2, ..., bk) where a is the attribute. Then there are two types of attributes :
1. Synthesized attribute : The attribute 'a' is called a synthesized attribute of X, and b1, b2, ..., bk are attributes belonging to the production symbols.
The value of a synthesized attribute at a node is computed from the values of the attributes at the children of that node in the parse tree.
2. Inherited attribute : The attribute 'a' is called an inherited attribute of one of the grammar symbols on the right side of the production (i.e. α), and b1, b2, ..., bk belong to either X or α.
The inherited attributes can be computed from the values of the attributes at the siblings and parent of that node.

1. Synthesized attribute
Let us see how to compute synthesized attributes.
Consider the context free grammar as
S → EN
E → E + T
E → E - T
E → T
T → T * F
T → T / F
T → F
F → (E)
F → digit
N → ;
The syntax-directed definition can be written for the above grammar by using semantic actions for each production.

Production rule | Semantic actions
S → EN | Print(E.val)
E → E1 + T | E.val := E1.val + T.val
E → E1 - T | E.val := E1.val - T.val
E → T | E.val := T.val
T → T1 * F | T.val := T1.val × F.val
T → T1 / F | T.val := T1.val / F.val
T → F | T.val := F.val
F → (E) | F.val := E.val
F → digit | F.val := digit.lexval
N → ; | Can be ignored by lexical analyzer as ; is the terminating symbol.

For the non-terminals E, T and F the values can be obtained using the attribute "val". Here "val" is an attribute and the semantic rule computes the value of val. (How this is done will be discussed shortly.)
The token digit has the synthesized attribute lexval whose value is obtained from the lexical analyzer. In the rule S → EN, symbol S is the start symbol. This rule is used to print the final answer of the expression.
- In a syntax-directed definition, terminals have synthesized attributes only. Thus there is no semantic definition for a terminal. The synthesized attributes are quite often used in syntax-directed definitions. A syntax-directed definition that uses only synthesized attributes is called an S-attributed definition.
- In a parse tree, at each node the semantic rule is evaluated for annotating (computing) the S-attributed definition. This processing is done in bottom-up fashion, i.e. from leaves to root.
Following steps are followed to compute an S-attributed definition.
1. Write the syntax-directed definition using the appropriate semantic actions for the corresponding production rules of the given grammar.
2. The annotated parse tree is generated and the attribute values are computed. The computation is done in bottom-up manner.
3. The value obtained at the root node is supposed to be the final output.
Let us take an input string for computing the S-attributed definition for the above given grammar.
Example : Construct the parse tree, syntax tree and annotated parse tree for the input string
5 * 6 + 7;
Fig. 6.2.2 Computation of S-attributed definition : (a) Syntax tree (b) Parse tree (c) Annotated parse tree, in which the values are passed from child to parent : digit.lexval = 5, F.val = 5, T.val = 5 ; digit.lexval = 6, F.val = 6 ; T.val = 30, E.val = 30 ; digit.lexval = 7, F.val = 7, T.val = 7 ; E.val = 37
In the annotated parse tree, for the computation of attributes we start from the leftmost bottommost node. The rule F → digit is used in order to reduce digit to F. The semantic action that takes place here is F.val := digit.lexval. The value of digit is obtained from the lexical analyzer (the parser invokes the lexical analyzer to get the token value), and it becomes the value of F. Hence F.val = 5. Since T is the parent node of F and the semantic action suggests that T.val := F.val, we get T.val = 5. Thus the computation of S-attributes is done from children to parent.
Then consider the production T → T1 * F; the corresponding semantic action is T.val := T1.val × F.val. Hence
T.val = T1.val × F.val
      = 5 × 6 = 30
Similarly, the combination of E1.val + T.val becomes the E node.
E.val = E1.val + T.val
      = 30 + 7
E.val = 37
Here we get E1.val from the left child of E and T.val from the right child of E. Finally we acquire the value of E as 37. Then the production S → EN is applied to reduce E.val = 37. The rule N → ; indicates termination of the current expression. The semantic action associated with S → EN suggests that we print the result E.val. Hence the output will be 37. Thus an S-attributed definition can be computed in bottom-up fashion using postorder traversal.

2. Inherited attribute
The value of an inherited attribute at a node in a parse tree is defined using the attribute values at the parent or siblings. Consider an example and let us compute inherited attributes.
Annotate the parse tree for the computation of inherited attributes for the given string : int a, b, c; the grammar is as given below.
S → TL
T → int
T → float
T → char
T → double
L → L1, id
L → id

Solution: For the string int a, b, c we have to distribute the data type int to all the identifiers a, b and c, such that a becomes integer, b becomes integer and c becomes integer. The following steps are to be followed:

1. Construct the syntax-directed definition using semantic actions.
2. Annotate the parse tree with inherited attributes by processing in top-down fashion.

The syntax-directed definition for the grammar given above is

Production rule     Semantic actions
S → TL              L.in := T.type
T → int             T.type := integer
T → float           T.type := float
T → char            T.type := char
T → double          T.type := double
L → L1, id          L1.in := L.in
                    Enter_type(id.entry, L.in)
L → id              Enter_type(id.entry, L.in)

[Fig. 6.2.3 Annotated parse tree, with T.type = int. The legend distinguishes a value obtained from a child, a value obtained from a sibling, and a value passed from parent to child.]

The value at the L nodes is first obtained from T.type (the sibling). T.type is basically the lexical value obtained as int, float, char or double. The L nodes then give the type of the identifiers a, b and c. The computation of type is done in top-down manner, i.e. in preorder traversal. Using the function Enter_type, the type of the identifiers a, b and c is inserted in the symbol table at the corresponding id.entry (id.entry is the address of the corresponding identifier in the symbol table).

3. Dependency graph

The directed graph that represents the interdependencies between synthesized and inherited attributes at the nodes of the parse tree is called the dependency graph.

For the rule X → YZ, if the semantic action is given by X.x := f(Y.y, Z.z), then X.x is a synthesized attribute and X.x depends upon the attributes Y.y and Z.z.

Algorithm for constructing the dependency graph

for (each node n in the parse tree) do
    for (each attribute a of the grammar symbol at node n) do
        for the attribute a, build a node of the dependency graph.
// All the nodes of the graph are now constructed.
for (each node n in the parse tree) do
    for (each semantic rule b := f(c1, c2, ..., ck) which is associated with the production at n) do
        for (i := 1 to k) do
            construct an edge from node ci to b.

Example: Design the dependency graph for the following grammar.

E → E1 + E2
E → E1 * E2

Solution: The semantic rules for the above grammar are as given below.

Production rule     Semantic rule
E → E1 + E2         E.val := E1.val + E2.val
E → E1 * E2         E.val := E1.val × E2.val

The dependency graph is as shown in Fig. 6.2.4.

[Fig. 6.2.4 Dependency graph]

The synthesized attributes are represented by .val. Hence the synthesized attributes are E.val, E1.val and E2.val. The dependencies among the nodes are given by solid arrows. The arrows from E1 and E2 show that the value of E depends upon E1 and E2. The parse tree is represented using dotted lines.

Example: Design the dependency graph for the following grammar.

S → T List
T → int
T → float
T → char
T → double
List → List1, id
List → id

Solution: The dotted lines represent the parse tree. The semantic rules for the above grammar are as given below.

Production rule       Semantic actions
S → T List            List.in := T.type
T → int               T.type := integer
T → float             T.type := float
T → char              T.type := char
T → double            T.type := double
List → List1, id      List1.in := List.in
                      Enter_type(id.entry, List.in)
List → id             Enter_type(id.entry, List.in)

[Fig. 6.2.5 Dependency graph, with edges labelled "parent to child" and "from sibling".]

The dependencies among the nodes are shown by solid arrows. The dependency graph drawn above shows clearly how values are inherited from the parent or sibling node. Hence these attributes are named inherited attributes.

4. Evaluation order

A topological sort of the dependency graph decides the evaluation order in a parse tree. In deciding the evaluation order the semantic rules in the syntax-directed definitions are used. Thus the translation is specified by syntax-directed definitions. Therefore a precise definition of the syntax-directed definition is required.
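The idea that a topological sort of the dependency graph fixes the evaluation order can be sketched in Python for the declaration int a, b, c. This is only an illustration under assumptions: the node names (List1.in, enter(c) and so on) are hypothetical labels for the attribute instances in the parse tree, and the standard-library graphlib module (Python 3.9 or later) stands in for whatever sorting routine a compiler would actually use.

```python
from graphlib import TopologicalSorter


def evaluation_order():
    """Return one valid evaluation order for the attributes of `int a, b, c`."""
    # Each key depends on the nodes in its set (hypothetical node names).
    # List1 is the topmost List node, List3 the bottommost one.
    depends_on = {
        "T.type": set(),            # supplied by the lexical analyzer
        "List1.in": {"T.type"},     # inherited from the sibling T
        "enter(c)": {"List1.in"},   # Enter_type for identifier c
        "List2.in": {"List1.in"},   # inherited from the parent List
        "enter(b)": {"List2.in"},
        "List3.in": {"List2.in"},
        "enter(a)": {"List3.in"},
    }
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, node in enumerate(evaluation_order(), start=1):
        print(step, node)
```

Any order returned this way evaluates an attribute only after everything it depends on, which is the property the evaluation order needs. Several valid orders usually exist, so the exact sequence printed may differ between runs or Python versions.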
[Fig. 6.2.6 Evaluation order]

The evaluation order can be decided as follows.

1. The type int is obtained from the lexical analyzer by analyzing the input token.
2. List.in is assigned the type int from the sibling T.type.
3. The entry in the symbol table for identifier c gets associated with the type int. Hence variable c becomes of integer type.
4. List.in is assigned the type int from the parent List.in.
5. The entry in the symbol table for identifier b gets associated with the type int. Hence variable b becomes of integer type.
6. List.in is assigned the type int from the parent List.in.
7. The entry in the symbol table for identifier a gets associated with the type int. Hence variable a becomes of integer type.

Thus evaluating the semantic rules in this order stores the type int in the symbol table entry for each of the identifiers a, b and c.

Construction of Syntax Trees

The syntax tree is an abstract representation of the language constructs. Syntax trees are used to write the translation routines using syntax-directed definitions. Let us see how to construct the syntax tree for an expression and how to obtain the translation routines.

Construction of Syntax Tree for Expression

The grammar considered for the expression is

E → E1 + T
E → E1 - T
E → E1 * T
E → T
T → id
T → num

Constructing the syntax tree for an expression means translating the expression into postfix form. A node is created for each operator and operand. Each node can be implemented as a record with multiple fields. The following functions are used in building the syntax tree for an expression.

1. mknode(op, left, right) : This function creates a node whose operator field holds the operator as label, with two pointers to the left and right children.
2. mkleaf(id, entry) : This function creates an identifier node with label id and a pointer to the symbol table given by "entry".
3. mkleaf(num, val) : This function creates a node for a number with label num, where val is the value of that number.

Example: Construct the syntax tree for the expression x*y-5+z.

Solution:
Step 1: Convert the expression from infix to postfix: xy*5-z+.
Step 2: Make use of the functions mknode(), mkleaf(id, ptr) and mkleaf(num, val).
Step 3: The sequence of function calls is given below.

Postfix expression xy*5-z+

Symbol     Operation
x          p1 = mkleaf(id, ptr to entry x)
y          p2 = mkleaf(id, ptr to entry y)
*          p3 = mknode(*, p1, p2)
5          p4 = mkleaf(num, 5)
-          p5 = mknode(-, p3, p4)
z          p6 = mkleaf(id, ptr to entry z)
+          p7 = mknode(+, p5, p6)

Consider the string x*y-5+z and let us draw the syntax tree.

[Fig. 6.2.7 Syntax tree. The leaves for x and y hold pointers to their symbol table entries.]

The syntax-directed definition for the above grammar is as given below.

Production rule     Semantic operation
E → E1 + T          E.nptr := mknode('+', E1.nptr, T.nptr)
E → E1 * T          E.nptr := mknode('*', E1.nptr, T.nptr)
E → T               E.nptr := T.nptr
T → id              T.nptr := mkleaf(id, id.ptr_entry)
T → num             T.nptr := mkleaf(num, num.val)

[Fig. 6.2.7 (a) Constructed syntax tree, annotated with E.nptr at the E nodes and with the leaves pointing to the entries for x, y and z.]

As we have seen, the function calls generate pointers to the various nodes. These pointers are p1, p2, p3 and so on. A synthesized attribute nptr for E and T is used to keep track of these pointers for the nodes E and T. Thus we get nptr, ptr_entry and val as synthesized attributes.

Directed Acyclic Graph for Expression

The directed acyclic graph, usually referred to as DAG, is a directed graph drawn by identifying the common subexpressions. Like the syntax tree, the DAG has nodes representing the subexpressions in the expression. These nodes have an operator, operand1 and operand2, where the operands are the children of that node.
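The way a DAG shares common subexpressions can be sketched by letting mknode and mkleaf hand back an existing node whenever an identical one was already created. The Python sketch below is an illustration under assumptions and not the book's implementation: the NodeTable class, its share flag and the tuple encoding of nodes are hypothetical, and node "pointers" are simply list positions.

```python
class NodeTable:
    """Builds a syntax tree (share=False) or a DAG (share=True)."""

    def __init__(self, share):
        self.share = share
        self.nodes = []   # node i is stored as a tuple
        self.index = {}   # tuple -> position of an identical node

    def _add(self, key):
        # In DAG mode, reuse an identical node instead of creating a new one.
        if self.share and key in self.index:
            return self.index[key]
        self.nodes.append(key)
        self.index[key] = len(self.nodes) - 1
        return len(self.nodes) - 1

    def mkleaf(self, label, value):
        return self._add((label, value))

    def mknode(self, op, left, right):
        return self._add((op, left, right))


def build_sum_of_products(table):
    """Issue the calls for (a*b)+(a*b) and return the root position."""
    p1 = table.mkleaf("id", "a")
    p2 = table.mkleaf("id", "b")
    p3 = table.mknode("*", p1, p2)
    p4 = table.mkleaf("id", "a")
    p5 = table.mkleaf("id", "b")
    p6 = table.mknode("*", p4, p5)
    return table.mknode("+", p3, p6)


if __name__ == "__main__":
    tree, dag = NodeTable(share=False), NodeTable(share=True)
    build_sum_of_products(tree)
    build_sum_of_products(dag)
    print(len(tree.nodes), "tree nodes,", len(dag.nodes), "DAG nodes")
```

The same sequence of calls yields seven nodes for the tree but only four for the DAG, because the second a, b and a*b resolve to nodes that already exist and the root ends up with the same child on both sides.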
The difference between a DAG and a syntax tree is that in a DAG a common subexpression has more than one parent, whereas in a syntax tree the common subexpression is represented as a duplicated subtree.

Example: Draw the syntax tree and DAG for the expression (a*b)+(c-d)*(a*b)+b.

Solution: This expression can be evaluated as

((((a*b)+(c-d))*(a*b))+b)

The postorder traversal is

ab*cd-+ab**b+

From this postorder sequence the syntax tree and DAG can be generated as follows.

[Fig. 6.2.8 Syntax tree]

The sequence of operations for the syntax tree is

p1  = mkleaf(id, a)
p2  = mkleaf(id, b)
p3  = mknode(*, p1, p2)      → (a*b)
p4  = mkleaf(id, c)
p5  = mkleaf(id, d)
p6  = mknode(-, p4, p5)      → (c-d)
p7  = mknode(+, p3, p6)      → (a*b)+(c-d)
p8  = mkleaf(id, a)
p9  = mkleaf(id, b)
p10 = mknode(*, p8, p9)      → (a*b)
p11 = mknode(*, p7, p10)     → ((a*b)+(c-d))*(a*b)
p12 = mkleaf(id, b)
p13 = mknode(+, p11, p12)    → (((a*b)+(c-d))*(a*b))+b

[Fig. 6.2.9 DAG for (((a*b)+(c-d))*(a*b))+b]

The sequence of operations for the DAG is as follows.

p1 = mkleaf(id, a)
p2 = mkleaf(id, b)
p3 = mknode(*, p1, p2)       → (a*b)
p4 = mkleaf(id, c)
p5 = mkleaf(id, d)
p6 = mknode(-, p4, p5)       → (c-d)
p7 = mknode(+, p3, p6)       → (a*b)+(c-d)
p8 = mknode(*, p7, p3)       → ((a*b)+(c-d))*(a*b)
p9 = mknode(+, p8, p2)       → (((a*b)+(c-d))*(a*b))+b

Example: Construct a syntax tree and DAG for the given expression (the expression itself is not legible in this copy).

Solution:

[Figure: (a) Syntax tree, (b) DAG. The surviving labels show the operands k and 5.]

6.3 Bottom-up Evaluation of S-Attributed Definitions

We have already discussed how to use syntax-directed definitions to specify translations. In this section we will discuss how to implement a syntax-directed translation scheme for the syntax-directed definitions.
Building a translator for an arbitrary syntax-directed definition is a very difficult task. However, there are large classes of syntax-directed definitions for which it is easy to construct translators.

The S-attributed definition is one such class: a syntax-directed definition with synthesized attributes only.

Synthesized attributes can be evaluated using a bottom-up parser.

The purpose of the stack is to keep track of the values of the synthesized attributes associated with the grammar symbols on the stack. This stack is commonly known as the parser stack.

Synthesized Attributes on the Parser Stack

1. A translator for an S-attributed definition is implemented using an LR parser generator.
2. A bottom-up method is used to parse the input string.
3. A parser stack is used to hold the values of the synthesized attributes.

The stack is implemented as pairs of state and value. Each state entry is a pointer to the LR(1) parsing table. There is no need to store the grammar symbol explicitly in the parser stack at the state entry. For ease of understanding, however, we will refer to a state by the unique grammar symbol that has been placed in the parser stack. Hence the parser stack can be denoted as stack[i], where stack[i] is a combination of state[i] and value[i]. For example, for the production rule X → ABC the stack can be as shown in Fig. 6.3.1.

Production : X → ABC

Before reduction                 After reduction
         State   Value                    State   Value
         A       A.a             top →    X       X.x
         B       B.b
top →    C       C.c

Fig. 6.3.1 Parser stack

The top symbol on the stack is pointed to by the pointer top.

Production rule     Semantic action
X → ABC             X.x := f(A.a, B.b, C.c)

Before reduction the states A, B and C are in the stack along with the values A.a, B.b and C.c. value[top] holds C.c; similarly B.b is in value[top - 1] and A.a is in value[top - 2]. After reduction the left hand side symbol of the production, i.e. X, is placed in the stack along with the value X.x at the top. Hence after reduction value[top] = X.x.

4. After reduction top is decremented by 2, the state covering X is placed in state[top] and the value of the synthesized attribute X.x is put in value[top].
5. If a symbol has no attribute then the corresponding entry in the value array is kept undefined.

Example: For the following grammar construct the syntax-directed definition and generate the code fragments (translator) using the S-attributed definition.

S → EN
E → E1 + T
E → E1 - T
E → T
T → T1 * F
T → T1 / F
T → F
F → (E)
F → digit
N → ;

Also evaluate the input string 2*3+4; with the parser stack using the LR parsing method.

Solution: The syntax-directed definition for the given grammar can be written as follows.

Production rule     Semantic actions
S → EN              Print(E.val)
E → E1 + T          E.val := E1.val + T.val
E → E1 - T          E.val := E1.val - T.val
E → T               E.val := T.val
T → T1 * F          T.val := T1.val × F.val
T → T1 / F          T.val := T1.val / F.val
T → F               T.val := F.val
F → (E)             F.val := E.val
F → digit           F.val := digit.lexval
N → ;               Can be ignored by the lexical analyzer, as ; is the terminating symbol.

The LR parser table can be generated (we have already discussed the method of LR parser generation in Chapter 5).

To evaluate the attributes, the code fragments can be generated by using the parser stack.

The reduction for each production and the corresponding code fragment are as given below.

Production rule     Code fragment
S → EN              Print(value[top])
E → E1 + T          value[top] := value[top-2] + value[top]
E → E1 - T          value[top] := value[top-2] - value[top]
E → T
T → T1 * F          value[top] := value[top-2] * value[top]
T → T1 / F          value[top] := value[top-2] / value[top]
T → F
F → (E)             value[top] := value[top-1]
F → digit
N → ;

The sequence of moves made by the parser for the input 2*3+4; is as given below.

[Table of parser moves, with columns including Value and Production rule used. The entries are not legible in this copy.]

On seeing the first input symbol 2, the symbol 2 is initially recognized as digit and the parser shifts it onto the state stack. The digit is then reduced to F, the semantic action F.val = digit.lexval is applied, and value[top] becomes 2. In the next move the parser reduces by T → F. As no code fragment is associated with this production, value[top] and state[top] are left unchanged. Continuing in this fashion the evaluation of the input string is done, and the parser halts successfully when it reaches S → EN with state[top] = S, the start state. In this way the bottom-up evaluation of S-attributed definitions is done.

Example: Write an S-attributed grammar to convert the given grammar with infix operators to prefix operators.

L → E
E → E + T
E → E - T
E → T
T → T * F
T → F
F → F ↑ P
F → P
P → (E)
P → id

Solution: A grammar that contains all the syntactic rules along with semantic rules having synthesized attributes only is called an S-attributed grammar. Such a grammar for converting infix operators to prefix is given by using "val" as the S-attribute.
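The value-stack mechanism of this section can be sketched in Python by replaying a fixed list of shift and reduce moves. This is a minimal illustration under stated assumptions, not an LR parser: the moves are written out by hand instead of being read from a parsing table, the helper names (run, shift, unit, binary) are hypothetical, and the prefix rule X.val := op || X1.val || Y.val for binary productions is an assumed choice of semantic action. The sketch also reads E.val from the entry just below N when reducing by S → EN, since N carries no attribute.

```python
import operator


def run(moves):
    """Replay shift/reduce moves on a value stack and return the final value."""
    values = []
    for move in moves:
        if move[0] == "shift":
            values.append(move[1])        # token attribute, or None if it has none
        else:
            _, rhs_len, action = move
            rhs = values[-rhs_len:]       # attribute values of the handle
            del values[-rhs_len:]
            values.append(action(rhs))    # value of the left-hand-side symbol
    return values[-1]


def shift(lexval=None):
    return ("shift", lexval)


# Productions with one symbol on the right, e.g. T -> F: copy the value.
unit = ("reduce", 1, lambda rhs: rhs[0])


def binary(combine):
    """X -> X1 op Y: combine the values found at top-2 and top."""
    return ("reduce", 3, lambda rhs: combine(rhs[0], rhs[2]))


def evaluate_2_times_3_plus_4():
    moves = [
        shift(2), unit, unit,                    # F -> digit, T -> F
        shift(), shift(3), unit,                 # '*', digit, F -> digit
        binary(operator.mul), unit,              # T -> T1 * F, E -> T
        shift(), shift(4), unit, unit,           # '+', digit, F -> digit, T -> F
        binary(operator.add),                    # E -> E1 + T
        shift(), unit,                           # ';', N -> ;
        ("reduce", 2, lambda rhs: rhs[0]),       # S -> E N (take E.val)
    ]
    return run(moves)


def prefix_of_a_plus_b_times_c():
    def prefix(op):
        return binary(lambda left, right: op + left + right)

    moves = [
        shift("a"), unit, unit, unit, unit,      # P -> id, F -> P, T -> F, E -> T
        shift(), shift("b"), unit, unit, unit,   # '+', id, P -> id, F -> P, T -> F
        shift(), shift("c"), unit, unit,         # '*', id, P -> id, F -> P
        prefix("*"),                             # T -> T * F
        prefix("+"),                             # E -> E + T
        unit,                                    # L -> E
    ]
    return run(moves)


if __name__ == "__main__":
    print(evaluate_2_times_3_plus_4())    # value printed for 2*3+4;
    print(prefix_of_a_plus_b_times_c())   # prefix form of a+b*c
```

Both runs use the same stack discipline: a reduction pops the values of the right-hand side and pushes the value computed for the left-hand side. Only the semantic action changes, from arithmetic on numbers to concatenation of strings.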
