"Action is the foundational key to all success."

Compiler Design
- POINTS TO REMEMBER
- EXPECTED QUESTIONS CHAPTERWISE
- LORDS MODEL TEST PAPERS (UNSOLVED)
B.E./B.Tech (P.T.U.) 6th Semester (BTCS-601) (Computer Science Engg.)
Dr. Pooja Chopra
According to the Latest Syllabus Prescribed by Punjab Technical University

~ Syllabus ~
1. INTRODUCTION TO COMPILERS
Structure of a Compiler - Lexical Analysis - Role of Lexical Analyzer - Input Buffering - Specification of Tokens - Recognition of Tokens - Lex - Finite Automata - Regular Expressions to Automata - Minimizing DFA.
2. SYNTAX ANALYSIS
Role of Parser - Grammars - Error Handling - Context-free Grammars - Writing a Grammar - Top-Down Parsing - General Strategies - Recursive Descent Parser - Predictive Parser - LL(1) Parser - Shift-Reduce Parser - LR Parser - LR(0) Items - Construction of SLR Parsing Table - Introduction to LALR Parser - Error Handling and Recovery in Syntax Analyzer - YACC.
3. INTERMEDIATE CODE GENERATION
Syntax Directed Definitions, Evaluation Orders for Syntax Directed Definitions, Intermediate Languages : Syntax Tree, Three Address Code, Types and Declarations, Translation of Expressions, Type Checking.
4. RUN-TIME ENVIRONMENT AND CODE GENERATION
Storage Organization, Stack Allocation Space, Access to Non-local Data on the Stack, Heap Management - Issues in Code Generation - Design of a Simple Code Generator.
5. CODE OPTIMIZATION
Principal Sources of Optimization - Peephole Optimization - DAG - Optimization of Basic Blocks - Global Data Flow Analysis - Efficient Data Flow Algorithms.
~ Contents ~
Introduction to Compiler                    5-34
Syntax Analysis                            35-62
Intermediate Code Generation               63-78
Run Time Environment                       79-87
Code Generation & Code Optimization        88-118
Model Test Papers                         119-120

Unit 1
Introduction to Compiler

Structure of a Compiler - Lexical Analysis - Role of Lexical Analyzer - Input Buffering - Specification of Tokens - Recognition of Tokens - Lex - Finite Automata - Regular Expressions to Automata - Minimizing DFA

POINTS TO REMEMBER
1. Compiler : Program that converts instructions into a machine code so that they can be read and executed by a computer.
2. Compiler-compiler : System that helps in the compiler writing process.
3. Lexical analyzer : Converts a sequence of characters into a sequence of tokens.
4. Syntax analyzer : Creates a parse tree and checks the syntax of the string parsed.
5. Semantic analyzer : Gathers the necessary semantic information from the source code.
6. Intermediate code generation : Receives input from the semantic analyzer and converts it into a linear representation.
7. Code optimizer : Method to improve code quality and efficiency.
8. Code generation : Produces object code in some lower-level programming language.
9. Structure editor : Analyzes the program text, putting an appropriate hierarchical structure on the source program.
10. Loader : Performs loading and link editing.
11. Preprocessor : Tool that produces input for compilers.
12. Symbol table : Data structure maintained to store information about the occurrence of various entities.
13. Assembler : Program that converts an assembly language program into machine language.
14. Lexeme : Basic abstract unit of meaning.
15. Sentinel : A special character that cannot be part of the source program.
16. Recognizer : Machine that accepts strings belonging to a language.
17. YACC : Yet Another Compiler-Compiler.
18. Pass : A pass refers to one traversal of the compiler through the entire program.
19. Bootstrapping : Process of implementing a compiler in the language that it is supposed to compile.
20. Buffer pair : Buffering technique used to reduce the overhead required to process an input character in moving characters.

QUESTION-ANSWERS

Q 1. What is a compiler ?
Ans. A compiler is a program that reads a program written in one language - the source language - and translates it into an equivalent program in another language - the target language. The compiler reports to its users the presence of errors in the source program.

Q 2. What are the two parts of compilation ? Explain briefly.
Ans. Analysis and synthesis are the two parts of compilation.
1. The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.
2. The synthesis part constructs the desired target program from the intermediate representation.

Q 3. Depict diagrammatically how a language is processed.
Ans.
Skeletal source program
        |
   Preprocessor
        |
  Source program
        |
     Compiler
        |
Target assembly program
        |
    Assembler
        |
Relocatable machine code
        |
Loader/Link editor  <--  Library, relocatable object files
        |
Absolute machine code

Q 4. List the various phases of a compiler.
Ans. The following are the various phases of a compiler :
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator

Q 5. Mention some of the cousins of a compiler.
Ans. Cousins of the compiler are :
- Preprocessors
- Assemblers
- Loaders and Link Editors

Q 6. Define compiler-compiler.
Ans. Systems to help with the compiler-writing process are often referred to as compiler-compilers, compiler-generators or translator-writing systems. Largely they are oriented around a particular model of languages, and they are suitable for generating compilers of languages of a similar model.

Q 7. List the phases that constitute the front end of a compiler.
Ans.
The front end consists of those phases or parts of phases that depend primarily on the source language and are largely independent of the target machine. These include :
- Lexical and syntactic analysis
- The creation of the symbol table
- Semantic analysis
- Generation of intermediate code
A certain amount of code optimization can be done by the front end as well. The front end also includes the error handling that goes along with each of these phases.

Q 8. Mention the back-end phases of a compiler.
Ans. The back end of the compiler includes those portions that depend on the target machine and generally do not depend on the source language, just the intermediate language. These include :
1. Code optimization
2. Code generation, along with error handling and symbol-table operations.

Q 9. What is a compiler ? Explain the various phases of a compiler.
Ans. Compiler : Refer to Q. No. 1.
Phases of a Compiler : A compiler operates in phases. A phase is a logically interrelated operation that takes a source program in one representation and produces output in another representation. There are two parts of compilation :
(a) Analysis (Machine Independent/Language Dependent)
(b) Synthesis (Machine Dependent/Language Independent)
The compilation process is partitioned into a number of sub-processes called 'phases', shown below along with the symbol-table manager and the error handler, which interact with all phases.

                 Source program
                       |
                Lexical analyzer
                       |
                 Syntax analyzer
                       |
                Semantic analyzer
 Symbol table          |               Error
   manager    Intermediate code       handler
                  generator
                       |
                 Code optimizer
                       |
                 Code generator
                       |
                 Target program

                Phases of Compiler

(a) Lexical Analysis : The lexical analyzer, or scanner, reads the source program one character at a time, converting the source program into a sequence of atomic units called tokens.
(b) Syntax Analysis : Syntax analysis is the second phase of the compilation process. It takes tokens as input and generates a parse tree as output.
In the syntax analysis phase, the parser checks whether the expression made by the tokens is syntactically correct or not.
(c) Semantic Analysis : Semantic analysis is the third phase of the compilation process. It checks whether the parse tree follows the rules of the language. The semantic analyzer keeps track of identifiers, their types and expressions. The output of the semantic analysis phase is the annotated syntax tree.
(d) Intermediate Code Generation : In the intermediate code generation phase, the compiler translates the source code into an intermediate code. Intermediate code lies between the high-level language and the machine language. The intermediate code should be generated in such a way that it can easily be translated into the target machine code.
(e) Code Optimization : Code optimization is an optional phase. It is used to improve the intermediate code so that the output of the program can run faster and take less space. It removes unnecessary lines of code and arranges the sequence of statements in order to speed up program execution.
(f) Code Generation : Code generation is the final stage of the compilation process. It takes the optimized intermediate code as input and maps it to the target machine language. The code generator translates the intermediate code into the machine code of the specified computer.
Example : sum := oldsum + rate * 50
        | Lexical Analyzer
id1 := id2 + id3 * id4
        | Syntax Analyzer
(parse tree for id1 := id2 + id3 * id4)
        | Semantic Analyzer
(inttoreal conversion inserted for the integer constant 50)
        | Intermediate code generator
temp1 := inttoreal(50)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
        | Code optimization
temp1 := id3 * 50.0
id1 := id2 + temp1
        | Code generation
MOVF id3, R2
MULF #50.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1

Q 10. What is the role of the lexical analyzer ?
Ans. The lexical analyzer performs the tasks given below :
- Helps to identify tokens and enter them into the symbol table.
- Removes white spaces and comments from the source program.
- Correlates error messages with the source program.
- Helps to expand the macros if they are found in the source program.
- Reads input characters from the source program.

Q 11. State some software tools that manipulate source programs.
Ans. (i) Structure editors (ii) Pretty printers (iii) Static checkers (iv) Interpreters

Q 12. What are the two main parts of compilation ? What do they perform ?
Ans. The two main parts are :
- The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.
- The synthesis part constructs the desired target program from the intermediate representation.

Q 13. What is a structure editor ?
Ans. A structure editor takes as input a sequence of commands to build a source program. The structure editor not only performs the text creation and modification functions of an ordinary text editor but also analyzes the program text, putting an appropriate hierarchical structure on the source program.

Q 14. What are a pretty printer and a static checker ?
Ans. Pretty printer : A pretty printer analyses a program and prints it in such a way that the structure of the program becomes clearly visible.
Static checker : A static checker reads a program, analyses it and attempts to discover potential bugs without running the program.

Q 15. How many phases does analysis consist of ? What are they ?
Ans. Analysis consists of three phases :
(i) Linear analysis
This is the phase in which certain checks are performed to ensure that the components of a program fit together meaningfully. Q 19. State some compiler construction tools ? ‘Ans. (i) Parse generator (ii) Scanner generators (ii) Syntax-directed translation engines (iv) Automatic code generator > (v) Data flow engines Q 20. What is a loader ? What does the loading process do ? Ans. A loader is a program that performs the two functions : (i) Loading (i) Link editing The process of loading consists of taking relocatable machine code, altering the telocatable address and placing the altered instructions and data on memory at the proper locations. Q 21, What does the link editing does ? to make a single program from several files of relocatable ‘Ans. Link editing allows us machine code. These files may have been the result of several. Compitations and one or more may be library files of routines provided by the system and available to any program that needs them. Q 22. What is a preprocesso” / ‘Ans. A preprocessor is on? which produces input to compilers. A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes entrusted 10 4 distinct program called a preprocessor. The preprocessor may also expand macros into source language statements, ‘ ssor ? Skeletal source program Preprocessor Source program i LOADS Compiler Design Q 23. State some functions of preprocessors. Ans. (i) Macro processing (ji) File inclusion (iii) Relational preprocessors (iv) Language extensions Q 24, What is a symbol table ? ‘Ans. A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier. The data structures allows us to find the record for each identifier quickly and to store or retrieve data from that record quickly. @ 25. State the general phases of a compiler. Ans. 
(i) Lexical analysis i) Syntax analysis Semantic analysis (iv) Intermediate code generation (v) Code optimization (vi) Code generation Q 26. What is an assembler ? ‘Ans. Assembler is a program, which co nverts the source language into assemb! language. Q 27. What is the need for separating the a parsing ? nalysis phase into lexical analysis and or What are the issues of lexical analyzer ? Ans. 1. Simpler design is perhaps the most important lexical analysis from syntax analysis often allows us to simplify one or the other phases. 2. Compiler efficiency is improved. 3. Compiler portability is enhanced. Q 28. What Is lexical analysis ? Ans. The first phase of compiler is lexical analysis. This is also kn in which the stream of characters making up the source program is read from left to right and grouped into tokens that are sequences of characters having a collective meaning. Q 29. What Is lexeme ? Define a regular set. ‘Ans. A lexeme is a sequence of characters in the source program that is matched by th? pattern for a token. A language denoted by a regular expression is said to be a regular set Q 30. What is sentinel ? What is its usage ? io oe Ca ake characters that cannot be part of source @31, What ls te a is is used for speeding up the lexical analyser. regulét éupression? gular expression ? State the rules, which defin Ans. ion is Regular expression is a method to describe regular language. t consideration. The separation of of these own as linear analys® program. Normal _aatt sntroduction to Compiler Rules 1. €-is a regut “ “aah gular expressi string. on that denotes {¢} that is the set containing the empty 2. If ais a symbol in %, then a i 3. Suppose r and s are = ‘ar Gee Then (a) (1)(s) is a re i ve) fo a a expression denoting L(t) U L(s). fj gular expression denoting L(r) Li (0) (9 "is a requar expression denote tae si pe S a regular expression arnGa ne RiGheTeee actions in a lexical analyzer ? 
(1) Deleting an extraneous ee ee (2) Inserting a missing character. (3) Replacing an incorrect character by a correct character. (4) Transposing two adjacent characters. Q 33. Construct Regular expression for the language. L = {W ¢ {a, b3/W ends in abb} Ans. {a/b} * abb. Q 34, What is recognizer ? ‘Ans. Recognizers are machines. These are the machines which accept that strings belonging to certain language. If the valid strings of such language are accepted by the machine then it is said that the corresponding language is accepted by that machine, otherwise itis rejected. Q.35. What are the vari ‘Ans. The compiler construc! for helping to implement various pl construction tools. / 1. Parser Generators : These produce syntax analyzers, normally from input that is based on a context free grammer. - : tion of the running time of a compiler. it consumes a large frac u " Q Example : YACC (Yet another compiler compiler) 2, Scanner Generators * a (2 These generate lexical analyzers, normally from a specification based on regular expression. Q The basic orga! 3. Syntax directed Q These produc intermediate code Q Each translation 16 tree. ‘gular expression that de an ’ notes {a}. ular expressions denoting the language L(r) and L(s). ous compiler construction tools ? tion tools are the specialized tools and have been developed hases of a compiler. The following are the compiler nization of lexical analyzers is based on finite automation. Translation e routines # hat walk the parse free and as a result generate defined in terms of translations at its neighbour nodes in the D>. te “a LORD> Compiler Design 4. Automatic code generators : Q Ittakes acollection of rules to translate intermediate language into machine langua GQ. The rules must include sufficient details to handle different possible access rmathon for data. 5. 
Data flow Engines : _Itdoes code optimization using data flow analysis, that is, the gathering of informatio, about how values are transmitted from one part of a program to each other part Q 36. Write the steps to convert Non-Deterministic Finite Automata (NDFA) into deterministic Finite Automata (DFA). ‘Ans. Problem Statement : Let X = (Qy, E, 8 do, F,) be an NDFA which accepts the language L(x). We have to design an equivalence DFA. Y = (Qy, 2, 8 dos F,) such L(y) = L(x). The following procedure converts the NDFA to its equivalent DFA : Algorithm : Input : An NDFA. Output : An equivalent DFA. Step 1. Create state table from the given NDFA. Step 2. Create a blank state table under possible input alphabets for the equivalent DFA. Step 3. Mark the start state of the DFA by qo (Same as the NDFA) Step 4. Find out the combination of states (Qo, Qy, «+» Qn) for each possible inpu alphabet. Step 5. Each time we generate a new have to apply step 4 again, otherwise go to step 6. Step 6. The states which contain any of the final states of NDI the equivalent DFA. Example : Let us consider the NDFA shown in the figure below : DFA state under the input alphabet columns, we FA are the final states 0 q &(q, 0) &(q, 1) a {a, b, c, d, e} {d, e} b 0) {e} Le o {b} 2 {e} 6 e i; ; ‘yatroduction to Compiler 15 Using the above algorithm, w find its equivalent DFA. The state table of the DFA is shown in below t a ial a ) a1) {a,b, od, a, b,c, do} Tel {de} " a 4,9) Ib, 4, 0} {b, d, e} [eve] ¢ {e] é le] ae ‘ ti (o i a |e mI Q 37. Write a short note on : (a) YACC (b) Pass {c) Bootstrapping (d) LEX Compiler (e) Tokens, Patterns and Lexemes ‘Ans. (a) YACC : YAGC stands for yet another compiler compiler. YACC provides a tool to produce a parser for a given grammar. YACC is a program designed to compile a LALR (1) grammar, It is used to produce the source code of the syntactic analyzer of the language produced by LALR (1) grammar. 
The input of YACC is the rule or grammar and the output is 8 C program. (b) Pass : Pass IS a complete traversal of the source program. Compiler has twc lotaverse the source progam. 0 passes re tne apping : Bootsteppig widely used inthe compl ey saaed to produce a sell hosting compiler. Seif Tosti ae development. Care Veompile is own souce code. Bootstep compile i y pller is a type of cer that a yu an use tis comlled completo comple eve sed fo compile the ‘ture versions of itsell ty thing else as well as i ; LORDS Compiler Design (@ LEX compiler : Lex is a program that generates eapee eee YACC parser generator. The lexical analyzer is a program that transforms an input stream jnto a sequence of tokens. It reads the input stream and produces the source code as output through implementing the lexical analyzer in the C program. (e) Token, Patterns and Lexemes : (i) Token is a sequence of characters that can be treated as a single logical entity. Typical tokens are (1) Identifiers (2) Keywords (3) Operators (4) Special Symbols (5) Constants, (li) Patterns : A set of strings in the input for which the same token is produced as output. This set of strings is described by a tule called a pattern associated with the token, (iii) Lexeme = A lexeme is a sequence of characters in the source program that is matched by the pattern for a token. @ 38. Convert the following Non-Deterministic Finits Automata (NFA) to Deterministic Finite Automata (DFA) ab Solution ; Transition table for the given Non-Deterministic Finite Automata (NFA) is : State/Alphabet a b 40 qo Gor a j “de “de fe a Step 1; Let 6' bea new set of states of the Deterministic Finite Automata (DFA) Let T' be a new transition table of the DFA. Step 2 : Add transitions of start state dg to the transition table T’. State/Alphabet a b 40 o {Go ai} Step 3 : New state present in state 6" is {qo, 41} Add transitions for set of states {Go at to the transition table T'. : State/Alphabet a b 40 qo {Go, a1} {Go. 
a1} qo {do, 1» Ga} Step 4 : New state present in state 6' is {Qo, 41. G2} Add transitions for set of states {40 1, Qa} to the transition table T* State/Alphabet a b a i do {do. a1} lor 1) Go {Go 91 G2} {Gos 41,42} Go Go. 41. 4) Poe t t | { | | } | lates containing qp as its component ar jntroduction to Compiler Step 5 : Since no new states are State/Alphabet a b > a a 0 {Qo ai} . lov Gh % “(Qo G1» Ge} lor 1-92) a) “0 G1 Ged | Now, Deterministic Finite Automata (OFA). may be drawn as Deterministic Finite Automata (DFA) Q 39. Convert the following Non-Deterministric Fi Deterministric Finite Automata (DFA) Solution : Transition table fo State/Aiphabet__| 2 a 0 qo gn ‘de a Qn “Ge ‘Ge “de or 1 4 Step 1 : Let 6’ be anew $e! tea ‘tion table of the DFA. oe Se aad a rsitions of start stale d 0 the transition table T' E 1 State/Aipnabet_| © do {qt G2) 0 Step 3 : New stale prest “U1 the transition table T- 17 left to be added in the transition table T’, so wo step | sgble for Deterministic Finite Automata pee final states of the DFA. Finally, Transition nite Automata (NFA) to + the given Non-Deterministic Finite Automata (NFA) is 1{ of states of the Deterministric Finite Automata (DFA). Let T’ ent in state @', is {q;, 2} Add transistions for set of states {q. 1» 18 LORDS Compiler Design State/Alphabet 0 1 0 qo {a1 G2} {a1» G2) {Qo. G1» Gab {a1 G2} Step 4: New state present in state 6' is (Jo. 41 G2} ‘Add transitions for set of states {q, Gy. Qa} to the transition table T’ State/Alphabet 0 7 0 qo {a1 da) {a1» Gab {a0, 41 G2 | {41 Geb {do, G1» Ga {do. 1» a} {a1 Geb e transition table T’ so we stop tates are left to be added in th re treated as final states of the DFA. Finite Automata (DFA) is : Step 5 : Since no new s States containing dz as i Finally, Transition table f its component ai for Deterministic ‘State/Alphabet o 1 > do qo *{a1, deb *{a1. da} “(do Gs Ged | “(Av G2} *{do. G1» Gab *{o. 
I» Geb *{a1, Fa) ye drawn as Now Deterministic Finite Automates (DFA) may b 0 Deterministic Finite Automata (DFA) wing Non-Deterministic Finite Automata (NFA)! Q 40. Convert the follo Deterministic Finite Automata (DFA) Solution : Transition table for the given Non-Deterministic Finite ‘Automata nea) ® State/Alphabet a D 7% “di de = a . : Se “an Ge qe Introduction to Compiler Step 1: Let q' bea i. new set ee be a new transition table of the py a Stales of the Deterministic Finite Automata (DFA). Let T’ Step 2 : Add transition: 'S of start state qg to th it transi : Talefhiphene ~ do 18 transition table T’ b = qo {41 92) _[¢(Dead state) Step 3 : New state pre: it is Pe ri nn Present in state 0" is {q;, qa) Add transitions for set of states {q,, 42) State/Alphabet q > 40 {a1 G2) $ {av a2) {91 dad a i sep 4: New state present in state 0° is qa. Add transitions for state qa to the transition table T’ State/Alphabet a b 490 {G1 G2) 4 {a1 da} {a1 G2} C7 Ge {a1 2b {% Step 5 : Add transitions for dead state (9) to the transition table T’ State/Alphabet a o 40 {a1 G2) o {a1» Ga} {a1 G2} 2 Co (a1 a} [% é 4 4 be added in the transition table T' so we stop tates are left to ponent are treated as final states of the DFA. Finite Automata (DFA) is : Step 6 : Since no new s| states containing qy as its com fe tre Transition table for Deterministic Finally, State/Alphabet a b 40 “(a Ge) $ {ay G2} *{d1» Qa} qe a “{a1» G2} G2 4 $ > Now, Deterministic Finite Automata (DFA) may be drawn as : : a an Deadsiate Deterministic Finite Automata (DFA) no LORDS Compiler Design @ 41. What is input buffering 2 Ans. To ensure that a right lexeme is found, one or more characters have to be lookeg up beyond the next lexeme. Hece @ two buffer scheme is introduced to handle large look aheads safely. Technique for speeding up the process of lexical analyzer such as the use of sentinels to mark to buffer and have been adopted. 
There are three general approaches for the implementation of lexical analyzer : 4. By using a lexical analyzer generator : In this, the generator provides routines for reading and buffering the input. 2. By writing the lexical analyzer in a conventional s using /O facilities of that language to read the input. 3. By writing the lexical analyzer in assembly language an reading of input. Q 42, What is Buffer pairs ? Ans. A specialized buffering techniques 0 used to reduce the amount of overhead, which is required to process an input character in moving characters. ep LP tel lai L ystems: programming language, \d explicitely managing the tT | Begin Forward Consists of two buffers, each consists of N character size, which are reloaded Oo alternatively. Two pointers lexeme Begin and forward are maintained. Lexeme Begin points to the beginning of the current lexeme which is y found. Forward scans ahead until a match for a pattern is found. Once a lexeme is found, lexeme begin is set to the character immediately after th lexeme which is just found and forward is set to the character at its right end Q Current lexeme isthe set of characters between two pointers. Q 43, What is sentinels ? How it Is used in input buffering. ‘Ans. Sentinels is used to make a check, each time when the forward pointe! a check is done to ensure that one half of the buffer has not moved off. It is done, other half must be reloaded. “myoneiaiee the ends of the buffer halves require two tests for each Test 1: For end of buffer eH ee 10 what character is read. The usage of se! ak 0 i ing each buffer half to hold a sentinel character at the ©! a special character that cannot be part of the source program. at to be oo oo is moved then th advance of tinel reduces 'M? ba nd. Introduction to Compiler Q 44. Find regular xpression for th, . F the following DFA. _ Solution : Step1 ; state qj. The resulting DFA is : o OC Step 2: Final state B has an outgoi joing edge. Sc i Fe ne Igoing edge. 
So, we create a new final state qj. OH-~G H+O Step 3 : Now, we start eliminating the intermediate states First, let us eliminate state A. Q There is a path going from state q; to state B via state A © So, after eliminating state A, we put a direct path from state q; to state B having Lost £.0 =0 Q There is a loop on state B using state A. 0 So, affer eliminating state A, we put a direct loop on state B having cost 1.0 = 10. Eliminating state A, we get a. 6+ B-+8 Step 4 : Now, Let us eliminate state B 0 There is a path going from state qito state gf via state B. © So. after efminating state B, we puta direct path from state q to state q having cost 0 (10)*.E = 0(10)* Eliminating state B, we get () orto" © Initial sta it te A has an incoming edge. So, we create a new initial From here, Regular expression = 0(10)" » LOADS Compiler Design Q@ 45. Find regular expression for the following DFA. Solution. new single final state. Step 1. There exist multi The resulting DFA is iple final states. So, we create a we start eliminating the intermediate states - tate B from state A to state a we put a direct pal Step 2. Now, Firts, let us eliminate s! There is a parth going So, after eliminating state 8, a.a’.E = aa". Eliminate state B, we get via state B th from state A to state qy having cost one 6. Now, let us eliminate state C. a Ss chee a path going from state A to state gy via state C. b. eee state C, we put a direct path from state A t to state having ct - ntroduction to ‘Compiler Eliminating state C, we get 23 From here, Regular expression = aa* + ba’, Q 46. Differentiate between compiler and interpreter. Ans. Compiler produces a target program whereas an interpreter performs the operations implied by the source program. Q 47. Write short notes on buffer pair. ‘Ans. Concerns with efficiency issues. It is used with a look ahead on the input. It is @ specialized buffering technique used to reduce the overhead required to process an input character. 
Buffer is divided into two N-character halves. Use two pointers. Used at times when the lexical analyzer needs to look ahead several characters beyond the lexeme for a pattern before a match is announced. Q 48. Differentiate between tokens, patterns, lexeme. Ans. Difference between tokens, patterns, lexeme : © Tokens : Sequence of characters that have a collective meaning. © Patterns : There is a set of strings in the input for which the same token produced as output. This set of strings is described by a tule called a pattern associated with the token. Q Lexeme : A sequence of chat patterns for a token. Q.49, List the operations on languages. ‘Ans. The following are the operations that can be applied on languages. Union : LUM = {S/S is in L or Sis in M) © Coneatenation : LM = (St/S is in L andt is in M} | 0 Kleene Closure : L* (Zero oF more concatenations of L) eure : L+ (One of more concatenations of L) Q Positive clo Q 50, Write a regular expression for an identifier. Ans, An identifier is defined as a letter followed by zero or more letters or digits. The Muar expression for an identifi is given as llr (lelter/digit) acters in the source program that is matched by the L SS 7 LORDS Compiler Design e various notational short hands for representing regula, It Q 51. Mention th expressions. ‘Ans. The various notational follows : One or more instances (+) Zero or one instance (2) Character classes ({abc] where a, b, care expression alb/c.) Non regular sets Qa Q 52. What is the function of a a ‘Ans. Hierarchical analysis is one in which the tokens are grouped hierarchically into ted collections with collective meaning. Also termed as parsing. Q 53. What does a semantic analysis do ? ‘Ans. Semantic analysis is one in which certain checks are performed to ensure that components of a program fit together meaning fully. Mainly performs type checking. Q 54. What are the roles and tasks of a lexical analyzer ? 
| shorthands for representing regular expressions are as alphabet symbols that denotes the regular heirarchical analysis ? nest Tome Hate 5 Source} analyze arser+ | | _+Synitax iogem analyzer Semantic analyzer fe — eeaiaee ‘manager ——=— Ans. Main task : Take a token sequence from the scanner and verify that itis & syntactically correct program. Secondary Task : 0 Process declarations and set up symbol table information accordingly, in preparalict for semantic analysis. © Construct a syntax tree in preparation for intermediate code generation. Q55. How a regular expression converted into a Deterministic finite ‘Automation Ans. The task of a scanner generator, such as J Lex, is to generate the transi’ ti : see eee eae the scanner program given a scanner specification (in the form ° : ). So its needs to convert REs into a single DFA. This is accomplished in {wo sich werts PI = first it converts REs into a non-deterministic fini ince deterministic finite automation (NFA) and then it con An NFAis simi i i ] and Sebati ovat toa DFA but it also permits multiple transitions over the same cna and pops cvs. the case of multiple transitions from a state over the same nara“ ek ha vine State and we read this character, we have more than one choice, the f one of these choices succeeds. The transition doesn't consume a" ie Introduction to Compiler 25 characters, SO you may j pit ft tums out that Be fala another state for free. Clearly DFAs are a subset of NFA‘ fifen converting a NEA toa DFA ‘As have the same expressive power. The protien is i We will first learn how to ae may get an exponential blow up in the number of states, BijZa" one for each type of RE. @ RE into a NFA. This is the easy part. 
There are only 5 rules, one for each type of RE. As can be seen, we construct NFAs with only one final state. For example, to construct the NFA for the RE AB, we construct the NFAs for A and B, which are represented as two boxes, each with one start state and one final state. Then the NFA for AB is constructed by connecting the final state of A to the start state of B using an empty (ε) transition. For example, the RE (a|b)c is mapped, using the rules inductively, to the corresponding NFA.
The next step is to convert an NFA to a DFA (called subset construction). Suppose that we assign a number to each NFA state. The DFA states generated by subset construction have sets of numbers, instead of just one number. For example, a DFA state may have been assigned the set {5, 6, 8}. This indicates that arriving at the state labelled {5, 6, 8} in the DFA is the same as arriving at the state 5, the state 6, or the state 8 in the NFA when parsing the same input. First we need to handle transitions that lead to other states for free, without consuming any input. These are the ε-transitions. We define the closure of an NFA node as the set of all nodes reachable from this node using zero, one, or more ε-transitions. For example, the closure of node 1 in the figure below is the set {1, 2}. The start state of the constructed DFA is labelled by the closure of the NFA start state. For every DFA state labelled by some set {s1, ..., sn} and for every character c in the language alphabet, find all the states reachable from s1, s2, ..., or sn using c-arrows and union together the closures of these nodes. If this set is not the label of any other node in the DFA constructed so far, create a new DFA node with this label. For example, node {1, 2} in the DFA above has an arrow to {3, 4, 5} for the character a, since the NFA node 3 can be reached from 1 on a, and nodes 4 and 5 can be reached from 2.
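The ε-closure and subset-construction steps described above can be sketched in Python. This is a minimal illustration only; the tiny example NFA and its state numbering are hypothetical, not the book's figure.

```python
def eps_closure(states, eps):
    """All NFA states reachable via zero or more epsilon-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def subset_construction(start, delta, eps, alphabet):
    """Build DFA states (sets of NFA states) from an NFA."""
    d0 = eps_closure({start}, eps)
    dfa, work = {}, [d0]
    while work:
        S = work.pop()
        if S in dfa:
            continue          # already processed this DFA state
        dfa[S] = {}
        for c in alphabet:
            # union of closures of all states reachable on c
            move = {t for s in S for t in delta.get((s, c), ())}
            T = eps_closure(move, eps)
            dfa[S][c] = T
            work.append(T)
    return d0, dfa

# Hypothetical NFA: 1 -eps-> 2, 1 -a-> 3, 2 -b-> 4
eps = {1: [2]}
delta = {(1, 'a'): [3], (2, 'b'): [4]}
start, dfa = subset_construction(1, delta, eps, 'ab')
print(sorted(start))            # [1, 2] : the closure of the start state
print(sorted(dfa[start]['a']))  # [3]
```

Note how the empty set naturally plays the role of the error node: any character with no outgoing NFA arrows leads to it.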
The b-arrow for node {1, 2} goes to the error node, which is associated with an empty set of NFA nodes. Subset construction also applies to the NFA for (a|b)*(abb|a*b), even though it wasn't constructed with the above RE-to-NFA rules; it too has a corresponding DFA.

Q 56. What are lexical errors ?
Ans. A character sequence which is not possible to scan into any valid token is a lexical error. Important facts about lexical errors are as follows :
1. Lexical errors are not very common, but they should be managed by a scanner.
2. Misspellings of identifiers, operators and keywords are considered as lexical errors.
3. Generally a lexical error is caused by the appearance of some illegal character, mostly at the beginning of a token.

Q 57. Explain the advantages and disadvantages of lexical analysis.
Ans. Advantages :
1. The lexical analyzer method is used by programs like compilers, which can use the parsed data from a programmer's code to create a compiled binary executable.
2. It is used by web browsers to format and display a web page with the help of parsed data from JavaScript, HTML and CSS.
3. A separate lexical analyzer helps you to construct a specialized and potentially more efficient processor for the task.
Disadvantages of lexical analysis :
1. You need to spend significant time reading the source program and partitioning it into tokens.
2. Some regular expressions are quite difficult to understand compared to PEG or EBNF rules.
3. More effort is needed to develop and debug the lexer and its token descriptions.
4. Additional runtime overhead is required to generate the lexer tables and construct the tokens.

Q 58. What is the difference between a lexical analyzer and a parser ?
Ans.
1. The lexical analyzer scans the input program; the parser performs syntax analysis.
2. The lexical analyzer identifies tokens; the parser creates an abstract representation of the code.
3. The lexical analyzer inserts tokens into the symbol table; the parser updates symbol-table entries.
4. The lexical analyzer generates lexical errors; the parser generates a parse tree of the source code.

Q 59. What are token attributes ?
Ans. During the parsing stage, the compiler is only concerned with the token itself. Any integer constant, for example, is treated like any other. But during later processing it will certainly be important just which constant was written. To deal with that, a token that can have many associated lexemes has an attribute, which can be the lexeme if you like. During semantic processing, the compiler examines the token attributes. An attribute is not always a lexeme. For example, the attribute for a TOK_INTCONST token might be an integer, telling the number that was written.

Q 60. How are lexical analyzers typically defined ?
Ans. Finite state machines : The standard tool for defining a lexical analyzer is a finite state machine. The lexer starts in an initial state. As it reads each character, it follows transitions that take it to other states. The states and transitions can handle the entire process of lexical analysis. At the initial state, there will usually be a transition labelled by a space that points back to the start state, so that the lexer will skip over initial spaces. At the initial state, the machine might have a transition labelled 'f' that can be used to handle lexemes for keywords such as for and from, and identifiers such as frog. There is no need for a separate check for each word. If the lexer reaches a state and cannot move to another state, then the lexer should produce the token that state indicates. For example, reading characters 'i' and 'f' might take it to a state where it will produce token TOK_IF, but only if no more characters can be read. The code for a lexer uses a variable to keep track of the state and follows the finite state machine in the most direct way.
Regular expressions : Regular expressions are a simple and compact notation for describing sets of strings. The fundamental operations of regular expressions are :
❑ Union (A|B)
❑ Concatenation (AB)
❑ Star (zero or more repetitions) (A*)
Others are :
❑ Sets of characters (length-1 strings) ([a-z] or [A-Z])
❑ One or more (A+)
❑ Optional (A?)
Flex : Flex is a tool that builds a lexical analyzer from a collection of regular expressions and associated actions.

Q 61. Explain the various types of compilers.
Ans. The various types of compilers are as follows :
1. One-pass compiler : It passes through the source code of each compilation unit (phase) only once.
2. Multi-pass compiler : It processes the source code of a program several times. Multi-pass compilers are sometimes called wide compilers.
3. Load-and-go compiler : A load-and-go compiler is used in a compiling technique in which there is no stop between loading and execution.
4. Optimising compiler : It enhances the performance or reduces the size of the resulting machine program; it optimizes the program code. An optimising compiler requires multiple passes in order to analyse the entire program and maximise the reuse of code throughout.
5. Native compiler : A native compiler compiles programs for the same architecture or OS that it is running on. A platform is a combination of CPU architecture and OS. It compiles programs that can be executed on the same platform.
[T-diagram : S (source code), T (target code), I (implementation language)]
6. Cross compiler : This compiler executes in one environment and generates code for another, e.g. Microsoft Visual Studio includes a cross compiler. Embedded devices such as mobile phones and washing machines use cross-compiled code.

Q 62. Explain automata in detail.
Ans. Automaton : An automaton is a machine responsible for processing the input and generating output; it has program memory and temporary memory. In a finite state automaton, temporary memory is absent.
With temporary memory the structure will look like that of a Turing machine or a PDA. An identifier can be defined as a string of letters and digits that begins with a letter.
State automaton : An automatic transition from one state to another state, or set of states, on an input.
Finite state : "Finite state" refers to the finite number of states of the automaton.
Finite automata are of two kinds : DFA and NFA.
Deterministic finite state automata : A DFA can be formally defined as A = (Q, Σ, δ, q0, F)
where
Q — a finite set of states
Σ — a finite alphabet of input symbols
q0 ∈ Q — an initial state
F ⊆ Q — a set of final states
δ : Q × Σ → Q — a transition function
[Diagram legend : initial state, final state, other states, transition arrows]
The movement is from left to right. There is no temporary memory, only program memory with an input tape.
e.g. [Diagram : a two-state door automaton with "open the door" and "close the door" transitions]
e.g. Check whether a binary number has an even number of zeros or not.
[Diagram : scanning a one stays in the same state; scanning a zero and then another zero returns to the accepting even-zeros state]
Even zeros : Generally a DFA is called a machine M which accepts a language. The language is the set of strings the machine accepts, denoted L(M).
L(M) = {binary strings having an even number of zeros}
Non-deterministic finite state automata : An NFA is a finite state machine where, from each state and a given input symbol, the automaton may jump into several possible next states. This distinguishes it from the DFA, where the next possible state is uniquely determined.
e.g. State table :
            0           1
q0       {q0, q1}      q1
An NFA is represented formally by a 5-tuple (Q, Σ, δ, q0, F) consisting of
❑ a finite set of states Q
❑ a finite set of input symbols Σ
❑ a transition relation δ : Q × Σ → P(Q)
❑ an initial (or start) state q0 ∈ Q
❑ a set of states F ⊆ Q distinguished as accepting or final states
The transition function takes
❑ an input symbol
❑ the current state
e.g. Here, with Σ = {0, 1} :
δ(P, 0) = {q}
δ(P, 1) = {P}
δ(q, 0) = {P}
δ(q, 1) = {q}
These transitions generate the language accepted by the automaton.
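The even-zeros DFA above can be simulated directly from its transition table. This is a minimal sketch; the state names q0 (even, accepting) and q1 (odd) are illustrative labels, not the book's.

```python
def run_dfa(delta, start, finals, s):
    """Run a DFA over string s and report acceptance."""
    state = start
    for ch in s:
        state = delta[(state, ch)]  # deterministic: exactly one next state
    return state in finals

# Even number of zeros: q0 = even seen so far (accepting), q1 = odd.
delta = {('q0', '0'): 'q1', ('q0', '1'): 'q0',
         ('q1', '0'): 'q0', ('q1', '1'): 'q1'}

print(run_dfa(delta, 'q0', {'q0'}, '1001'))  # True  (two zeros)
print(run_dfa(delta, 'q0', {'q0'}, '10'))    # False (one zero)
```

Because the machine is deterministic, each input character selects exactly one transition, so the run is a single left-to-right pass, matching the description above.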
[NFA transition diagram for the δ above]
A language is a set of strings accepted by the finite automaton.

Q 63. Explain the minimization of DFA algorithm.
Ans. Algorithm :
1. Partition the set of states into 2 groups :
❑ G1 : the set of accepting states
❑ G2 : the set of non-accepting states
2. For each new group G :
❑ Partition G into subgroups such that states S1 and S2 are in the same subgroup if and only if, for all input symbols a, S1 and S2 have transitions on a to states in the same group.
The start state of the minimized DFA is the group containing the start state of the original DFA. The accepting states of the minimized DFA are the groups containing the accepting states of the original DFA.
[Worked example : the states are first split into non-final and final groups; the groups {1, 2, 3} {4} are then refined to {1, 2} {3} {4}, after which no more partitioning is possible.]

Q 64. Explain the concept of regular expressions.
Ans. We use regular expressions to describe the tokens of a programming language. A regular expression is built up from simpler regular expressions (using defining rules). Each regular expression denotes a language. A language denoted by a regular expression is called a regular set. The operations are :
❑ Concatenation
❑ Union/alternation
❑ Kleene closure
Concatenation : If r1 = a and r2 = b, then r1r2 = ab and L(r1r2) = {ab}.
Union : Represented by + or |. If r1 = a and r2 = b, then L(r1 + r2) = {a, b}.
A regular language is a formal language that is accepted by a DFA, an NFA or a read-only Turing machine. A regular language can be described by a regular expression built from simple regular expressions.
Operations on languages :
Concatenation : L1L2 = {s1s2 | s1 ∈ L1 and s2 ∈ L2}
Union : L1 ∪ L2 = {s | s ∈ L1 or s ∈ L2}
Exponentiation : L0 = {ε}, L1 = L, L2 = LL
Kleene closure : Concatenate L with itself any number of times : L* = ∪ (i = 0 to ∞) Li
Positive closure : Concatenate L with itself one or more times : L+ = ∪ (i = 1 to ∞) Li
L+ = L* − {ε} ; it excludes the empty string.

Q 65. Calculate firstpos and lastpos for each node.
[Syntax tree for the regular expression, annotated with the firstpos and lastpos sets at each node, e.g. the leaves a and b carry {1} and {2}, and the or-node carries {1, 2}]
How to calculate firstpos and lastpos ?
Ans.
Node n : leaf labelled ε — nullable(n) = true ; firstpos(n) = ∅ ; lastpos(n) = ∅
Node n : leaf with position i — nullable(n) = false ; firstpos(n) = {i} ; lastpos(n) = {i}
Node n = c1 | c2 (or-node) — nullable(n) = nullable(c1) or nullable(c2) ; firstpos(n) = firstpos(c1) ∪ firstpos(c2) ; lastpos(n) = lastpos(c1) ∪ lastpos(c2)
Node n = c1c2 (cat-node) — nullable(n) = nullable(c1) and nullable(c2) ; firstpos(n) = firstpos(c1) ∪ firstpos(c2) if nullable(c1), else firstpos(c1) ; lastpos(n) = lastpos(c1) ∪ lastpos(c2) if nullable(c2), else lastpos(c2)
Node n = c1* (star-node) — nullable(n) = true ; firstpos(n) = firstpos(c1) ; lastpos(n) = lastpos(c1)

Q 66. What is a lex specification ?
Ans. A lex specification consists of 3 parts :
%{ C declarations %}
regular definitions
%%
translation rules
%%
user-defined auxiliary procedures
The translation rules are of the form :
P1 {action1}
P2 {action2}
...
Pn {actionn}
[Transition diagram to recognize identifiers]

Unit 2
Syntax Analysis
Role of Parser - Grammars - Error Handling - Context-free grammars - Writing a grammar - Top-Down Parsing - General Strategies - Recursive Descent Parser - Predictive Parser - LL(1) Parser - Shift Reduce Parser - LR Parser - LR(0) Item - Construction of SLR Parsing Table - Introduction to LALR Parser - Error Handling and Recovery in Syntax Analyzer - YACC.

POINTS TO REMEMBER
1. Parser : Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning.
2. CFG : Context-free grammar.
3. Ambiguous grammar : A context-free grammar for which there exists a string that can have more than one leftmost derivation or parse tree.
4. Dangling else problem : An optional else clause in an if-then(-else) statement results in nested conditionals being ambiguous.
5. YACC : A program designed to handle an LALR(1) grammar.
6. Handle : A handle of a string is a substring that matches the right side of a production
and whose reduction to the non-terminal on the left side of the production represents one step along the reverse of a rightmost derivation.
7. Predictive parsing : A predictive parser is a recursive descent parser with no backtracking or backup.
8. Augmented grammar : An augmented grammar is any grammar whose productions are augmented with conditions expressed using features.
9. Abbreviating grammar : A group of productions that have the same left-hand side is shown with the left-hand side written only once.
10. Parser generators : An application which generates a parser.

QUESTION-ANSWERS

Q 1. Define parser.
Ans. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. It is also termed as parsing.
Q 2. Mention the basic issues in parsing.
Ans. There are two important issues in parsing, which are as follows :
1. Specification of syntax
2. Representation of input after parsing
Q 3. Why are lexical and syntax analyzers separated out ?
Ans. The following are the reasons for separating the analysis phase into lexical and syntax analyzers :
1. Simpler design
2. Compiler efficiency is improved
3. Compiler portability is enhanced
Q 4. Define a context-free grammar.
Ans. A context-free grammar G is a collection of the following :
V is the set of non-terminals
T is a set of terminals
S is a start symbol
P is a set of production rules
G can be represented as G = (V, T, S, P).
Production rules are given in the following form :
Non-terminal → (V ∪ T)*
Q 5. Briefly explain the concept of derivation.
Ans. Derivation from S means generation of a string w from S. For constructing a derivation two things are important :
(i) Choice of non-terminal from several others.
(ii) Choice of rule from the production rules for the corresponding non-terminal.
Instead of choosing an arbitrary non-terminal, one can choose :
(i) either the leftmost derivation : the leftmost non-terminal in a sentential form,
(ii) or the rightmost derivation : the rightmost non-terminal in a sentential form.
Q 6. Define ambiguous grammar.
Ans. A grammar G is said to be ambiguous if it generates more than one parse tree for some sentence of the language L(G), i.e. some sentence has more than one leftmost (or rightmost) derivation.
Q 7. What is an operator precedence parser ?
Ans. A grammar is said to be operator precedence if it possesses the following properties :
1. No production on the right side is ε.
2. There should not be two adjacent non-terminals on the right-hand side.
Q 8. List the properties of LR parsers.
Ans. 1. LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written.
2. The class of grammars that can be parsed by an LR parser is a superset of the class of grammars that can be parsed using predictive parsers.
3. LR parsers use a non-backtracking shift-reduce technique, yet they are efficient.
Q 9. Mention the types of LR parsers.
Ans. Various types of LR parsers are as follows :
1. SLR parser - Simple LR parser
2. LALR parser - Lookahead LR parser
3. Canonical LR parser
Q 10. What are the problems with top-down parsing ?
Ans. The following are the problems associated with top-down parsing :
1. Backtracking
2. Left recursion
3. Left factoring
4. Ambiguity
Q 11. Write the algorithm for FIRST and FOLLOW.
Ans. FIRST :
1. If x is a terminal, then FIRST(x) is {x}.
2. If x → ε is a production, then add ε to FIRST(x).
3. If x is a non-terminal and x → y1 y2 ... yk is a production, then place a in FIRST(x) if for some i, a is in FIRST(yi) and ε is in all of FIRST(y1), ..., FIRST(yi−1).
FOLLOW :
1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right-end marker.
2. If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).
Q 12. List the advantages and disadvantages of operator precedence parsing.
Ans.
Advantages : This type of parsing is simple to implement.
Disadvantages :
1. An operator like minus has two different precedences (unary and binary). Hence it is hard to handle tokens like the minus sign.
2. This kind of parsing is applicable to only a small class of grammars.
Q 13. What is the dangling else problem ?
Ans. The dangling-else ambiguity can be eliminated by rewriting the grammar. In most programming languages the conditional statement can be written as :
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other
Q 14. Write short notes on YACC.
Ans. YACC is an automatic tool for generating the parser program. YACC stands for Yet Another Compiler-Compiler, which is basically a utility available for UNIX. Basically, YACC is an LALR parser generator. It can report conflicts or ambiguities in the form of error messages.
Q 15. What is meant by handle pruning ?
Ans. A rightmost derivation in reverse can be obtained by handle pruning. If w is a sentence of the grammar at hand, then w = γn, where γn is the nth right-sentential form of some as yet unknown rightmost derivation
S = γ0 ⇒ γ1 ⇒ ... ⇒ γn = w
Q 16. Define LR(0) items.
Ans. An LR(0) item of a grammar G is a production of G with a dot at some position of the right side. Thus, the production A → xyz yields the four items :
A → .xyz
A → x.yz
A → xy.z
A → xyz.
Q 17. What is meant by viable prefixes ?
Ans. The set of prefixes of right-sentential forms that can appear on the stack of a shift-reduce parser are called viable prefixes. An equivalent definition of a viable prefix is that it is a prefix of a right-sentential form that does not continue past the right end of the rightmost handle of that sentential form.
Q 18. Define handle.
Ans. A handle of a string is a substring that matches the right side of a production, and whose reduction to the non-terminal on the left side of the production represents one step
along the reverse of a rightmost derivation. A handle of a right-sentential form γ is a production A → β and a position of γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ. That is, if S ⇒* αAw ⇒ αβw, then A → β in the position following α is a handle of αβw.
Q 19. What are kernel and non-kernel items ?
Ans. Kernel items include the initial item S' → .S and all items whose dots are not at the left end. Non-kernel items have their dots at the left end.
Q 20. What is phrase-level error recovery ?
Ans. Phrase-level error recovery is implemented by filling in the blank entries in the predictive parsing table with pointers to error routines. These routines may change, insert or delete symbols on the input and issue appropriate error messages. They may also pop from the stack.
Q 21. Differentiate between top-down and bottom-up parsing.
Ans. Top-down parsing : A top-down parsing technique starts from the top of the parse tree, moves downwards, and evaluates the rules of grammar.
Bottom-up parsing : A bottom-up parsing technique starts from the lowest level of the parse tree, moves upwards, and evaluates the rules of grammar.
Following are some of the important differences between top-down parsing and bottom-up parsing :
1. Approach : The top-down approach starts evaluating the parse tree from the top and moves downwards for parsing other nodes; the bottom-up approach starts evaluating the parse tree from the lowest level of the tree and moves upwards for parsing the nodes.
2. Attempt : Top-down parsing attempts to find the leftmost derivation for a given string; bottom-up parsing attempts to reduce the input string to the start symbol of the grammar.
3. Derivation type : Top-down parsing uses the leftmost derivation; bottom-up parsing uses the rightmost derivation.
4. Objective : Top-down parsing searches for a production rule to be used to construct a string; bottom-up parsing searches for a production rule to be used to reduce a string to get the start symbol of the grammar.
Q 22. What are LR parsers ? Explain with a diagram the LR parsing algorithm.
Ans. LR Parser : The LR parser is a non-recursive, shift-reduce, bottom-up parser. It uses a wide class of context-free grammars, which makes it the most efficient syntax analysis technique. LR parsers are also known as LR(k) parsers, where L stands for left-to-right scanning of the input stream, R stands for the construction of a rightmost derivation in reverse, and k denotes the number of lookahead symbols used to make decisions.
There are three widely used algorithms available for constructing an LR parser :
1. SLR(1) - Simple LR parser :
❑ Works on the smallest class of grammars
❑ Few states, hence a very small table
❑ Simple and fast construction
2. LR(1) - LR parser :
❑ Works on the complete set of LR(1) grammars
❑ Generates a large table and a large number of states
❑ Slow construction
3. LALR(1) - Lookahead LR parser :
❑ Works on an intermediate size of grammar
❑ The number of states is the same as in SLR(1)
LR parsing algorithm :
token = next_token()
repeat forever
    s = top of stack
    if action[s, token] = "shift si" then
        push token
        push si
        token = next_token()
    else if action[s, token] = "reduce A → β" then
        pop 2 * |β| symbols
        s = top of stack
        push A
        push goto[s, A]
    else if action[s, token] = "accept" then
        return
    else
        error()
Q 23. What are parser generators ?
Ans. A parser generator produces syntax analyzers (parsers) from an input that is based on a grammatical description of a programming language or on a context-free grammar. It is useful because the syntax analysis phase is highly complex and consumes more manual and compilation time. E.g. PIC, EQM.
[Diagram : grammar → parser generator → syntax analyzer ; tokens → syntax analyzer → parse tree]
Q 24. Explain recursive descent parser.
Ans.
A parser that uses a set of recursive procedures to recognize its input with no backtracking is called a recursive descent parser. For implementing a recursive descent parser for a grammar :
❑ The grammar must not be left recursive.
❑ The grammar must be left factored, which means it should not have common prefixes for alternates.
❑ We need a language that has a recursion facility.
Left factoring problems : Eliminate left recursion and then left factor the following grammar :
rexpr → rexpr + rterm | rterm
rterm → rterm rfactor | rfactor
rfactor → rfactor* | rprimary
rprimary → a | b
Design :
❑ It can be written on the same lines as the brute-force approach, that is, one procedure for one non-terminal.
❑ The procedures need not return anything, because they need not try another alternate as in brute force; on error, they call some error routine to take care of the error.
One can use transition diagrams for designing a recursive descent parser, as used in the case of scanners. The differences are :
❑ There is one transition diagram for each non-terminal.
❑ The labels on edges may be terminals or non-terminals. A label that is a terminal means we must take that transition if the next input symbol is that terminal.
After elimination of left recursion and left factoring, one can construct the transition diagram for the grammar as follows. For each non-terminal A :
❑ Create an initial and a final state.
❑ For each production A → X1 X2 ... Xn, create a path from the initial to the final state with edges labelled X1, X2, ..., Xn.
Q 25. Consider the given grammar :
E → E + T | T
T → T × F | F
F → id
Evaluate the following expression in accordance with the given grammar :
2 + 3 × 5 × 6 + 2
Ans. Let us draw a parse tree for the given expression.
❑ Evaluating the parse tree will return the required value of the expression.
[Parse tree for 2 + 3 × 5 × 6 + 2]
On evaluating this parse tree, we get the value = 94.
Q 26.
Consider the given grammar :
E → E + T | E − T | T
T → T × F | T ÷ F | F
F → G ↑ F | G
G → id
Evaluate the following expression in accordance with the given grammar :
2 × 1 + 4 ↑ 2 ↑ 1 × 1 + 3
Ans. Let us draw a parse tree for the given expression. Evaluating the parse tree will return the required value of the expression.
[Parse tree for 2 × 1 + 4 ↑ 2 ↑ 1 × 1 + 3]
On evaluating this parse tree, we get the value = 21.
Q 27. Calculate the FIRST and FOLLOW functions for the given grammar :
S → aBDh
B → cC
C → bC | ε
D → EF
E → g | ε
F → f | ε
Ans. The FIRST and FOLLOW functions are as follows :
FIRST functions :
❑ FIRST(S) = {a}
❑ FIRST(B) = {c}
❑ FIRST(C) = {b, ε}
❑ FIRST(D) = {FIRST(E) − ε} ∪ FIRST(F) = {g, f, ε}
❑ FIRST(E) = {g, ε}
❑ FIRST(F) = {f, ε}
FOLLOW functions :
❑ FOLLOW(S) = {$}
❑ FOLLOW(B) = {FIRST(D) − ε} ∪ FIRST(h) = {g, f, h}
❑ FOLLOW(C) = FOLLOW(B) = {g, f, h}
❑ FOLLOW(D) = FIRST(h) = {h}
❑ FOLLOW(E) = {FIRST(F) − ε} ∪ FOLLOW(D) = {f, h}
❑ FOLLOW(F) = FOLLOW(D) = {h}
Q 28. Calculate the FIRST and FOLLOW functions for the given grammar :
E → E + T | T
T → T × F | F
F → (E) | id
Ans. We have :
❑ The given grammar is left recursive.
❑ So, we first remove left recursion from the given grammar.
After eliminating left recursion, we get the following grammar :
E → TE'
E' → +TE' | ε
T → FT'
T' → ×FT' | ε
F → (E) | id
Now, the FIRST and FOLLOW functions are as follows :
FIRST functions :
❑ FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
❑ FIRST(E') = {+, ε}
❑ FIRST(T) = FIRST(F) = {(, id}
❑ FIRST(T') = {×, ε}
❑ FIRST(F) = {(, id}
FOLLOW functions :
❑ FOLLOW(E) = {$, )}
❑ FOLLOW(E') = FOLLOW(E) = {$, )}
❑ FOLLOW(T) = {FIRST(E') − ε} ∪ FOLLOW(E) ∪ FOLLOW(E') = {+, $, )}
❑ FOLLOW(T') = FOLLOW(T) = {+, $, )}
❑ FOLLOW(F) = {FIRST(T') − ε} ∪ FOLLOW(T) ∪ FOLLOW(T') = {×, +, $, )}
Q 29. What is the role of the parser ?
Ans. Once a token is generated by the lexical analyzer, it is passed to the parser.
On receiving a token, the parser verifies the string of token names that can be generated by the grammar of the source language. It calls the function getNextToken() to notify the lexical analyzer to yield another token. It scans the tokens one at a time from left to right to construct the parse tree. It also checks the syntactic constructs of the grammar.
Q 30. What are the various error recovery strategies in the syntax phase ?
Ans. Error recovery strategies are used by the parser to recover from errors once an error is detected. The simplest recovery strategy is to quit parsing with an error message for the first error itself.
Panic mode recovery : Once an error is found, the parser intends to find a designated set of synchronizing tokens by discarding input symbols one at a time. Synchronizing tokens are delimiters, such as the semicolon or }, whose role in the source program is clear. When the parser finds an error in a statement, it ignores the rest of the statement by not processing the input. This is the easiest way of error recovery. It prevents the parser from developing infinite loops.
Advantages :
❑ Simplicity.
❑ Never gets into an infinite loop.
Disadvantages :
❑ Additional errors cannot be checked, as some of the input symbols will be skipped.
Phrase-level recovery : The parser performs local correction on the remaining input when an error is detected. When a parser finds an error, it tries to take corrective measures so that the rest of the inputs of the statement allow the parser to parse ahead. One wrong correction may lead to an infinite loop. The local correction may be :
❑ Replacing a prefix by some string.
❑ Replacing a comma by a semicolon.
❑ Deleting an extraneous semicolon.
❑ Inserting a missing semicolon.
Advantage :
❑ It can correct any input string.
Disadvantage :
❑ It is difficult to cope with the actual error if it has occurred before the point of detection.
Error productions : Productions which generate erroneous constructs are augmented to the grammar by considering common errors that occur.
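The panic-mode strategy above can be sketched as a token-skipping loop. This is an illustration only; the token list and the synchronizing set are hypothetical.

```python
SYNC = {';', '}'}  # synchronizing tokens (delimiters with a clear role)

def panic_mode_skip(tokens, pos):
    """On an error at tokens[pos], discard input symbols one at a
    time until a synchronizing token is found; resume just after it."""
    while pos < len(tokens) and tokens[pos] not in SYNC:
        pos += 1
    return pos + 1  # position right after the synchronizing token

# '@' is an illegal token; parsing resumes at the next statement.
toks = ['x', '=', '@', '1', ';', 'y', '=', '2', ';']
print(panic_mode_skip(toks, 2))  # 5, i.e. parsing resumes at 'y'
```

The skipped symbols ('@' and '1') are exactly why further errors inside the discarded region go unreported, which is the disadvantage noted above.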
These productions detect the anticipated errors during parsing. Error diagnostics about the erroneous constructs are generated by the parser.
Global correction : There are algorithms which make changes to modify an incorrect string into a correct string. These algorithms perform a minimal sequence of changes to obtain a globally least-cost correction. When a grammar G and an incorrect string p are given, these algorithms find a parse tree for a string q related to p with the smallest number of transformations. The transformations may be insertions, deletions and changes of tokens.
Advantages :
❑ It has been used for phrase-level recovery to find optimal replacement strings.
Disadvantages :
❑ This strategy is too costly to implement in terms of time and space.
Q 31. What is error handling ? Explain error recovery in the syntax analyzer.
Ans. Error handling : An efficient program should not terminate on a parse error; it must recover to parse the rest of the input and check for subsequent errors. For one-line input, the YACC program can simply report the error and stop.
Parser error handling : The parser detects the error as soon as it encounters it, i.e., when an input stream does not match the rules in the grammar. If there is an error-handling subroutine in the grammar file, the parser can allow for entering the data again, ignoring the bad data, or initiating a clean-up and recovery action. When the parser finds an error, it may need to reclaim parse-tree storage, delete or alter symbol-table entries and set switches to avoid generating further output.
Error-handling routines are used to restart the parser so that it continues its process even after the occurrence of an error. Tokens following the error get discarded to restart the parser. The YACC command uses a special token name, error, for error handling. This token is placed at places where an error might occur so that it provides a recovery subroutine. To prevent a cascade of spurious error messages, the parser remains in an error state until it has processed three tokens following an error.
The input is discarded and no message is produced if further errors occur while the parser remains in the error state.
e.g. stat : error ';'
❑ The above rule tells the parser that when there is an error, it should ignore the error token and all following tokens until it finds the next semicolon.
❑ It discards all the tokens after the error and before the next semicolon. Once the semicolon is found, the rule is reduced, and any cleanup action associated with that rule is performed.
Providing for error correction : The input errors can be corrected by entering a line in the data stream again :
input : error '\n'
    {
        printf("Reenter last line: ");
    }
    input
    {
        $$ = $4;
    };
The YACC statement yyerrok is used to indicate that error recovery is complete. This statement leaves the error state and begins processing normally :
input : error '\n'
    {
        yyerrok;
        printf("Reenter last line: ");
    }
    input
    {
        $$ = $4;
    };
Clearing the lookahead token :
❑ When an error occurs, the lookahead token becomes the token at which the error was detected. The lookahead token must be changed if the error recovery action includes code to find the correct place to start processing again.
❑ To clear the lookahead token, the error-recovery action issues the following statement : yyclearin;
To assist in error handling, macros can be placed in YACC actions.
Macros for error handling :
YYERROR : causes the parser to initiate error handling
YYABORT : causes the parser to return with a value of 1
YYACCEPT : causes the parser to return with a value of 0
YYRECOVERING() : returns a value of 1 if a syntax error has been detected and the parser has not yet fully recovered
Q 32. What is the difference between CLR(1) and LALR(1) parsers ?
Ans. Difference between CLR(1) and LALR(1) :
❑ The number of states in CLR(1) is greater than or equal to that of the LALR(1) parser.
❑ If a grammar is CLR (1), it may or may not be LALR (1), because a conflict may arise when you merge the states while constructing the LALR (1) table.
❑ If a grammar is not CLR (1), it will not be LALR (1) : the conflict already present in CLR (1) will always be present in LALR (1).
❑ If there is no SR conflict in CLR (1), there will be no SR conflict in LALR (1).
❑ If there is no RR conflict in CLR (1), we cannot say anything about RR conflicts in LALR (1), because when the states are merged an RR conflict may arise.

Q 33. What is predictive parsing ? What is the difference between recursive predictive descent parser and non-recursive predictive descent parser ?
Ans. A predictive parser is a recursive descent parser which has the capability to predict which production is to be used to replace the input string. The predictive parser does not suffer from backtracking. Predictive parsing uses a stack and a parsing table to parse the input and generate a parse tree.
1. Recursive predictive descent parser : A recursive descent parser is a top-down parser in which a procedure is associated with each non-terminal of the grammar. Here we consider a simple form of recursive descent parsing, called predictive recursive descent, in which the lookahead symbol unambiguously determines the flow of control through the procedure body for each non-terminal. The sequence of procedure calls during the analysis of an input string implicitly defines a parse tree for the input, and can be used to build an explicit parse tree. In general recursive descent parsing, the parser may have more than one production to choose from for a single instance of input; there the concept of backtracking is used to try the alternatives against the syntax.
Back-tracking : It means that if one derivation of a production fails, the syntax analyzer restarts the process using different rules of the same production. This technique may process the input string more than once to determine the right production : start from the root node and match the input string against the production rules.
2. Non-recursive predictive descent parser : A form of recursive-descent parsing that does not require any backtracking is known as predictive parsing. It is also called the LL(1) parsing-table technique, since we build a table for the string to be parsed. It has the capability to predict which production is to be used to replace the input string. To accomplish its task, the predictive parser uses a lookahead pointer which points to the next input symbols. To make the parser backtracking-free, the predictive parser puts some constraints on the grammar and accepts only a class of grammars known as LL(K) grammars.

Difference between recursive predictive descent parser and non-recursive predictive descent parser :

Recursive Predictive Descent Parser | Non-Recursive Predictive Descent Parser
1. It is a technique which may or may not require backtracking. | 1. It is a technique that does not require any kind of backtracking.
2. It uses a procedure for every non-terminal entity to parse strings. | 2. It finds out the production to use for replacing the input string by table lookup.
3. It is a type of top-down parsing built from a set of mutually recursive procedures, where each procedure implements one of the non-terminals of the grammar. | 3. It is a type of top-down approach which does not use the technique of backtracking.
4. It contains several small functions, one for each non-terminal in the grammar. | 4. It uses a lookahead pointer which points to the next input symbols; to make the parser backtracking-free, some constraints are put on the grammar.
5. It accepts all kinds of grammars. | 5. It accepts only a class of grammars known as LL(K) grammars.

Q 34. What are the capabilities of CFG ?
Ans.
There are various capabilities of CFG :
1. Context free grammar is useful to describe most of the programming languages.
2. If the grammar is properly designed then an efficient parser can be constructed automatically.
3. Using the features of associativity and precedence information, suitable grammars for expressions can be constructed.
4. Context free grammar is capable of describing nested structures like balanced parentheses, matching begin-end, corresponding if-then-else and so on.

Q 35. What is ambiguity ?
Ans. A grammar is said to be ambiguous if there exists more than one leftmost derivation or more than one rightmost derivation or more than one parse tree for the given input string. If the grammar is not ambiguous then it is called unambiguous.
e.g.
S → aSb | SS
S → ε
For the string aabb, the above grammar generates two parse trees.
[figure : the two distinct parse trees for aabb]
If the grammar has ambiguity then it is not good for compiler construction. No method can automatically detect and remove the ambiguity, but you can remove ambiguity by rewriting the whole grammar without ambiguity.

Q 36. What is augmented grammar ?
Ans. The augmented grammar G' is generated if we add one more production to the given grammar G. It helps the parser to identify when to stop the parsing and announce the acceptance of the input.
e.g. Given the grammar
S → AA
A → aA | b
the augmented grammar G' is represented by
S' → S
S → AA
A → aA | b

Q 37. What is abbreviating grammars ?
Ans. A group of productions that have the same left-hand side are shown with the left-hand side written only once. In all but the first production in the group, | is written instead of →. For example, the above grammar for expressions is usually written as follows :
E → E + E
  | E * E
  | n
  | (E)

Q 38. What are leftmost and rightmost derivations ?
Ans. A leftmost derivation is a derivation that always selects the leftmost non-terminal to rewrite. For example :
E ⇒ E + E
  ⇒ n + E
  ⇒ n + E * E
  ⇒ n + (E) * E
  ⇒ n + (n) * E
  ⇒ n + (n) * n
is a leftmost derivation of n + (n) * n.
A rightmost derivation is a derivation that always selects the rightmost non-terminal to rewrite. For example :
E ⇒ E + E
  ⇒ E + E * E
  ⇒ E + E * n
  ⇒ E + (E) * n
  ⇒ E + (n) * n
  ⇒ n + (n) * n
is a rightmost derivation of n + (n) * n.

Q 39. What is a parse tree ? Give an example.
Ans. A parse tree is a graphical way to show a derivation. If rule N → α is used at some point in the derivation, then there is a node labelled N that has a subtree for each token or non-terminal in α.
For example, let us use the following derivation of the expression n + (n) * n :
E ⇒ E + E ⇒ n + E ⇒ n + E * E ⇒ n + (E) * E ⇒ n + (n) * E ⇒ n + (n) * n
A corresponding parse tree is as follows :
[figure : parse tree of n + (n) * n with root E]

Q 40. Explain LL (1) parser with the help of example.
Ans. A top-down parser that uses a one-token lookahead is called an LL(1) parser.
❑ The first L indicates that the input is read from left to right.
❑ The second L says that it produces a leftmost derivation.
❑ And the 1 says that it uses one lookahead token.
The LL(1) parsing table : The parser needs to find a production to use for non-terminal N when it sees lookahead token t. To select which production to use, it suffices to have a table that has, as a key, a pair (N, t) and gives the number of a production to use.
Let us take an example with an LL(1) parsing table for the expression grammar :
1. E → TR
2. R → ε
3. R → +E
4. T → FS
5. S → ε
6. S → *T
7. F → n
8. F → (E)
Parsing table D below does this job. Each row is labelled by a non-terminal and each column by a lookahead token, or the special symbol $ that indicates the end of the input. D(N, t) is the production to use to expand N when the lookahead is t. Blank entries mean syntax error.

Table D
      n    +    *    (    )    $
  E   1              1
  R        3              2    2
  T   4              4
  S        5    6         5    5
  F   7              8

Now it is easy to use the table to control a top-down parse. Parsing n * n goes as follows :
Start with E. D(E, n) = 1, so expand E using production 1 (E → TR) :
[figure : E with children T, R]
Since D(T, n) = 4, we continue by expanding T to FS.
[figure : E → TR with T expanded to FS]
Now D(F, n) = 7, and production 7 is F → n :
[figure : F expanded to n]
The lookahead changes to * and D(S, *) = 6. Since production 6 is S → *T, the table tells us to replace S by *T :
[figure : S expanded to *T]
Now the lookahead is n, and D(T, n) = 4. After using production 4 (T → FS), the table tells us to use production 7 (F → n), giving :
[figure : the inner T expanded to FS with F → n]
The parse is almost finished. Since there are no more tokens, the lookahead is $. D(S, $) = 5 and D(R, $) = 2, which say to replace the remaining S and R by ε :
[figure : the completed parse tree of n * n]

Q 41. What do you mean by parsing conflict ?
Ans. First build the top-down parsing table for a slightly different grammar :
1. L → ε
2. L → N
3. N → E
4. N → E , N
5. E → n
The First and Follow sets for this grammar are :
First(L) = {n, ε}    First(N) = {n}    First(E) = {n}
Follow(L) = {$}      Follow(N) = {$}   Follow(E) = {',', $}
The parsing table is as follows :

Table D
      n      $
  L   2      1
  N   3, 4
  E   5

The algorithm adds two productions to one of the cells in the table. When that happens it is called a parsing conflict.
The problem is that the parser cannot know whether to use production N → E or N → E , N based on a single token of lookahead. That decision depends on whether the E is followed by a comma, and that would require a longer lookahead.

Q 42. Why can top-down parsers not handle left recursion ?
Ans. A top-down parser cannot handle left recursive productions. To understand why not, let us take a very simple left recursive grammar :
1. S → a
2. S → Sa
There is only one token a and only one non-terminal S, so the parsing table has just one entry, and both productions must go into that one table entry. The problem is that, on lookahead a, the parser cannot know if another a comes after the lookahead, but the decision of which production to use depends on that information.

Q 43. Explain shift reduce parsing.
Ans. A shift reduce parser keeps track of two things :
1. The remaining, unread, part of the input.
2.
A stack that holds tokens and non-terminals. The handle is always the top one or more symbols in the stack.
There are two main kinds of actions :
1. A shift action moves a token from the input to the top of the stack.
2. A reduce action finds a handle α on the stack and a production N → α, and replaces α by N.
There are also two minor actions :
1. An accept action indicates that the parser has successfully found a derivation.
2. An error action indicates a syntax error.

Q 44. What is SLR (1) parsing ? Explain the steps used to construct an SLR (1) table.
Ans. SLR (1) refers to simple LR parsing. It is the same as LR (0) parsing; the only difference is in the parsing table. To construct the SLR (1) parsing table, we use the canonical collection of LR(0) items. In SLR (1) parsing, we place the reduce move only in the Follow of the left-hand side.
Various steps involved in SLR(1) parsing are :
❑ For the given input string, write a context free grammar.
❑ Check the ambiguity of the grammar.
❑ Add the augment production to the given grammar.
❑ Create the canonical collection of LR(0) items.
❑ Draw the DFA (deterministic finite automaton) over the item sets.
❑ Construct the SLR (1) parsing table.
SLR (1) table construction : The steps used to construct the SLR (1) table are given below :
❑ If state i goes to some other state j on a terminal, then it corresponds to a shift move Sj in the Action part :

States    Action    Goto
   i        Sj

❑ If state i goes to some other state j on a variable (non-terminal), then it corresponds to a goto move in the Goto part.
❑ If a state i contains a final item like A → ab• , which has no transition to a next state, then the production is known as a reduce production. For all terminals X in FOLLOW (A), write the reduce entry along with the production number.

Q 45. What is the output of the syntax analysis phase ? What are the three general types of parsers for grammars ?
Ans. The parse tree is the output of the syntax analysis phase.
General types of parsers :
1. Universal parsing
2. Top down
3.
Bottom up

Q 46. What are the goals of error handler in a parser ?
Ans. The error handler in a parser has goals that are simple to state :
❑ It should report the presence of errors clearly and accurately.
❑ It should recover from each error quickly enough to be able to detect subsequent errors.
❑ It should not significantly slow down the processing of correct programs.

Q 47. Define context free language. When will you say that two CFGs are equal ?
Ans. A language that can be generated by a grammar is said to be a context free language. If two grammars generate the same language, the grammars are said to be equivalent.

Q 48. Left factor the following grammar :
S → iEtS | iEtSeS | a
E → b
Ans. The left factored grammar is
S → iEtSS' | a
S' → eS | ε
E → b

Q 49. What are the disadvantages of operator precedence parsing ?
Ans. Disadvantages of operator precedence parsing :
(i) It is hard to handle tokens like the minus sign, which has two different precedences.
(ii) Since the relationship between a grammar for the language being parsed and the operator precedence parser is tenuous, one cannot always be sure the parser accepts exactly the desired language.
(iii) Only a small class of grammars can be parsed using operator precedence techniques.

Q 50. What are kernel and non-kernel items ?
Ans. The set of items which includes the initial item S' → •S and all items whose dots are not at the left end is known as the kernel items. The set of items which have their dots at the left end, other than the initial item, is known as the non-kernel items.

Q 51. What is the difference between syntax analysis and lexical analysis ?
Ans. Difference between syntax analysis and lexical analysis :
1. Separating the syntactic structure of a language into lexical and non-lexical parts provides a convenient way of modularising the front end of a compiler.
2. Regular expressions are concise and easy-to-understand notations for tokens.
3. By using regular expressions, constructs like
identifiers, keywords, constants and whitespace can be described in a proper manner. However, CFGs are used for describing nested structures such as if-then-else and balanced parentheses.

Q 52. Define ambiguity with the help of example.
Ans. Derivation of a parse tree follows two ways :
❑ Leftmost derivation (lm)
❑ Rightmost derivation (rm)
E ⇒ E + E ⇒ id + E ⇒ id + id
[figure : parse tree of id + id]
If there exist two or more parse trees for a given CFG, then the grammar is said to be ambiguous, and the ambiguity needs to be resolved.
e.g. E → E + E | E * E | E / E | (E) | id
This is an ambiguous grammar. To resolve it, precedence and associativity need to be considered :
E → E + T | T
T → T * F | F
F → (E) | id
where T = term and F = factor.

Q 53. Explain dangling if-else problem with the help of an example.
Ans. Let the CFG be :
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other
where other is a terminal.
e.g. if E₁ then if E₂ then S₁ else S₂
Draw a parse tree of the given statement by applying the CFG; the parse tree contains every token generated by the grammar.
[figure : two distinct parse trees for if E₁ then if E₂ then S₁ else S₂, since the else may attach to either if]
The above given grammar is therefore ambiguous. To eliminate the ambiguity the grammar needs to be left factored. Given
S → iEtS | iEtSeS | a
E → b
the left factored grammar is
S → iEtSS' | a
S' → eS | ε
E → b

Q 54. For the CFG
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
generate a parse tree for w = id + id * id using leftmost derivation.
Ans. [figure : parse tree of id + id * id under the above grammar]

Q 55. Explain various types of parsing techniques.
Ans. There are two types of parsing techniques :
1. Top down (leftmost derivation)
2. Bottom up (rightmost derivation)
1. Top down parsing : It generates the parse tree starting from the root and creating the nodes of the tree in preorder. It can be viewed as finding a leftmost derivation of an input string.
Top-down parsing is classified as :
❑ Recursive descent parsing (with backtracking)
❑ Non-recursive predictive parsing (without backtracking, to overcome recursive descent parsing with backtracking)

(i) Recursive descent parsing : For every non-terminal a procedure is present, and it needs to be called.
Algorithm :
void A()
{
    Choose an A-production, A → x₁x₂ ... xₖ;
    for (i = 1 to k)
    {
        if (xᵢ is a non-terminal)
            call procedure xᵢ();
        else if (xᵢ == current input symbol)
            advance to the next input symbol;
        else
            report an error;
    }
}
A recursive descent parser consists of a set of procedures (one for each non-terminal). Execution begins with the procedure for the start symbol, which announces success if its procedure body scans the entire input string. Recursive descent parsing may require backtracking, i.e., repeated scans over the input.
e.g.
S → cAd
A → ab | a
W = cad
Sol. The algorithm matches the production S → cAd with W = cad (the given string). When A is encountered, it first uses the production A → ab. The a is matched, and the loop is entered again to match b; but 'ab' does not match the 'a' in W = cad, therefore backtracking occurs. The next production, A → a, is then tried : it matches, the input pointer is incremented, and d is also matched. The method is based on the concept of DFS (depth first search).

(ii) Non-recursive predictive parsing : This parser can be built by maintaining a stack explicitly. The parser mimics a leftmost derivation : if W is the input that has been matched so far, then the stack holds a sequence of grammar symbols α such that
S ⇒* Wα
The parser is table driven. It uses a predictive parsing table M. The table-driven parser has an input buffer, a stack containing a sequence of grammar symbols, a parsing table and an output stream.
[figure : input buffer, stack, predictive parsing program and parsing table M]
Algorithm for an input string :
(i) Set the input pointer ip to the first symbol of W,
where W is the input string provided.
(ii) Set X to the top-of-stack (ToS) symbol.
(iii) while (X != $)
(iv)    if (X == a) pop the stack and advance the input pointer ip;
(v)     else if (X is a terminal) report an error;
(vi)    else if (M[X, a] = X → y₁y₂ ... yₖ)
(vii)   {   // output the production X → y₁y₂ ... yₖ
(viii)      push yₖ, yₖ₋₁, ..., y₁ onto the
(ix)        stack such that y₁ is on the top;
(x)     }
(xi)    set X to the ToS symbol;
(xii) end while.
(Here a denotes the input symbol currently pointed to by ip.)

2. Bottom up parsing : Bottom up parsing corresponds to the construction of a parse tree for an input string beginning at the leaves and working up towards the root. Bottom up parsers are also known as shift reduce parsers. Shift reduce parsers are of two types :
❑ Operator precedence parser : It is applied to a small class of grammars.
❑ LR parsers : An LR parser scans the input string from left to right and generates a rightmost derivation tree.

Q 56. W = id + id * id
[figure : the table-driven parser with input buffer id + id * id $, stack, parsing program and parsing table M]
Ans. For the grammar
E → TE'    E' → +TE' | ε    T → FT'    T' → *FT' | ε    F → (E) | id
the moves of the predictive parser on id + id * id are :

Matched        Stack        Input            Action
               E$           id + id * id $   E → TE'
               TE'$         id + id * id $   T → FT'
               FT'E'$       id + id * id $   F → id
               id T'E'$     id + id * id $   match id
id             T'E'$        + id * id $      T' → ε
id             E'$          + id * id $      E' → +TE'
id             +TE'$        + id * id $      match +
id +           TE'$         id * id $        T → FT'
id +           FT'E'$       id * id $        F → id
id +           id T'E'$     id * id $        match id
id + id        T'E'$        * id $           T' → *FT'
id + id        *FT'E'$      * id $           match *
id + id *      FT'E'$       id $             F → id
id + id *      id T'E'$     id $             match id
id + id * id   T'E'$        $                T' → ε
id + id * id   E'$          $                E' → ε
id + id * id   $            $                accept

Q 57. What are the various types of errors ?
Ans. Common programming errors can occur at different levels of compilation. They are as follows :
1. Lexical errors : These include misspellings of identifiers, keywords or operators.
2. Syntactic errors : A missing semicolon or unbalanced parentheses.
3. Semantic errors : Incompatible value assignment or type mismatches between operator and operand.
4. Logical errors : Code not reachable, infinite loops.

Q 58. What are the common error recovery methods which can be implemented in the parser ?
Ans. There are five common error recovery methods which can be implemented in the parser :
1. Statement mode recovery : In the case when the parser encounters an error, it helps you to take corrective steps.
This allows the rest of the input and states to parse ahead, e.g., adding a missing semicolon.
2. Panic mode recovery : In the case when the parser encounters an error, this mode ignores the rest of the statement and does not process input from the erroneous input up to a delimiter, such as a semicolon. This is a simple error recovery method.
3. Phrase level recovery : The compiler corrects the program by inserting or deleting tokens, which allows it to proceed to parse from where it was. It performs local correction on the remaining input : it can replace a prefix of the remaining input with some string, and this helps the parser to continue the process.
4. Error productions : Error production recovery expands the grammar of the language with productions that generate the erroneous constructs. The parser then performs error diagnostics about that construct.
5. Global correction : The compiler should make as few changes as possible while processing an incorrect input string. Given an incorrect input string 'a' and a grammar 'c', the algorithm searches for a parse tree for a related string 'b', such that the insertions, deletions and modifications of tokens needed to transform 'a' into 'b' are as few as possible.

Q 59. Using the input string id * id, define the steps of action performed by a shift reduce parser, using a stack, where the stack initially has the end marker $.
Ans. (Taking the usual expression grammar E → T | E + T, T → F | T * F, F → id.)

Stack        Input           Action
$            id₁ * id₂ $     shift
$ id₁        * id₂ $         reduce by F → id
$ F          * id₂ $         reduce by T → F
$ T          * id₂ $         shift
$ T *        id₂ $           shift
$ T * id₂    $               reduce by F → id
$ T * F      $               reduce by T → T * F
$ T          $               reduce by E → T
$ E          $               accept

Q 60. What is YACC ? How does YACC work ? Explain the difference between LEX and YACC.
Ans. YACC is officially known as a 'parser'. Its job is to analyse the structure of the input stream and operate on the "big picture". In the course of its normal work, the parser also verifies that the input is syntactically sound. Consider the example of a C compiler : in the C language, a word can be a function name or a variable, depending on whether it is followed by a ( or an =, and there should be exactly one } for each { in the program. YACC stands for "yet another compiler compiler".
This is because this kind of analysis of text files is normally associated with writing compilers.
[figure : the YACC pipeline : a grammar specification file is given to yacc, which produces y.tab.c; the C compiler turns y.tab.c into a.out, which maps input to output]
How does YACC work ?
As YACC is designed for use with C code, it generates a parser written in C. The parser is configured for use in conjunction with a lex-generated scanner, relies on standard shared features (token types, yylval, etc.) and calls the function yylex as a scanning routine. You provide a grammar specification file, which is traditionally named using a .y extension. You invoke YACC on the .y file and it creates the y.tab.h and y.tab.c files.
