You are on page 1of 16

CS346: Compilers

Instructor: Dr. Arnab Sarkar TA: Mr. Satish Kumar Book: Compilers: Principles, Tools and Techniques. Aho, Lam, Sethi, Ullman
10-Jan-13 CSE346: Compilers, IIT Guwahati 1

IIT Guwahati .Lecture #1 Introduction CSE346: Compilers.

C is compiled Shell scripts are interpreted Java is both compiled and interpreted CSE346: Compilers. IIT Guwahati 3 • Examples • • • 10-Jan-13 .Introduction • Compiler A program that reads a program written in one language (source language) and translates it into an equivalent program in another language (target language) (Aho et al) • Interpreter • A program that reads a source program and produces the results of executing this source.

Compilation Principles • The compiler must preserve the meaning of the program being compiled. IIT Guwahati 4 10-Jan-13 . • Other issues: • • • • • • Speed Space Energy Efficiency Feedback (Information returned back to the user) Debugging Compilation time efficiency CSE346: Compilers.

Builds a symbol table to collect and store information about source program. IIT Guwahati 5 • • • 10-Jan-13 . Using this. phases? CSE346: Compilers. Checks for syntax / semantics and provides informative messages back to user in case of errors. creates a generic Intermediate Representation (IR) of the source code.Structure of a Compiler Source code • Front-End Intermediate Representation Back-End Target code Front-end performs the analysis of the source language: • Breaks up the source code into pieces and imposes a grammatical structure.

Structure of a Compiler Source code Front-End Intermediate Representation Back-End Target code • Back-end does the target language synthesis: • Builds the desired target program from the IR and information in the symbol table. • Why do we need to separate the compilation process into front-end and back-end phases? 10-Jan-13 CSE346: Compilers. IIT Guwahati 6 .

IIT Guwahati 7 • • • • Back-End • • • 10-Jan-13 . Target Code Optimization: Improve the resulting structure. Syntax analysis (Parsing): Identify how those pieces relate to each other. CSE346: Compilers. IR Generation: Design one possible structure. IR Optimization: Simplify the intended structure.Structure of a Compiler • Front-End • Lexical analysis: Scan the source program and identify logical pieces of the description. Semantic analysis: Identify the meaning of the overall structure. Target Code Generation: Fabricate the structure.

Attribute> and passes it to the next phase syntax analysis Token-class: The symbol for this token to be used during syntax analysis Attribute: Points to symbol table entry for this token E.: The tokens for: fin = ini + rate * 60 are: • Symbol Table 1 2 3 … 10-Jan-13 CSE346: Compilers.g. 1> <=> <id.Lexical Analysis • Reads characters in the source program and groups them into meaningful sequences called lexemes. IIT Guwahati • • • • <id. 2> <+> <id. 3> <*> <60> fin ini rate … … … … … 8 . Produces as output a token of the form <Token-class.

IIT Guwahati 9 10-Jan-13 . This IR depicts the grammatical structure of the token stream. The IR is usually represented as syntax trees • • • • Interior nodes represent operation Leaves depict the arguments of the operation CSE346: Compilers.Syntax Analysis (Parsing) • Creates a tree-like IR using the tokens produced by the lexical analyser.

2> <+> <id.2> <id.: Parse tree for: fin = ini + rate * 60 Tokens: <id. IIT Guwahati + * 60 10 . 1> <=> <id.1> <id.g.Syntax Analysis (Parsing) • • E. 3> <*> <60> = <id.3> 10-Jan-13 CSE346: Compilers.

Eg: int ary[10]. etc. incompatible operands. Annotates nodes of the tree with the results. undeclared variable. 10-Jan-13 CSE346: Compilers. function called with improper arguments. IIT Guwahati 11 .Semantic Analysis • Collects context (semantic) information from syntax tree and symbol table and checks for semantic consistency with language definition. x. Semantic errors: • • • Type mismatches. x = ary * 20. • • Type checkers in the semantic analyser may also do automatic type conversions if permitted by the language specification (Coercions).

3> float float inttofloat 60 10-Jan-13 CSE346: Compilers.1> float float + <id.Semantic Analysis = <id. IIT Guwahati 12 .2> float float * <id.

2> <id. IIT Guwahati • • = <id.3> inttofloat 60 13 + * .Intermediate Code Generation • Translate language-specific constructs in the AST into more general constructs. A criterion for the level of “generality”: it should be straightforward to generate the target code from the intermediate representation chosen. Example of a form of IR (3-address code): t1 = inttofloat(60) t2 = id3 * t1 t3 = id2 + t2 id1 = t3 10-Jan-13 CSE346: Compilers.1> <id.

Eg..0 id1 = id2 + t1 • Our Example: 10-Jan-13 CSE346: Compilers. t1 = inttofloat(60) t2 = id3 * t1 t3 = id2 + t2 id1 = t3 t1 = id3 * 60. A range of optimization techniques may be applied. • • • • • Removing unused variables Suppressing generation of unreachable code segments Constant Folding Common Sub-expression Elimination Loop optimization (Removing unmodified statements in a loop).. etc. IIT Guwahati 14 .Code Optimization • • Generates streamlined code still in intermediate representation.

Target Code Generation • Map 3-address code into assembly code of the target architecture • Instruction selection: A pattern matching problem.0 id1 = id2 + t1 MOVF MULF MOVF ADDF MOVF id3. • Our Example — A possible translation using two registers: t1 = id3 * 60. R1 R2. R1 R1. R2 #60.0. R2 id2. • Register allocation: Each value should be in a register when it is used (but there is only a limited number). IIT Guwahati 15 . id1 10-Jan-13 CSE346: Compilers. • Instruction scheduling: take advantage of multiple functional units.

Next Lecture Design of the Front-End of a Very Simple Compiler 10-Jan-13 CSE346: Compilers. IIT Guwahati 16 .