
Lab Manual

System Programming & Compiler Construction

Experiment No. 1
Aim :- Design and implementation of Pass 1 of a two-pass assembler for a general machine.

Theory :- If the source program is in the assembly language of a machine, the object code is in the machine language of the same machine, and the translation is executed on that same machine. The assembler generates two files: an object file with the .obj extension and a list file with the .lst extension. The object file contains the binary code for each instruction and information about the object instructions. The list file contains the assembly language statements, the binary code for each instruction, and the offset of each instruction.

There are two phases of assembler design:
1. Analysis phase
2. Synthesis phase

These phases perform the following tasks:

1. Analysis phase :
a) Isolate the label, mnemonic opcode, operands and comment of each statement.
b) Check the validity of the mnemonic opcode by consulting the MOT (Mnemonic Opcode Table).
c) Check the number of operands required for an instruction by consulting the MOT.
d) Process labels or symbols appropriately and fill the ST (Symbol Table).
e) Update the LC (Location Counter) appropriately by consulting the MOT for the length of the instruction.
f) Ignore comments.
g) Take proper actions for pseudo opcodes by consulting the POT (Pseudo Opcode Table).

2. Synthesis phase :
a) Obtain the machine opcode corresponding to the mnemonic by consulting the MOT.
b) Fill in the addresses of symbols or labels by consulting the ST.
c) Write the above information to the output object file.

Flowchart :- Draw the flowchart for Pass 1.

Conclusion :- The assembler is implemented successfully for the given instruction set and object code is generated for the same.
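The bookkeeping of the analysis phase can be sketched in C as follows. This is a minimal illustration only: the MOT contents, the fixed one-word instruction length, the table sizes and the sample statements are assumptions made for the example, not the instruction set of any particular machine.

#include <stdio.h>
#include <string.h>

struct mot_entry { const char *mnemonic; int opcode; int length; };
struct st_entry  { char symbol[16];      int address; };

static struct mot_entry MOT[] = {
    { "LOAD", 1, 1 }, { "STORE", 2, 1 }, { "ADD", 3, 1 }, { "SUB", 4, 1 }
};
static struct st_entry ST[64];
static int st_count = 0;
static int LC = 0;                          /* location counter */

static struct mot_entry *search_mot(const char *mnemonic)
{
    for (size_t i = 0; i < sizeof MOT / sizeof MOT[0]; i++)
        if (strcmp(MOT[i].mnemonic, mnemonic) == 0)
            return &MOT[i];
    return NULL;                            /* invalid mnemonic */
}

/* Analysis-phase actions for one source statement (label may be ""). */
static void pass1_statement(const char *label, const char *mnemonic)
{
    if (label[0] != '\0') {                 /* fill the Symbol Table */
        strcpy(ST[st_count].symbol, label);
        ST[st_count].address = LC;
        st_count++;
    }
    struct mot_entry *m = search_mot(mnemonic);
    if (m != NULL)
        LC += m->length;                    /* advance LC by the instruction length */
    else
        printf("error: invalid mnemonic %s\n", mnemonic);
}

int main(void)
{
    pass1_statement("",     "LOAD");        /*        LOAD  A   -> LC 0 */
    pass1_statement("",     "ADD");         /*        ADD   B   -> LC 1 */
    pass1_statement("HERE", "STORE");       /* HERE   STORE C   -> LC 2 */
    for (int i = 0; i < st_count; i++)      /* print the ST built so far */
        printf("%-8s %d\n", ST[i].symbol, ST[i].address);
    return 0;
}

For the labelled statement, the sketch records HERE in the ST with the current LC value and then advances the LC by the instruction length taken from the MOT.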


Experiment No. 2
Aim :- Design and implementation of Pass 2 of a two-pass assembler for a general machine.

Theory :- Describe the objective of Pass 2. Explain the data structures used in Pass 2.

Flowchart :- Draw the flowchart for Pass 2.
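The central action of Pass 2 — replacing a symbolic operand by the address recorded in the Symbol Table during Pass 1 — can be sketched in C as below. The ST contents, the opcode values and the two-field object-record format are illustrative assumptions only.

#include <stdio.h>
#include <string.h>

struct st_entry { const char *symbol; int address; };

/* Symbol Table as it might look after Pass 1 (illustrative values). */
static struct st_entry ST[] = { { "LOOP", 4 }, { "DATA", 9 } };

static int lookup_symbol(const char *name)
{
    for (size_t i = 0; i < sizeof ST / sizeof ST[0]; i++)
        if (strcmp(ST[i].symbol, name) == 0)
            return ST[i].address;
    return -1;                               /* undefined symbol */
}

/* Pass 2 action for one statement: the machine opcode (from the MOT) plus
 * the operand address resolved from the ST, written as one object record. */
static void emit(FILE *obj, int opcode, const char *operand)
{
    int addr = lookup_symbol(operand);
    if (addr < 0)
        fprintf(stderr, "error: undefined symbol %s\n", operand);
    else
        fprintf(obj, "%02d %03d\n", opcode, addr);
}

int main(void)
{
    emit(stdout, 1, "DATA");                 /* e.g. LOAD DATA */
    emit(stdout, 3, "LOOP");                 /* e.g. ADD  LOOP */
    return 0;
}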


Experiment No. 3
Aim :- Design and Implementation of Macro Processor.

Theory :- A macro is a named set of instructions; using it involves a macro definition and macro calls. A macro definition contains three parts:
- A macro prototype statement
- Macro body or model statements
- Macro preprocessor statements

A macro processor is used to expand macros. It converts an assembly or HLL program into another assembly or HLL program without any traces of macros. When the macro processor comes across a macro call, it replaces the call with the corresponding group of instructions from the macro body. When it comes across a macro definition, it records information about the macro, such as the macro name, formal parameters and macro body statements, into various tables.

MDT (Macro Definition Table) : contains the body of each macro and an index for each instruction.
MNT (Macro Name Table) : contains the name of each macro, the number of parameters, and the starting and ending rows in the MDT.

The macro processor can be implemented in two passes:
- Pass I : the input is the source program.
- Pass II : the input is the intermediate code and the various tables formed in Pass I.

Databases used in Pass I are:
1) Input source file.
2) Intermediate file.
3) MDT
4) MNT
5) MDTC - macro definition table counter, used to indicate the next available entry in the MDT.
6) MNTC - macro name table counter, used to indicate the next available entry in the MNT.
7) ALA (Argument List Array) - used to substitute index markers for the formal parameters before storing the macro definition.

Databases used in Pass II are:
1) Intermediate file.
2) Output file.
3) MDT
4) MNT
5) MDTP - macro definition table pointer, used to locate the macro definition stored in the MDT and to indicate the next line of text to be used. The initial value of MDTP is the MDT index recorded in the MNT entry.
6) ALA (Argument List Array) - used to substitute the macro-call arguments for the index markers in the stored macro definition.

These tables and counters are sketched in C after the algorithm below.

Algorithm :-
1. Start.
2. Prepare and initialize all data structures: MDT, MNT, ALA.
3. Open the input file containing the macro definitions and calls in read mode.
4. Open the intermediate file in write mode.
5. Read a line from the input file into a buffer and separate the tokens.
6. If the buffer contains the word MACRO, it is a macro definition, so process it with the following sequence:
   i. Read the next line into a buffer and separate the tokens.
   ii. Check whether the first token is in the MNT; if not, store it at the position pointed to by MNTC.
   iii. Else, display an error that it is a duplicate macro definition.
   iv. Store the formal parameters, separated by commas, in the ALA along with the macro name. Store the number of parameters in the MNT.
   v. Read all the remaining lines until MEND is reached and store them in the MDT at the position pointed to by MDTC. Replace each parameter name by its corresponding index in the ALA. Store the starting MDTC value in the MNT.
7. If not, check whether the token is present in the MNT. If yes, process the macro call as follows:
   i. Check its entry in the MNT, including the number of parameters.
   ii. Read the next tokens and store the actual parameters in the ALA.
   iii. Check the MDT index recorded in the MNT and initialize MDTP with it.
   iv. Read the MDT line into a buffer and replace each index number by the actual parameter name.
   v. Write the buffer to the output file.
   vi. Repeat steps (iv) and (v) until MEND is encountered in the MDT.
8. If the tokens are present in neither the MNT nor the MDT, write the line as it is to the intermediate file.
9. Repeat steps (5) to (8) until the end of the input file.

Conclusion :- The use of macros avoids repetition of instructions in the program. The expanded code gives the series of instructions which must be executed sequentially.
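The tables described above can be sketched in C as follows. The array sizes, field widths and the sample macro INCR with its single formal parameter &ARG are illustrative assumptions; a real implementation would fill these tables while reading the source file, as per the Pass I steps of the algorithm.

#include <stdio.h>
#include <string.h>

/* Macro Name Table entry */
struct mnt_entry {
    char name[16];       /* macro name                              */
    int  nparams;        /* number of formal parameters             */
    int  mdt_start;      /* index of the first body line in the MDT */
};

static char MDT[100][80];          /* Macro Definition Table: one line per row */
static struct mnt_entry MNT[10];   /* Macro Name Table                         */
static char ALA[10][16];           /* Argument List Array                      */

static int MDTC = 0;               /* next free entry in the MDT */
static int MNTC = 0;               /* next free entry in the MNT */

int main(void)
{
    /* Record a definition  INCR &ARG  /  A 1,&ARG  /  MEND  by hand,
     * the way Pass I would after reading the MACRO pseudo-op.        */
    strcpy(MNT[MNTC].name, "INCR");
    MNT[MNTC].nparams   = 1;
    MNT[MNTC].mdt_start = MDTC;
    MNTC++;

    strcpy(ALA[0], "&ARG");                 /* formal parameter            */
    strcpy(MDT[MDTC++], "A 1,#1");          /* &ARG replaced by index #1   */
    strcpy(MDT[MDTC++], "MEND");

    printf("MNT: %-8s params=%d mdt_start=%d\n",
           MNT[0].name, MNT[0].nparams, MNT[0].mdt_start);
    for (int i = 0; i < MDTC; i++)
        printf("MDT[%d]: %s\n", i, MDT[i]);
    return 0;
}

During expansion (Pass II), MDTP would start at mdt_start and the index markers such as #1 would be replaced by the actual arguments stored in the ALA for the call.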


Experiment No. 4
Aim :- Design a Lexical analyzer for a language whose grammar is known.

Theory :- The lexical analyzer is the first interface between the source program and the compiler. It reads the source program character by character from a file and recognizes logically cohesive units, called lexemes or tokens. The tokens can be keywords, identifiers, operators, labels, punctuation symbols, digits, etc.

Activity of the lexical analyzer:
Input: source program in an HLL.
Output: stream of tokens.

Many a time the analyzer needs to look ahead to confirm a token, e.g. DO 5 I = 1.25 (in Fortran). This is not a DO statement; it is a valid assignment statement. In Fortran, blanks are allowed inside identifiers, so DO5I is a valid identifier, provided 1.25 is there and not 1,25. If 1,25 is there, then it is a valid DO statement. In such a case, several characters need to be scanned ahead to take the right decision. In C, the same situation arises with the + and ++ operators.

Algorithm :-
1. Start.
2. Initialize a character array storing all keywords of C.
3. Open the source file in read mode.
4. Read one character at the current position of the file pointer.
5. Store the character in a character array if it is not already present.
6. Display the character if it is a symbol.
7. Repeat these steps till the end of the file.
8. Rewind the file and initialize the file pointer to the start of the file.
9. Read a character at the file pointer position.
10. Check its ASCII value; if it falls within the range of letters or digits, add it to the character string being built.
11. If the character is a symbol, check whether the string collected so far is a keyword. If it is, check whether it is already present; if not, add it to a temporary array and display it with its priority.
12. If the character is a punctuation mark, store it in a separate array and display it with its priority.
13. Repeat from step (9) till the end of the file.
14. Display all the arrays, i.e. the symbol table, literal table and terminal table.
15. Stop.

Conclusion :- The lexical analyzer is implemented successfully for a subset of C.
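A minimal sketch of the token-recognition step in C is given below. The keyword list, the hard-coded input line and the three token classes are illustrative assumptions; the full experiment additionally maintains the symbol, literal and terminal tables described in the algorithm.

#include <stdio.h>
#include <ctype.h>
#include <string.h>

static const char *keywords[] = { "int", "if", "else", "while", "return" };

static int is_keyword(const char *s)
{
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(keywords[i], s) == 0)
            return 1;
    return 0;
}

int main(void)
{
    const char *src = "int x = y + 10;";   /* stands in for the source file */
    char word[64];
    int  len = 0;

    for (int i = 0; src[i] != '\0'; i++) {
        if (isalnum((unsigned char)src[i])) {
            word[len++] = src[i];          /* keep building the lexeme */
        } else {
            if (len > 0) {                 /* a lexeme just ended: classify it */
                word[len] = '\0';
                if (is_keyword(word))
                    printf("keyword     %s\n", word);
                else if (isdigit((unsigned char)word[0]))
                    printf("constant    %s\n", word);
                else
                    printf("identifier  %s\n", word);
                len = 0;
            }
            if (!isspace((unsigned char)src[i]))
                printf("symbol      %c\n", src[i]);   /* operator/punctuation */
        }
    }
    return 0;
}

For the sample line, the sketch reports int as a keyword, x and y as identifiers, 10 as a constant, and =, + and ; as symbols.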


Experiment No. 5
Aim :- Design and Implementation of a simple Parser using Lex and Yacc.

Theory :- A compiler or interpreter for a programming language is often decomposed into two parts:
1. Read the source program and discover its structure.
2. Process this structure, e.g. to generate the target program.

Lex and Yacc can generate program fragments that solve the first task. The task of discovering the source structure is itself decomposed into subtasks:
1. Split the source file into tokens (Lex).
2. Find the hierarchical structure of the program (Yacc).

Lex - A Lexical Analyzer Generator
Lex helps write programs whose control flow is directed by instances of regular expressions in the input stream. It is well suited for editor-script type transformations and for segmenting input in preparation for a parsing routine. Lex source is a table of regular expressions and corresponding program fragments. The table is translated to a program which reads an input stream, copying it to an output stream and partitioning the input into strings which match the given expressions. As each such string is recognized, the corresponding program fragment is executed. The recognition of the expressions is performed by a deterministic finite automaton generated by Lex. The program fragments written by the user are executed in the order in which the corresponding regular expressions occur in the input stream.

    Lex source program (lex.l)  --> [ Lex compiler ] --> lex.yy.c
    lex.yy.c                    --> [ C compiler   ] --> a.out
    input stream                --> [ a.out        ] --> sequence of tokens

Yacc: Yet Another Compiler-Compiler
Computer program input generally has some structure; in fact, every computer program that does input can be thought of as defining an "input language" which it accepts. An input language may be as complex as a programming language, or as simple as a sequence of numbers. Unfortunately, usual input facilities are limited, difficult to use, and often are lax about checking their inputs for validity.


Yacc provides a general tool for describing the input to a computer program. The Yacc user specifies the structure of the input, together with code to be invoked as each such structure is recognized. Yacc turns such a specification into a subroutine that handles the input process; frequently, it is convenient and appropriate to have most of the flow of control in the user's application handled by this subroutine.

    Yacc specification (translate.y) --> [ Yacc compiler ] --> y.tab.c
    y.tab.c                          --> [ C compiler    ] --> a.out
    input                            --> [ a.out         ] --> output

A single-pass compiler can be built using Lex and Yacc if the semantic actions directly generate the target code instead of an intermediate representation. The output of Lex can be given as input to Yacc to generate the target code.

    Lex specification    --> [ LEX  ] --> Scanner
    Syntax specification --> [ YACC ] --> Parser

Algorithm :-

1. Start.
2. Write the Lex specification and actions for arithmetic expression evaluation and save them in a file with the .l extension.
3. Write the syntax specification for arithmetic expressions in a file and save it with the .y extension.
4. Compile the Lex file using the lex tool to get lex.yy.c.
5. Compile the Yacc file using the yacc tool to get y.tab.c.
6. Compile the files lex.yy.c and y.tab.c with a C compiler to get the a.out file.
7. Execute a.out to get the final output.
8. Stop.

Conclusion :- The parser has been successfully implemented using the Lex and Yacc tools, such as Flex and Berkeley Yacc.
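A minimal pair of specifications for the algorithm above is sketched here; the file names calc.l and calc.y and the particular grammar are illustrative assumptions. The Lex actions are C code, and the Yacc precedence declarations resolve the ambiguity between + and *.

calc.l :

%{
#include <stdlib.h>
#include "y.tab.h"      /* token definitions produced by yacc -d */
%}
%%
[0-9]+      { yylval = atoi(yytext); return NUMBER; }
[ \t]       { /* skip blanks and tabs */ }
\n          { return '\n'; }
.           { return yytext[0]; }   /* operators and parentheses */
%%
int yywrap(void) { return 1; }

calc.y :

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}
%token NUMBER
%left '+' '-'
%left '*' '/'
%%
lines : /* empty */
      | lines expr '\n'    { printf("= %d\n", $2); }
      ;
expr  : expr '+' expr      { $$ = $1 + $3; }
      | expr '-' expr      { $$ = $1 - $3; }
      | expr '*' expr      { $$ = $1 * $3; }
      | expr '/' expr      { $$ = $1 / $3; }
      | '(' expr ')'       { $$ = $2; }
      | NUMBER             { $$ = $1; }
      ;
%%
int main(void) { return yyparse(); }

These can be built as in steps (4) to (7): lex calc.l; yacc -d calc.y; cc lex.yy.c y.tab.c -o a.out; ./a.out (the -d option makes yacc emit y.tab.h, which the Lex file includes).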


Experiment No. 6
Aim :- Implementation of code optimization techniques.

Theory :- The code produced by straightforward compiling algorithms can often be made to run faster, take less space, or both. This improvement is achieved by program transformations called code optimizations. The optimizer tries to:
- Eliminate the overhead of language abstractions
- Map the source program onto the hardware efficiently
- Hide hardware weaknesses and utilize hardware strengths
- Equal the efficiency of a good assembly programmer

Code optimization techniques are generally applied after syntax analysis, usually both before and during code generation. A transformation of a program is called local if it can be performed by looking only at the statements in a basic block; otherwise it is called global. Many transformations can be performed at both the local and global levels. Techniques for implementing these transformations fall into two groups:
- Function-preserving transformations (local optimization)
- Loop optimization

Function-preserving transformations (local optimization)
There are a number of ways in which a compiler can improve a program without changing the function it computes. Some examples of function-preserving transformations are:
1. Common sub-expression elimination - an expression evaluation is removed from a place in the program if an equivalent value has already been computed and can be reused.
2. Copy propagation - if a copy of a variable or expression is made, later uses can refer to the original as long as its value remains unchanged.
3. Dead code elimination - a piece of code is said to be dead if the results of evaluating it are not used anywhere in the program. Such code can be eliminated safely.
4. Constant folding - operators in the source program whose operands are all constants can be evaluated at compile time and the expression replaced by the resulting constant; this is called folding of the operation.

Loop optimization
The execution time of a program can be improved if the number of instructions in inner loops is decreased. Some techniques of loop optimization are:
1. Code motion - seeks to improve the execution time of a program by moving the evaluation of an expression to another part of the program, typically moving loop-invariant computations out of the loop so that frequently executed statements or expressions are reduced.
2. Induction variable elimination - focuses on identifying variables that are incremented by fixed amounts with each iteration. These include loop control variables and other variables that depend on the loop control variables in fixed ways.
3. Reduction in strength - an expensive operation is replaced by a cheaper one.
4. Loop unrolling - involves replicating the body of a loop to reduce the number of loop tests that must be carried out when the number of iterations is constant.


Algorithm :-
1. Start.
2. Accept the source program in the form of three-address code.
3. Check for dead code in the source program.
4. Remove the dead code and reduce the code.
5. Check for common sub-expressions in the source program.
6. Eliminate those expressions.
7. Stop.
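The local transformations named above can be illustrated with a small before/after example in C; the function and variable names are hypothetical, and the three-address style is only imitated with simple assignments.

#include <stdio.h>

/* The same computation before and after local optimization. */
int unoptimized(int a, int b)
{
    int t1 = a * 4;
    int t2 = a * 4;      /* common sub-expression: identical to t1     */
    int t3 = 2 * 3;      /* operands are constants: can be folded to 6 */
    int t4 = t1 + b;     /* t4 is never used afterwards: dead code     */
    return t2 + t3;
}

int optimized(int a, int b)
{
    (void)b;             /* b is no longer needed once the dead code is gone */
    int t1 = a * 4;      /* single evaluation reused in place of t1 and t2   */
    return t1 + 6;       /* constant 2*3 folded at compile time              */
}

int main(void)
{
    /* Both versions compute the same result (26 for a=5, b=7). */
    printf("%d %d\n", unoptimized(5, 7), optimized(5, 7));
    return 0;
}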

Conclusion :- Dead code elimination and common sub-expression elimination techniques have been implemented successfully.


Experiment No. 7
Aim :- Generate target code for the optimized code, considering the target machine to be x86.

Theory :- Code generation is one of the least formalized subjects in compiler construction. The task of the code generation phase of a compiler is to take as input a given internal-form representation of the source program and to produce as output an equivalent sequence of instructions in the language of the object machine. The purpose of this phase is to produce the appropriate code in either assembly or machine language. This phase takes the parse tree (or other intermediate code) as input and uses code productions or rules; these productions define the operators encountered in the parse tree. The phase tries to transform intermediate code into a form from which more efficient target code can be produced.

    Source program --> [ Front end: analysis activities ] --> IC
    IC             --> [ Code optimization ]              --> IC
    IC             --> [ Code generator ]                 --> Target program
    (The symbol table, literal table and other tables are consulted by all phases.)

Algorithm :-
1. Start.
2. Open the source file containing the optimized code, in the form of three-address code, in read mode.
3. Store the set of assembly language instructions corresponding to each three-address-code operation.
4. Read the input file line by line.
5. Replace each line of three-address code by the corresponding set of assembly language instructions.
6. Repeat steps (4) and (5) until the end of the file.
7. Stop.

Conclusion :- Target code in x86 assembly language is generated for the optimized code.
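The replacement step (5) can be sketched in C as below: each three-address statement of the form result = op1 OP op2 is mapped to a fixed x86 template. The function name and the treatment of every operand as a named memory location are simplifying assumptions; register allocation is ignored.

#include <stdio.h>

/* Emit x86 assembly for one three-address statement: result = op1 OP op2 */
static void gen_x86(FILE *out, const char *result,
                    const char *op1, char op, const char *op2)
{
    fprintf(out, "    mov  eax, %s\n", op1);        /* load first operand  */
    switch (op) {
    case '+': fprintf(out, "    add  eax, %s\n", op2); break;
    case '-': fprintf(out, "    sub  eax, %s\n", op2); break;
    case '*': fprintf(out, "    imul eax, %s\n", op2); break;
    default:  fprintf(out, "    ; unsupported operator %c\n", op); break;
    }
    fprintf(out, "    mov  %s, eax\n", result);     /* store the result    */
}

int main(void)
{
    gen_x86(stdout, "t1", "a",  '+', "b");   /* t1 = a + b  */
    gen_x86(stdout, "x",  "t1", '*', "c");   /* x  = t1 * c */
    return 0;
}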



Experiment No. 8
Aim :- Study of Different Debugger Tools.

Theory :- A debugger or debugging tool is a computer program that is used to test and debug other programs (the "target" programs). When the program "crashes" or reaches a preset condition, the debugger shows the position in the original code if it is a source-level (symbolic) debugger, as is now common in integrated development environments; if it is a low-level or machine-language debugger, it shows the line in the disassembly. Typically, debuggers also offer more sophisticated functions such as running a program step by step (single-stepping), stopping (breaking) at some event or specified instruction by means of a breakpoint, and tracking the values of some variables. Some debuggers can modify the state of the program while it is running, rather than merely observe it. It may also be possible to continue execution at a different location in the program to bypass a crash or logical error.

List of debuggers:

1. Turbo Debugger
Turbo Debugger was a machine-level debugger for MS-DOS executables sold by Borland. It provided a full-screen debugger with powerful capabilities for watching the execution of instructions, monitoring machine registers, etc. Later versions were able to step through source code compiled with Borland compilers set to provide debugging information. The original Turbo Debugger was a stand-alone product introduced in 1989, along with Turbo Assembler and the second version of Turbo C. Later, all three of these products were included in the Borland C++ and Borland Pascal suites of products for MS-DOS. These suites, aimed at professional software developers, merged the Turbo C or Borland Pascal IDE with several other tools such as a debugger, a stand-alone assembler, a profiler, etc. The current debuggers in products such as C++ Builder and Delphi are based on the Windows debugger introduced with the first Borland C++ and Pascal versions for Windows.

The final version of Turbo Debugger came with several versions of the debugger program: TD.EXE was the basic debugger; TD286.EXE ran in protected mode; and TD386.EXE was a virtual debugger which used the TDH386.SYS device driver to communicate with TD.EXE. The TDH386.SYS driver also added breakpoints supported in hardware by the 386 and later processors to all three debugger programs. The only real difference between TD386 and the other two debuggers was that TD386 allowed some extra breakpoints that the others did not (I/O access breaks, ranges greater than 16 bytes, and so on). There was also a debugger for Windows 3 (TDW.EXE), and remote debugging was supported.

2. Microsoft Visual Studio Debugger
The Microsoft Visual Studio Debugger is a debugger that ships with all versions of Microsoft Visual Studio. This debugger owes much of its feel and functionality to CodeView, a standalone, text-based debugger that shipped with Microsoft Visual C++ version 1.5 and earlier. More advanced features of the most recent versions of this debugger include:
- Full symbol and source integration.
- Attaching to and detaching from processes.
- Integrated debugging across programs written in both .NET and native Windows languages.
- Remote machine debugging.
- Full support for C++, including templates and the standard library.
- Debugging ASP.NET Web Services.
- Standard as well as more advanced breakpoint features, including conditional, address and data breakpoints.
- Many ways of viewing program state and data, including multiple watch windows, threads, call stack and modules; the way library and user data types are displayed can be configured.
- Scriptability, or the ability to control the debugger via a macro or scripting language; any language which can talk to COM can be used.
- Edit-and-continue support, enabling source code changes and recompilation without having to restart the program (32-bit applications only).
- Local and remote debugging of SQL stored procedures on supported versions of Microsoft SQL Server.

The main shortcoming of the Visual Studio Debugger is its inability to trace into kernel-mode code. However, this is possible using the free VisualDDK extension. Alternatively, kernel-mode debugging of Windows is generally performed using WinDbg, KD or SoftICE.

3. Java Platform Debugger Architecture
The Java Platform Debugger Architecture is a collection of APIs for debugging Java code:
- Java Debugger Interface (JDI) - defines a high-level Java language interface which developers can easily use to write remote debugger application tools.
- Java Virtual Machine Debug Interface (JVMDI) - was deprecated in J2SE 5.0 in favor of JVM TI, and was removed in Java SE 6.
- Java Debug Wire Protocol (JDWP) - defines the communication between the debuggee (a Java application) and the debugger processes.
- Java Virtual Machine Tools Interface (JVM TI) - a native interface which helps to inspect the state and control the execution of applications running in the Java Virtual Machine (JVM).

4. IBM Rational Purify
Purify is a memory debugger program used by software developers to detect memory access errors in programs, especially those written in C or C++. It was originally written by Reed Hastings of Pure Software. Purify allows dynamic verification, a process by which a program discovers errors that occur when the program runs, much like a debugger. Static verification or static code analysis, by contrast, involves detecting errors in the source code without ever compiling or running it, just by discovering logical inconsistencies. The type checking performed by a C compiler is an example of static verification.

When a program is linked with Purify, verification code is automatically inserted into the executable by parsing and adding to the object code, including libraries. That way, if a memory error occurs, the program will print out the exact location of the error, the memory address involved, and other relevant information. Purify also detects memory leaks. By default, a leak report is generated at program exit, but one can also be generated by calling the Purify leak-detection API from within an instrumented application.

The errors that Purify discovers include array bounds reads and writes, attempts to access unallocated memory, freeing unallocated memory, and memory leaks. It is essential to note that most of these errors are not fatal, and often there is no way to detect them just by running the program, except by observing that something is wrong due to incorrect program behavior. Purify therefore helps enormously by detecting these errors and telling the programmer exactly where they occur. Because Purify works by instrumenting all the object code, it detects errors that occur inside third-party or operating system libraries. These errors are often caused by the programmer passing incorrect arguments to library calls, or by misunderstandings about the protocols for freeing data structures used by the libraries. These are often the most difficult errors to find and fix.



Experiment No. 9
Aim :- Implement a Shift-Reduce Parser.

Theory :- A simple bottom-up parser, known as a shift-reduce parser, can be implemented using a stack to hold a sequence of terminal and non-terminal symbols. Symbols from the input string can be shifted onto this stack, or the items on the stack can be reduced by applying a grammar rule whose right-hand side matches the symbols on the top of the stack. This is a bottom-up parser: starting from the input sequence and making reductions, we aim to end up with the goal symbol. The reduction of a sentential form is achieved by substituting the left side of a production for a string which matches the right side, rather than by substituting the right side of a production whose left side appears as a non-terminal in the sentential form. A bottom-up parsing algorithm employs a parse stack, which contains a possible sentential form of terminals and/or non-terminals. As we read each terminal from the input string we push it onto the parse stack, and then examine the top elements of the stack to see whether we can make a reduction. Some terminals may remain on the parse stack quite a long time before they are finally popped off and discarded.

Simulation of a shift-reduce parser
Grammar rules : E -> E+E | E*E | (E) | id
Input string  : id + id * id

    Stack        Input             Action
    $            id + id * id $    shift
    $id          + id * id $       reduce by E -> id
    $E           + id * id $       shift
    $E+          id * id $         shift
    $E+id        * id $            reduce by E -> id
    $E+E         * id $            shift
    $E+E*        id $              shift
    $E+E*id      $                 reduce by E -> id
    $E+E*E       $                 reduce by E -> E*E
    $E+E         $                 reduce by E -> E+E
    $E           $                 accept



Algorithm :-
1) Accept the grammar rules from the user.
2) Accept the string to be parsed from the user.
3) Read the string character by character and identify a token.
4) Check which grammar rule has a right-hand side that matches the tokens on top of the stack.
5) Replace those tokens by the non-terminal on the left-hand side of the rule.
6) If not at the end of the string, go to step (3).
7) If the end of the string has been reached and only the single starting non-terminal remains after reduction, the string is accepted.
8) Otherwise the string is rejected, as it is not a valid string generated by the given grammar.
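A minimal sketch of the algorithm in C is given below for the grammar E -> E+E | E*E | (E) | id, with the token id abbreviated to the single character 'i'. The greedy reduction order differs from the hand simulation in the theory section (it reduces E+E as soon as possible), but it still accepts strings such as id + id * id; the fixed input string is an illustrative assumption.

#include <stdio.h>
#include <string.h>

static char stack[100];
static int  top = 0;                  /* number of symbols on the parse stack */

/* Reduce the top of the stack as long as some right-hand side matches. */
static void try_reduce(void)
{
    int reduced = 1;
    while (reduced) {
        reduced = 0;
        if (top >= 1 && stack[top - 1] == 'i') {             /* E -> id        */
            stack[top - 1] = 'E';
            reduced = 1;
        } else if (top >= 3 && stack[top - 3] == 'E' &&
                   (stack[top - 2] == '+' || stack[top - 2] == '*') &&
                   stack[top - 1] == 'E') {                   /* E -> E+E | E*E */
            top -= 2;
            stack[top - 1] = 'E';
            reduced = 1;
        } else if (top >= 3 && stack[top - 3] == '(' &&
                   stack[top - 2] == 'E' && stack[top - 1] == ')') {
            top -= 2;                                         /* E -> (E)       */
            stack[top - 1] = 'E';
            reduced = 1;
        }
    }
}

int main(void)
{
    const char *input = "i+i*i";              /* stands for id + id * id */
    for (size_t k = 0; k < strlen(input); k++) {
        stack[top++] = input[k];              /* shift the next token    */
        try_reduce();                         /* reduce as far as possible */
    }
    if (top == 1 && stack[0] == 'E')          /* only the start symbol remains */
        printf("string accepted\n");
    else
        printf("string rejected\n");
    return 0;
}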

Conclusion :- The shift-reduce parser for the given set of grammar rules has been implemented using a stack.



Experiment No. 10
Aim :- Create a Dynamic Link Library using VB 6.0.

Theory :-

