You are on page 1of 6

Name : OMKAR JOSHI

Roll No : BE20S05F004
Sub : SPCC

Assignment No. 1
Q1] Differentiate between compilation & interpretation
Ans:
Compilation Interpretation

● No translation cost at runtime: efficient ● Excellent debugging facilities: source code


Execution. known when error occurs.
● Translation cost amortized over many ● Excellent checking: both input and source
runs. are known.
● Can distribute program without revealing ● Easy to implement.
either source or compiler (commercial ● Quick feedback due to rapid edit-test cycle.
software distribution). ● Can be embedded into other applications
● Extensive (and slow) optimizations (for scripting purposes).
possible. ● Can generate and evaluate new code at
runtime (eval).

➡ Programming language can be


➡ Resulting program executes much more flexible.
much faster than if interpreted. ➡ Can be portable.
➡ Requires code generation and ➡ Inefficient.
detailed platform knowledge.

Q2] Explain different compiler data structure


Ans:
Token :
● Represented by an integer value or an enumeration literal
● Sometimes, it is necessary to preserve the string of characters that was scanned
● For example, name of an identifiers or value of a literal
Syntax Tree :
● Constructed as a pointer-based structure
● Dynamically allocated as parsing proceeds
● Nodes have fields containing information collected by the parser and semantic analyzer
Symbol Table ;
● Keeps information associate with all kinds of identifiers:
Constants, variables, functions, parameters, types, fields, etc. :
● Identifiers are entered by the scanner, parser or semantic analyzer
● Semantic analyzer adds type information and other attributes
● Code generation and optimization phases use the information in the symbol table
● Insertion, deletion and search operation needed to efficient because they are frequent
● Hash table with constant time operations is usually the preferred choice
● More than one symbol table may be used
Literal Table :
● Stores constant values and string literal in a program
● One literal table applies globally to the entire program
● Used by the code generator to:
Assign addresses for literals
Enter data definitions in the target code file
● Avoids the replication of constants and strings
● Quick insertion and lookup are essential. Deletion is not necessary
Temporary Files :
● Used historically by old compilers due to memory constraints
● Hold the data of various stages

Q3] What is specification of tokens


Ans: There are 3 specifications of tokens:
1)Strings
2) Language
3)Regular expression

Strings and Languages


v An alphabet or character class is a finite set of symbols.
v A string over an alphabet is a finite sequence of symbols drawn from that alphabet.
v A language is any countable set of strings over some fixed alphabet.
In language theory, the terms "sentence" and "word" are often used as synonyms for

"string." The length of a string s, usually written |s|, is the number of occurrences of symbols in s. For
example, banana is a string of length six. The empty string, denoted ε, is the string of length zero.

Operations on strings
The following string-related terms are commonly used:

1. A prefix of string s is any string obtained by removing zero or more symbols from the end of
string s. For example, ban is a prefix of banana.
2. A suffix of string s is any string obtained by removing zero or more symbols from the
beginning of s. For example, nana is a suffix of banana.
3. A substring of s is obtained by deleting any prefix and any suffix from s. For example, nan is
a substring of banana.
4. The proper prefixes, suffixes, and substrings of a string s are those prefixes, suffixes, and
substrings, respectively of s that are not ε or not equal to s itself.
5. A subsequence of s is any string formed by deleting zero or more not necessarily
consecutive positions of s
6. For example, baan is a subsequence of banana.
Operations on languages:
The following are the operations that can be applied to languages:
1. Union
2. Concatenation
3. Kleene closure
4. Positive closure

The following example shows the operations on strings: Let L={0,1} and S={a,b,c}

Regular Expressions
● Each regular expression r denotes a language L(r).
● Here are the rules that define the regular expressions over some alphabet Σ and the
languages that those expressions denote:

1.ε is a regular expression, and L(ε) is { ε }, that is, the language whose sole member is the empty
string.
2. If ‘a’ is a symbol in Σ, then ‘a’ is a regular expression, and L(a) = {a}, that is, the language with one
string, of length one, with ‘a’ in its one position.
3.Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then, a) (r)|(s) is a
regular expression denoting the language L(r) U L(s).

b) (r)(s) is a regular expression denoting the language L(r)L(s). c) (r)* is a regular expression
denoting (L(r))*.
d) (r) is a regular expression denoting L(r).
4.The unary operator * has highest precedence and is left associative.
5.Concatenation has second highest precedence and is left associative.
6. | has lowest precedence and is left associative.

Regular set
A language that can be defined by a regular expression is called a regular set. If two regular
expressions r and s denote the same regular set, we say they are equivalent and write r = s.

There are a number of algebraic laws for regular expressions that can be used to manipulate into
equivalent forms.
For instance, r|s = s|r is commutative; r|(s|t)=(r|s)|t is associative.

Regular Definitions
Giving names to regular expressions is referred to as a Regular definition. If Σ is an alphabet of basic
symbols, then a regular definition is a sequence of definitions of the form
dl → r 1
d2 → r2
………
dn → rn
1.Each di is a distinct name.
2.Each ri is a regular expression over the alphabet Σ U {dl, d2,. . . , di-l}.

Example: Identifiers is the set of strings of letters and digits beginning with a letter. Regular
definition for this set:

letter → A | B | …. | Z | a | b | …. | z | digit → 0 | 1 | …. | 9

id → letter ( letter | digit ) *

Q4] Difference between static & shared libraries


Ans:

properties Static library Shared library

It happens as the last step of Shared libraries are added during


the compilation process. After linking process when executable
Linking time
the program is placed in the file and libraries are added to the
memory memory.

Means Performed by linkers Performed by operating System

Static libraries are much Dynamic libraries are much


bigger in size, because smaller, because there is only one
Size
external programs are built in copy of the dynamic library that is
the executable file. kept in memory.

Executable file will have to be


External file In shared libraries, no need to
recompiled if any changes
changes recompile the executable.
were applied to external files.

Takes longer to execute,


because loading into the It is faster because shared library
Time
memory happens every time code is already in the memory.
while executing.

Compatibility Never has compatibility Programs are dependent on


having a compatible library.
issues, since all code is in Dependent programs will not work
one executable module. if the library gets removed from the
system .

Assignment No. 2
Q1] Explain lex generator
Ans: Lex is a program designed to generate scanners, also known as tokenizers, which recognize
lexical patterns in text. Lex is an acronym that stands for "lexical analyzer generator." It is intended
primarily for Unix-based systems. The code for Lex was originally developed by Eric Schmidt and
Mike Lesk.
Lex can perform simple transformations by itself but its main purpose is to facilitate lexical analysis,
the processing of character sequences such as source code to produce symbol sequences called
tokens for use as input to other programs such as parsers. Lex can be used with a parser generator
to perform lexical analysis. It is easy, for example, to interface Lex and Yacc, an open source
program that generates code for the parser in the C programming language.
Lex is proprietary but versions based on the original code are available as open source. These
include a streamlined version called Flex, an acronym for "fast lexical analyzer generator," as well as
components of OpenSolaris and Plan 9.

Q2] Explain Yacc generator


Ans: Yacc (Yet Another Compiler-Compiler) is a computer program for the Unix operating system
developed by Stephen C. Johnson. It is a Look Ahead Left-to-Right Rightmost Derivation (LALR)
parser generator, generating a LALR parser (the part of a compiler that tries to make syntactic
sense of the source code) based on a formal grammar, written in a notation similar to Backus–Naur
Form (BNF). Yacc is supplied as a standard utility on BSD and AT&T Unix. GNU-based Linux
distributions include Bison, a forward-compatible Yacc replacement.
The input to yacc describes the rules of a grammar. yacc uses these rules to produce the
source code for a program that parses the grammar. You can then compile this source code to
obtain a program that reads input, parses it according to the grammar, and takes action based on
the result.
The source code produced by yacc is written in the C programming language. It consists of a
number of data tables that represent the grammar, plus a C function named yyparse(). By default,
yacc symbol names used begin with yy. This is an historical convention, dating back to yacc's
predecessor, UNIX yacc. You can avoid conflicts with yacc names by avoiding symbols that start
with yy.
If you want to use a different prefix, indicate this with a line of the form:
%prefix prefix
at the beginning of the yacc input. For example:
%prefix ww
asks for a prefix of ww instead of yy. Alternatively, you could specify -p ww on the lex command line.
The prefix chosen should be 1 or 2 characters long; longer prefixes lead to name conflicts on
systems that truncate external names to 6 characters during the loading process. In addition, at least
1 of the characters in the prefix should be a lowercase letter (because yacc uses an all-uppercase
version of the prefix for some special names, and this has to be different from the specified prefix).

You might also like