
SCHOOL OF INFORMATION AND

COMMUNICATION TECHNOLOGY

COMPILER DESIGN LAB


AI 383

NAME- ANUSHKA SRIVASTAVA


ROLL NO- 215/UAI/031
BRANCH- B.TECH AI
SEM-5th
INDEX

S.no Program Date Signature

1. Practice of LEX/YACC of Compiler writing.
2. Write a program to check whether a string belongs to grammar or not.
3. Write a program to generate a parse tree.
4. Write a program to find leading terminals.
5. Write a program to find trailing terminals.
6. Write a program to compute FIRST of non-terminals.
7. Write a program to compute FOLLOW of non-terminals.

1. Practice of LEX/YACC of Compiler writing.

Introduction-

Some of the most time-consuming and tedious parts of writing a compiler involve the
lexical scanning and syntax analysis. Luckily, there is freely available software to assist in
these functions. While these tools will not do everything for you, they enable faster
implementation of the basic functions. Lex and Yacc are the most commonly used
packages, with Lex managing token recognition and Yacc handling the syntax. They
work well together, but can also be used individually.
Both operate in a similar manner: instructions for token recognition or grammar rules
are written in a special file format. The text files are then read by lex and/or yacc to
produce C code, and this resulting source code is compiled to make the final application. In
practice the lexical instruction file has a “.l” suffix and the grammar file has a “.y” suffix.
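
For reference, a typical build sequence looks like the following (the file names scanner.l and parser.y are only illustrative; on some systems the lex runtime library must also be linked with -ll):

lex scanner.l                    # generates lex.yy.c
yacc -d parser.y                 # generates y.tab.c and y.tab.h
cc lex.yy.c y.tab.c -o myparser  # compile the generated C code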

LEX-
The file format for a lex file consists of (4) basic sections:
• The first is an area for C code that will be placed verbatim at the beginning of the
generated source code. Typically it will be used for things like #include, #define, and
variable declarations.
• The next section is for definitions of token patterns to be recognized. These are not
mandatory, but in general they make the rules section easier to read and shorter.
• The third section sets the pattern for each token that is to be recognized, and can also
include C code to be called when that token is identified.
• The last section is for more C code (generally subroutines) that will be appended to the
end of the generated C code. This would typically include a main function if lex is to be
used by itself.
• The format is applied as follows (the use and placement of the % symbols are
necessary):
%{
//header c code
%}
//definitions
%%
//rules
%%
//subroutines
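
As a minimal, self-contained sketch of this layout (the token patterns, actions, and the word_count variable are only illustrative, not part of the lab exercises), a lex file that classifies numbers and words might look like this:

%{
#include <stdio.h>
int word_count = 0; /* illustrative counter */
%}
DIGIT [0-9]
%%
{DIGIT}+   { printf("NUMBER: %s\n", yytext); }
[a-zA-Z]+  { word_count++; printf("WORD: %s\n", yytext); }
[ \t\n]    ; /* skip whitespace */
.          { printf("UNKNOWN: %s\n", yytext); }
%%
int main(void) { yylex(); printf("words: %d\n", word_count); return 0; }
int yywrap(void) { return 1; }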

YACC-
The format for a yacc file is similar, but includes a few extras.
• One area (introduced by %token) is a list of terminal symbols. You do not need to
list single-character ASCII symbols, but anything else, including multi-character
symbols (e.g. “==”), needs to be in this list.
• The next is an area for C code that will be placed verbatim at the beginning of the
generated source code. Typically it will be used for things like #include, #define, and
variable declarations.
• The next section is for definitions; none of the following examples utilize this area.
• The fourth section gives the grammar rule for each construct that is to be recognized,
and can also include C code to be called when that rule is matched.
• The last section is for more C code (generally subroutines) that will be appended to the
end of the generated C code. This would typically include a main function if yacc is to be
used by itself.
• The format is applied as follows (the use and placement of the % symbols are
necessary):

%token RESERVED WORDS GO HERE


%{
//header c code
%}
//definitions
%%
//rules
%%
//subroutines
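
As a minimal, self-contained sketch of this layout (the grammar and the hand-written yylex are only illustrative, not part of the lab exercises), a yacc file that accepts sums of numbers might look like this:

%token NUMBER
%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}
%%
line : expr '\n' { printf("valid expression\n"); }
     ;
expr : expr '+' NUMBER
     | NUMBER
     ;
%%
/* A hand-written yylex so the example stands alone; normally lex provides it. */
int yylex(void) {
    int c = getchar();
    while (c == ' ') c = getchar();
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    return c == EOF ? 0 : c;
}
int main(void) { return yyparse(); }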

These formats and general usage will be covered in greater detail in the following (4)
sections. In general it is best not to modify the resulting C code, as it is overwritten each
time lex or yacc is run. Most desired functionality can be handled within the lexical and
grammar files, but there are some things that are difficult to achieve and may require
editing of the C file.
As a side note, the functionality of these programs has been duplicated by the GNU open
source projects Flex and Bison. These can be used interchangeably with Lex and Yacc for
everything this document will cover, and for most other uses as well.

2. Write a program to check whether a string belongs to grammar or not.

def is_valid_string(input_string):
    # Checks membership in the balanced 'a'/'b' language, e.g. the
    # grammar S -> a S b S | epsilon ('a' opens, 'b' closes).
    stack = []
    for char in input_string:
        if char == 'a':
            stack.append('a')
        elif char == 'b':
            if not stack:
                return False  # 'b' without a corresponding 'a'
            stack.pop()
        else:
            return False  # invalid character
    return not stack  # True only if every 'a' was matched by a 'b'

# Test the program
input_string = input("Enter a string: ")
if is_valid_string(input_string):
    print("The string belongs to the grammar.")
else:
    print("The string does not belong to the grammar.")

OUTPUT-

3. Write a program to generate a parse tree.

class Node:
    def __init__(self, value):
        self.value = value
        self.children = []

def generate_parse_tree(expression):
    # Builds a simple nested sketch of the expression; it does not apply
    # operator precedence, it just descends on '+', '*' and '('.
    tokens = expression.replace(" ", "")  # remove spaces; single-character tokens only
    root = Node("Expression")
    current_node = root
    stack = [root]

    for token in tokens:
        if token.isdigit():
            current_node.children.append(Node("Number: " + token))
        elif token in ['+', '*']:
            # Record the operator, then descend into a fresh subtree
            # for the right-hand operand.
            current_node.children.append(Node("Operator: " + token))
            new_node = Node("Expression")
            current_node.children.append(new_node)
            stack.append(new_node)
            current_node = new_node
        elif token == '(':
            new_node = Node("Expression")
            current_node.children.append(new_node)
            stack.append(new_node)
            current_node = new_node
        elif token == ')':
            stack.pop()
            if stack:
                current_node = stack[-1]

    return root

def print_parse_tree(node, depth=0):
    if node is not None:
        print(" " * depth + node.value)
        for child in node.children:
            print_parse_tree(child, depth + 1)

# Example usage:
expression = "3 + 4 * (5 + 2)"
parse_tree = generate_parse_tree(expression)
print_parse_tree(parse_tree)
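
Tracing the code by hand on the example expression, print_parse_tree should produce the following (one leading space per depth level):

Expression
 Number: 3
 Operator: +
 Expression
  Number: 4
  Operator: *
  Expression
   Expression
    Number: 5
    Operator: +
    Expression
     Number: 2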

OUTPUT-

4. Write a program to find leading terminals.

def find_leading_terminals(grammar):
    # LEADING(A): terminals that can appear at the start of a string derived from A.
    leading_terminals = {non_terminal: set() for non_terminal in grammar}

    # Iterate to a fixpoint so that a rule like S -> Ab picks up
    # LEADING(A) even when 'A' is defined after 'S' in the dictionary.
    changed = True
    while changed:
        changed = False
        for non_terminal, productions in grammar.items():
            for production in productions:
                first_symbol = production[0]
                if first_symbol.islower():  # terminal symbol
                    new_symbols = {first_symbol}
                elif first_symbol in grammar:  # non-terminal symbol
                    new_symbols = leading_terminals[first_symbol]
                else:
                    continue
                if not new_symbols <= leading_terminals[non_terminal]:
                    leading_terminals[non_terminal] |= new_symbols
                    changed = True

    return leading_terminals

# Example usage:
# Grammar: S -> Ab | Bc | d, A -> a, B -> b
grammar = {
    'S': ['Ab', 'Bc', 'd'],
    'A': ['a'],
    'B': ['b']
}

leading_terminals = find_leading_terminals(grammar)

print("Leading Terminals:")
for non_terminal, terminals in leading_terminals.items():
    print(f"{non_terminal}: {terminals}")

OUTPUT-

5. Write a program to find trailing terminals.

def find_trailing_terminals(grammar):
    # TRAILING(A): terminals that can appear at the end of a string derived from A.
    trailing_terminals = {non_terminal: set() for non_terminal in grammar}

    # Iterate to a fixpoint so that a rule like S -> aA picks up
    # TRAILING(A) even when 'A' is defined after 'S' in the dictionary.
    changed = True
    while changed:
        changed = False
        for non_terminal, productions in grammar.items():
            for production in productions:
                last_symbol = production[-1]
                if last_symbol.islower():  # terminal symbol
                    new_symbols = {last_symbol}
                elif last_symbol in grammar:  # non-terminal symbol
                    new_symbols = trailing_terminals[last_symbol]
                else:
                    continue
                if not new_symbols <= trailing_terminals[non_terminal]:
                    trailing_terminals[non_terminal] |= new_symbols
                    changed = True

    return trailing_terminals

# Example usage:
# Grammar: S -> aA | bB | c, A -> d, B -> e
grammar = {
    'S': ['aA', 'bB', 'c'],
    'A': ['d'],
    'B': ['e']
}

trailing_terminals = find_trailing_terminals(grammar)

print("Trailing Terminals:")
for non_terminal, terminals in trailing_terminals.items():
    print(f"{non_terminal}: {terminals}")

OUTPUT-

6. Write a program to compute FIRST of non-terminals.

def compute_first_sets(grammar):
    first_sets = {non_terminal: set() for non_terminal in grammar}

    for non_terminal, productions in grammar.items():
        for production in productions:
            first_symbol = production[0]
            if first_symbol.islower():  # terminal symbol
                first_sets[non_terminal].add(first_symbol)
            elif first_symbol in grammar:  # non-terminal symbol
                first_sets[non_terminal] |= compute_first_sets_helper(grammar, first_symbol)

    return first_sets

def compute_first_sets_helper(grammar, symbol):
    # Recursively collect FIRST(symbol). Assumes the grammar has no
    # left recursion (which would make this recursion loop forever).
    first_set = set()

    for production in grammar[symbol]:
        first_symbol = production[0]
        if first_symbol.islower():  # terminal symbol
            first_set.add(first_symbol)
        elif first_symbol in grammar:  # non-terminal symbol
            first_set |= compute_first_sets_helper(grammar, first_symbol)

    return first_set

# Example usage:
# Grammar: S -> Ab | Bc | d, A -> a, B -> b
grammar = {
    'S': ['Ab', 'Bc', 'd'],
    'A': ['a'],
    'B': ['b']
}

first_sets = compute_first_sets(grammar)

print("FIRST Sets:")
for non_terminal, first_set in first_sets.items():
    print(f"{non_terminal}: {first_set}")

OUTPUT-

7. Write a program to compute the FOLLOW of non-terminals.

def compute_follow_sets(grammar, start_symbol):
    follow_sets = {non_terminal: set() for non_terminal in grammar}

    # Add '$' (end-of-input marker) to the follow set of the start symbol.
    follow_sets[start_symbol].add('$')

    while True:
        prev_follow_sets = {non_terminal: set(follow_set)
                            for non_terminal, follow_set in follow_sets.items()}

        for non_terminal, productions in grammar.items():
            for production in productions:
                for i, symbol in enumerate(production):
                    if symbol in grammar:
                        remaining_symbols = production[i + 1:]
                        first_of_remaining = compute_first_of_string(grammar, remaining_symbols)

                        # Everything in FIRST(remainder) except epsilon
                        # (written '') belongs to FOLLOW(symbol).
                        follow_sets[symbol] |= first_of_remaining - {''}

                        # If nothing follows the symbol, or the remainder can
                        # derive epsilon, FOLLOW(non_terminal) also applies.
                        if not remaining_symbols or '' in first_of_remaining:
                            follow_sets[symbol] |= follow_sets[non_terminal]

        # Stop once the follow sets have converged.
        if follow_sets == prev_follow_sets:
            break

    return follow_sets

def compute_first_of_string(grammar, symbols):
    first_set = set()

    for symbol in symbols:
        if symbol.islower():  # terminal symbol
            first_set.add(symbol)
            break
        elif symbol in grammar:  # non-terminal symbol
            first_set |= compute_first_of_string_helper(grammar, symbol)
            if '' not in first_set:
                break
        else:  # unknown symbol: stop
            break

    return first_set

def compute_first_of_string_helper(grammar, symbol):
    first_set = set()

    for production in grammar[symbol]:
        first_symbol = production[0]
        if first_symbol.islower():  # terminal symbol
            first_set.add(first_symbol)
        elif first_symbol in grammar:  # non-terminal symbol
            first_set |= compute_first_of_string_helper(grammar, first_symbol)

    return first_set

# Example usage:
# Grammar: S -> Ab | Bc | d, A -> a, B -> b
grammar = {
    'S': ['Ab', 'Bc', 'd'],
    'A': ['a'],
    'B': ['b']
}

start_symbol = 'S'

follow_sets = compute_follow_sets(grammar, start_symbol)

print("FOLLOW Sets:")
for non_terminal, follow_set in follow_sets.items():
    print(f"{non_terminal}: {follow_set}")

OUTPUT-
