You are on page 1of 11

Experiment No.

4
Aim: WAP to compute FIRST and FOLLOW of non-terminals.
Theory:
FIRST and FOLLOW sets – a necessary preliminary to constructing the LL(1) parsing table
A predictive parser can only be built for an LL(1) grammar. A grammar is not LL(1) if it is:
1. Left recursive, or
2. Not left factored.
However, grammars that are not left recursive and are left factored may still not be LL(1). To see if a
grammar is LL(1), try to build the parse table for the predictive parser by the method we are about to
describe. If any element in the table contains more than one grammar rule right-hand side, then the grammar
is not LL(1).
To build the table, we must must compute FIRST and FOLLOW sets for the grammar.
FIRST Sets
Ultimately, we want to define FIRST sets for the right-hand sides of each of the grammar’s productions. The
idea is that for a sequence α of symbols, FIRST(α) is the set of terminals that begin the strings derivable from
α, and also, if α can derive ǫ, then ǫ is in FIRST(α).
By allowing ǫ to be in FIRST(α), we can avoid defining the notion of “nullable”. More formally:
FIRST(α) = {t | (t is a terminal and α ⇒∗ tβ) or (t = ǫ and α ⇒∗ ǫ)}
To define FIRST(α) for arbitrary α, we start by defining FIRST(X), for a single symbol X (a terminal,
a nonterminal, or ǫ):
• X is a terminal: FIRST(X) = X
• X is ǫ: FIRST(X) = ǫ
• X is a nonterminal. In this case, we must look at all grammar productions with X on the left, i.e.,
productions of the form:
X → Y1Y2Y3 . . . Yk
where each Yk is a single terminal or nonterminal (or there is just one Y, and it is ǫ). For each such
production, we perform the following actions
– Put FIRST(Y1) − {ǫ} into FIRST(X).
– If ǫ is in FIRST(Y1), then put FIRST(Y2) − {ǫ} into FIRST(X).
– If ǫ is in FIRST(Y2), then put FIRST(Y3) − {ǫ} into FIRST(X).
– If ǫ is in FIRST(Yi) for 1 _ i _ k (all production right- hand sides) then put ǫ into FIRST(X).
For example, compute the FIRST sets of each of the non-terminals in the following grammar:
exp → term exp′
exp′ → − term exp′ | ǫ term → factor term′ term
′ → / factor term′ | ǫ
factor → INTLITERAL | ( exp )
Once we have computed FIRST(X) for each terminal and nonterminal X, we can compute FIRST(α) for
every production’s right-hand-side α. In general, alpha will be of the form:
X1X2 . . .Xn
where each X is a single terminal or nonterminal, or there is just one X1 and it is ǫ.
The rules for computing FIRST(α) are essentially the same as the rules for computing the first set of a
nonterminal.
• Put FIRST(X1) − {ǫ} into FIRST(α)
• If ǫ is in FIRST(X1) put FIRST(X2) − {ǫ} into FIRST(α).
• ...
• If ǫ is in the FIRST set for every Xk, put ǫ into FIRST(α).
For the example grammar above, compute the FIRST sets for each production’s right-hand side.
Why do we care about the FIRST(α) sets? During parsing, suppose the top-of-stack symbol is nonterminal A,
that there are two productions A → α and A → β, and that the current token is a.
Follow Sets
FOLLOW set is a concept used in syntax analysis, specifically in the context of LR parsing algorithms. It is a
set of terminals that can appear immediately after a given non-terminal in a grammar.
The FOLLOW set of a non-terminal A is defined as the set of terminals that can appear immediately after A in
any derivation of the grammar. If A can appear at the right-hand side of a production rule, then the FOLLOW
set of the left-hand side non-terminal of that production rule will be added to the FOLLOW set of A.
Example:
S ->Aa | Ac
A ->b
S S
/ \ / \
A a A c
| |
b b
Here, FOLLOW (A) = {a, c}
Rules to compute FOLLOW set:
FOLLOW(S) = { $ } // where S is the starting Non-Terminal
2) If A -> pBq is a production, where p, B and q are any grammar symbols,
then everything in FIRST(q) except Є is in FOLLOW(B).
3) If A->pB is a production, then everything in FOLLOW(A) is in FOLLOW(B).
4) If A->pBq is a production and FIRST(q) contains Є,
then FOLLOW(B) contains { FIRST(q) – Є } U FOLLOW(A)
Example :
Production Rules:
E -> TE’
E’ -> +T E’|Є
T -> F T’
T’ -> *F T’ | Є
F -> (E) | id
FIRST set
FIRST(E) = FIRST(T) = { ( , id }
FIRST(E’) = { +, Є }
FIRST(T) = FIRST(F) = { ( , id }
FIRST(T’) = { *, Є }
FIRST(F) = { ( , id }
FOLLOW Set
FOLLOW(E) = { $ , ) } // Note ')' is there because of 5th rule
FOLLOW(E’) = FOLLOW(E) = { $, ) } // See 1st production rule
FOLLOW(T) = { FIRST(E’) – Є } U FOLLOW(E’) U FOLLOW(E) = { + , $ , ) }
FOLLOW(T’) = FOLLOW(T) = {+,$,)}
FOLLOW(F) = { FIRST(T’) – Є } U FOLLOW(T’) U FOLLOW(T) = { *, +, $, ) }

Program: -
#include <stdio.h>
#include <ctype.h>
#include <string.h>
void followfirst(char, int, int);
void follow(char c);
void findfirst(char, int, int);
int count, n = 0;
char calc_first[10][100];
char calc_follow[10][100];
int m = 0;
char production[10][10];
char f[10], first[10];
int k;
char ck;
int e;
int main(int argc, char **argv)
{
int jm = 0;
int km = 0;
int i, choice;
char c, ch;
count = 4;
strcpy(production[0], "S=aBDh");
strcpy(production[1], "B=cC");
strcpy(production[2], "C=bC ");
strcpy(production[3], "D=EF ");
strcpy(production[4], "E=g ");
strcpy(production[5], "F=f ");
int kay;
char done[count];
int ptr = -1;
for (k = 0; k < count; k++)
{
for (kay = 0; kay < 100; kay++)
{
calc_first[k][kay] = '!';
}
}
printf("\n First for all the variables are: \n ");
int point1 = 0, point2, xxx;
for (k = 0; k < count; k++)
{
c = production[k][0];
point2 = 0;
xxx = 0;
for (kay = 0; kay <= ptr; kay++)
if (c == done[kay])
xxx = 1;
if (xxx == 1)
continue;
findfirst(c, 0, 0);
ptr += 1;
done[ptr] = c;
printf("\n First(%c) = { ", c);
calc_first[point1][point2++] = c;
for (i = 0 + jm; i < n; i++)
{
int lark = 0, chk = 0;
for (lark = 0; lark < point2; lark++)
{
if (first[i] == calc_first[point1][lark])
{
chk = 1;
break;
}
}
if (chk == 0)
{
printf("%c, ", first[i]);
calc_first[point1][point2++] = first[i];
}
}
printf("}\n");
jm = n;
point1++;
}
printf("\n");
printf("-----------------------------------------------\n\n");
char donee[count];
ptr = -1;
printf("\n Follow for all the variables are: \n ");
for (k = 0; k < count; k++)
{
for (kay = 0; kay < 100; kay++)
{
calc_follow[k][kay] = '!';
}
}
point1 = 0;
int land = 0;
for (e = 0; e < count; e++)
{
ck = production[e][0];
point2 = 0;
xxx = 0;
for (kay = 0; kay <= ptr; kay++)
if (ck == donee[kay])
xxx = 1;
if (xxx == 1)
continue;
land += 1;
follow(ck);
ptr += 1;
donee[ptr] = ck;
printf(" Follow(%c) = { ", ck);
calc_follow[point1][point2++] = ck;
for (i = 0 + km; i < m; i++)
{
int lark = 0, chk = 0;
for (lark = 0; lark < point2; lark++)
{
if (f[i] == calc_follow[point1][lark])
{
chk = 1;
break;
}
}
if (chk == 0)
{
printf("%c, ", f[i]);
calc_follow[point1][point2++] = f[i];
}
}
printf(" }\n\n");
km = m;
point1++;
}
}
void follow(char c)
{
int i, j;
if (production[0][0] == c)
{
f[m++] = '$';
}
for (i = 0; i < 10; i++)
{
for (j = 2; j < 10; j++)
{
if (production[i][j] == c)
{
if (production[i][j + 1] != '\0')
{
followfirst(production[i][j + 1], i, (j + 2));
}

if (production[i][j + 1] == '\0' && c != production[i][0])


{
follow(production[i][0]);
}
}
}
}
}
void findfirst(char c, int q1, int q2)
{
int j;
if (!(isupper(c)))
{
first[n++] = c;
}
for (j = 0; j < count; j++)
{
if (production[j][0] == c)
{
if (production[j][2] == '#')
{
if (production[q1][q2] == '\0')
first[n++] = '#';
else if (production[q1][q2] != '\0' && (q1 != 0 || q2 != 0))
{
findfirst(production[q1][q2], q1, (q2 + 1));
}
else
first[n++] = '#';
}
else if (!isupper(production[j][2]))
{
first[n++] = production[j][2];
}
else
{
findfirst(production[j][2], j, 3);
}
}
}
}
void followfirst(char c, int c1, int c2)
{
int k;
if (!(isupper(c)))
f[m++] = c;
else
{
int i = 0, j = 1;
for (i = 0; i < count; i++)
{
if (calc_first[i][0] == c)
break;
}
while (calc_first[i][j] != '!')
{
if (calc_first[i][j] != '#')
{
f[m++] = calc_first[i][j];
}
else
{
if (production[c1][c2] == '\0')
{
follow(production[c1][0]);
}
else
{
followfirst(production[c1][c2], c1, c2 + 1);
}
}
j++;
}
}
}
Output:

Conclusion: Hence we have studied about and implemented a program for computing FIRST and FOLLOW of
non-terminals.

You might also like