You are on page 1of 19

Context-Free Grammar (CFG)

CFG stands for context-free grammar. It I a formal grammar that is used to generate all
possible patterns of strings in a given formal language. Context-free grammar G can be
defined by four tuples as:
1. G = (V, T, P, S)  
Where,
G is the grammar, which consists of a set of the production rule. It is used to generate
the string of a language.
T is the final set of a terminal symbol. It is denoted by lower case letters.
V is the final set of a non-terminal symbol. It is denoted by capital letters.
P is a set of production rules, which is used for replacing non-terminals symbols(on the
left side of the production) in a string with other terminal or non-terminal symbols(on
the right side of the production).
S is the start symbol which is used to derive the string. We can derive the string by
repeatedly replacing a non-terminal by the right-hand side of the production until all
non-terminal have been replaced by terminal symbols.
Example 1:
Construct the CFG for the language having any number of a's over the set ∑= {a}.
Solution:
As we know the regular expression for the above language is
1. r.e. = a*     
Production rule for the Regular expression is as follows:
1. S → aS    rule 1  
2. S → ε     rule 2  
Now if we want to derive a string "aaaaaa", we can start with start symbols.
1.  S  
2. aS   
3. aaS          rule 1  
4. aaaS         rule 1  
5. aaaaS        rule 1  
6. aaaaaS       rule 1  
7. aaaaaaS      rule 1  
8. aaaaaaε      rule 2  
9. aaaaaa  
The r.e. = a* can generate a set of string {ε, a, aa, aaa,.....}. We can have a null string
because S is a start symbol and rule 2 gives S → ε.
Example 2:
Construct a CFG for the regular expression (0+1)*
Solution:
The CFG can be given by,
1. Production rule (P):  
2. S → 0S | 1S  
3. S → ε  
The rules are in the combination of 0's and 1's with the start symbol. Since (0+1)*
indicates {ε, 0, 1, 01, 10, 00, 11, ....}. In this set, ε is a string, so in the rule, we can set the
rule S → ε.
Example 3:
Construct a CFG for a language L = {wcwR | where w € (a, b)*}.
Solution:
The string that can be generated for a given language is {aacaa, bcb, abcba, bacab,
abbcbba, ....}
The grammar could be:
1. S → aSa     rule 1  
2. S → bSb     rule 2  
3. S → c       rule 3  
Now if we want to derive a string "abbcbba", we can start with start symbols.
1. S → aSa   
2. S → abSba       from rule 2  
3. S → abbSbba     from rule 2  
4. S → abbcbba     from rule 3  
Thus any of this kind of string can be derived from the given production rules.
Example 4:
Construct a CFG for the language L = anb2n where n>=1.
Solution:
The string that can be generated for a given language is {abb, aabbbb, aaabbbbbb....}.
The grammar could be:
1. S → aSbb | abb  
Now if we want to derive a string "aabbbb", we can start with start symbols.
1. S → aSbb  
2. S → aabbbb    

Derivation
Derivation is a sequence of production rules. It is used to get the input string through
these production rules. During parsing, we have to take two decisions. These are as
follows:
o We have to decide the non-terminal which is to be replaced.
o We have to decide the production rule by which the non-terminal will be replaced.
We have two options to decide which non-terminal to be placed with production rule.
1. Leftmost Derivation:
In the leftmost derivation, the input is scanned and replaced with the production rule
from left to right. So in leftmost derivation, we read the input string from left to right.
Example:
Production rules:
1. E = E + E  
2. E = E - E  
3. E = a | b  
Input
1. a - b + a  
The leftmost derivation is:
1. E = E + E  
2. E = E - E + E  
3. E = a - E + E  
4. E = a - b + E  
5. E = a - b + a  
2. Rightmost Derivation:
In rightmost derivation, the input is scanned and replaced with the production rule from
right to left. So in rightmost derivation, we read the input string from right to left.
Example
Production rules:
1. E = E + E  
2. E = E - E  
3. E = a | b  
Input
1. a - b + a  
The rightmost derivation is:
1. E = E - E  
2. E = E - E + E  
3. E = E - E + a  
4. E = E - b + a  
5. E = a - b + a  
When we use the leftmost derivation or rightmost derivation, we may get the same
string. This type of derivation does not affect on getting of a string.
Examples of Derivation:
Example 1:
Derive the string "abb" for leftmost derivation and rightmost derivation using a CFG
given by,
1. S → AB | ε  
2. A → aB  
3. B → Sb  
Solution:
Leftmost derivation:

Rightmost derivation:
Example 2:
Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a
CFG given by,
1. S → aB | bA  
2.  S → a | aS | bAA  
3. S → b | aS | aBB  
Solution:
Leftmost derivation:
1. S  
2. aB            S → aB      
3. aaBB          B → aBB  
4. aabB          B → b  
5. aabbS         B → bS  
6. aabbaB        S → aB  
7. aabbabS       B → bS  
8. aabbabbA      S → bA  
9. aabbabba      A → a  
Rightmost derivation:
1. S  
2. aB            S → aB      
3. aaBB          B → aBB  
4. aaBbS         B → bS  
5. aaBbbA        S → bA  
6. aaBbba        A → a  
7. aabSbba       B → bS  
8. aabbAbba      S → bA  
9. aabbabba      A → a  
Example 3:
Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG
given by,
1. S → A1B  
2. A → 0A | ε  
3. B → 0B | 1B | ε  
Solution:
Leftmost derivation:
1. S  
2. A1B  
3. 0A1B  
4. 00A1B  
5. 001B  
6. 0010B  
7. 00101B  
8. 00101  
Rightmost derivation:
1. S  
2. A1B  
3. A10B  
4. A101B  
5. A101  
6. 0A101  
7. 00A101  
8. 00101  

Derivation Tree
Derivation tree is a graphical representation for the derivation of the given production
rules for a given CFG. It is the simple way to show how the derivation can be done to
obtain some string from a given set of production rules. The derivation tree is also called
a parse tree.
Parse tree follows the precedence of operators. The deepest sub-tree traversed first. So,
the operator in the parent node has less precedence over the operator in the sub-tree.
A parse tree contains the following properties:
1. The root node is always a node indicating start symbols.
2. The derivation is read from left to right.
3. The leaf node is always terminal nodes.
4. The interior nodes are always the non-terminal nodes.
Example 1:
Production rules:
1. E = E + E  
2. E = E * E  
3. E = a | b | c  
Input
1. a * b + c  
Step 1:

Step 2:

Step 2:
Step 4:

Step 5:
Note: We can draw a derivation tree step by step or directly in one step.
Example 2:
Draw a derivation tree for the string "bab" from the CFG given by
1. S → bSb | a | b  
Solution:
Now, the derivation tree for the string "bbabb" is as follows:

The above tree is a derivation tree drawn for deriving a string bbabb. By simply reading
the leaf nodes, we can obtain the desired string. The same tree can also be denoted by,
Example 3:
Construct a derivation tree for the string aabbabba for the CFG given by,
1. S → aB | bA  
2. A → a | aS | bAA  
3. B → b | bS | aBB  
Solution:
To draw a tree, we will first try to obtain derivation for the string aabbabba

Now, the derivation tree is as follows:


Example 4:
Show the derivation tree for string "aabbbb" with the following grammar.
1. S → AB | ε  
2. A → aB  
3. B → Sb  
Solution:
To draw a tree we will first try to obtain derivation for the string aabbbb
Now, the derivation tree for the string "aabbbb" is as follows:
Ambiguity in Grammar
A grammar is said to be ambiguous if there exists more than one leftmost derivation or
more than one rightmost derivation or more than one parse tree for the given input
string. If the grammar is not ambiguous, then it is called unambiguous.
If the grammar has ambiguity, then it is not good for compiler construction. No method
can automatically detect and remove the ambiguity, but we can remove ambiguity by
re-writing the whole grammar without ambiguity.
Example 1:
Let us consider a grammar G with the production rule
1. E → I  
2. E → E + E  
3. E → E * E  
4. E → (E)  
5. I → ε | 0 | 1 | 2 | ... | 9  
Solution:
For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost
derivation:
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is
ambiguous.
Example 2:
Check whether the given grammar G is ambiguous or not.
1. E → E + E  
2. E → E - E  
3. E → id  
Solution:
From the above grammar String "id + id - id" can be derived in 2 ways:
First Leftmost derivation
1. E → E + E  
2.    → id + E  
3.    → id + E - E  
4.    → id + id - E  
5.    → id + id- id  
Second Leftmost derivation
1. E → E - E  
2.    → E + E - E  
3.    → id + E - E  
4.    → id + id - E  
5.    → id + id - id  
Since there are two leftmost derivation for a single string "id + id - id", the grammar G is
ambiguous.
Example 3:
Check whether the given grammar G is ambiguous or not.
1. S → aSb | SS  
2. S → ε  
Solution:
For the string "aabb" the above grammar can generate two parse trees

Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.
Example 4:
Check whether the given grammar G is ambiguous or not.
1. A → AA  
2. A → (A)  
3. A → a  
Solution:
For the string "a(a)aa" the above grammar can generate two parse trees:

Since there are two parse trees for a single string "a(a)aa", the grammar G is ambiguous.
Unambiguous Grammar
A grammar can be unambiguous if the grammar does not contain ambiguity that means
if it does not contain more than one leftmost derivation or more than one rightmost
derivation or more than one parse tree for the given input string.
To convert ambiguous grammar to unambiguous grammar, we will apply the following
rules:
1. If the left associative operators (+, -, *, /) are used in the production rule, then apply
left recursion in the production rule. Left recursion means that the leftmost symbol on
the right side is the same as the non-terminal on the left side. For example,
1. X → Xa  
2. If the right associative operates(^) is used in the production rule then apply right
recursion in the production rule. Right recursion means that the rightmost symbol on
the left side is the same as the non-terminal on the right side. For example,
1. X → aX  
Example 1:
Consider a grammar G is given as follows:
1. S → AB | aaB  
2. A → a | Aa  
3. B → b  
Determine whether the grammar G is ambiguous or not. If G is ambiguous, construct an
unambiguous grammar equivalent to G.
Solution:
Let us derive the string "aab"
As there are two different parse tree for deriving the same string, the given grammar is
ambiguous.
Unambiguous grammar will be:
1. S → AB  
2. A → Aa | a  
3. B → b  
Example 2:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous
grammar.
1. S → ABA  
2. A → aA | ε  
3. B → bB | ε  
Solution:
The given grammar is ambiguous because we can derive two different parse tree for
string aa.

The unambiguous grammar is:


1. S → aXY | bYZ | ε  
2. Z → aZ | a  
3. X → aXY | a | ε  
4. Y → bYZ | b | ε  
Example 3:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous
grammar.
1. E → E + E  
2. E → E * E  
3. E → id  
Solution:
Let us derive the string "id + id * id"

As there are two different parse tree for deriving the same string, the given grammar is
ambiguous.
Unambiguous grammar will be:
1. E → E + T  
2. E → T  
3. T → T * F  
4. T → F  
5. F → id  
Example 4:
Check that the given grammar is ambiguous or not. Also, find an equivalent
unambiguous grammar.
1.  S → S + S  
2. S → S * S  
3. S → S ^ S  
4. S → a  
Solution:
The given grammar is ambiguous because the derivation of string aab can be
represented by the following string:

Unambiguous grammar will be:


1. S → S + A |  
2. A → A * B | B  
3. B → C ^ B | C  
4. C → a  

You might also like