Theory(CoSc3111)
Chapter 4
Context-Free Grammars(CFG)
Grammars
• Grammars are a means of describing languages.
– According to Noam Chomsky, there are four types of
grammars
– Type 0,
– Type 1,
– Type 2, and
– Type 3.
Type - 3 Grammar
where:
What does CFG do?
A CFG provides a simple and mathematically precise
mechanism for describing how phrases in some natural
language are built from smaller blocks, capturing the
“block structure” of sentences in a natural way.
Example:
All the morning flights from Denver to Tampa leaving before 10
Example
Given a grammar G = ({S}, {a, b}, R, S).
The set of rules R is
S → aSb
S → SS
S → ε
This grammar generates strings such as abab, aaabbb, and
aababb.
If we assume that a is the left parenthesis ‘(’ and b is the right
parenthesis ‘)’, then L(G) is the language of all strings of
properly nested parentheses.
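The claim above can be checked mechanically for short strings. The following is a minimal sketch, not part of the original slides: the function names `derives` and `balanced` are hypothetical, and `derives` simply mirrors the three rules of R.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def derives(w: str) -> bool:
    """Return True if S =>* w under S -> aSb | SS | ε (hypothetical helper)."""
    if w == "":  # S -> ε
        return True
    if w[0] == "a" and w[-1] == "b" and derives(w[1:-1]):  # S -> aSb
        return True
    # S -> SS: splitting into two nonempty parts is enough, because an
    # empty part would leave the same string w to be derived again.
    return any(derives(w[:k]) and derives(w[k:]) for k in range(1, len(w)))


def balanced(w: str) -> bool:
    """Classic counter check, reading a as '(' and b as ')'."""
    depth = 0
    for c in w:
        depth += 1 if c == "a" else -1
        if depth < 0:
            return False
    return depth == 0


# Compare both definitions on every string over {a, b} up to length 8.
agree = all(
    derives("".join(t)) == balanced("".join(t))
    for n in range(9)
    for t in product("ab", repeat=n)
)
print(agree)
```

The exhaustive comparison only covers short strings, so it illustrates the claim rather than proving it.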
The grammar with productions
S → abB
A → aaBb
B → bbAa
A → λ
is context free.
The language is L(G) = {ab(bbaa)ⁿbba(ba)ⁿ : n ≥ 0}.
In general, productions have the form:
(V ∪ T)+ → (V ∪ T)*
In a right-linear grammar, all productions have one of the two forms:
V → T*V or V → T*
i.e., the left hand side must be a single variable, and the right hand side
consists of any number of terminals followed by at most a single variable.
Right-Linear Grammar: V → T*V or V → T*
Left-Linear Grammar: V → VT* or V → T*
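The two production shapes can be tested with a few lines of code. This is a sketch under an assumed encoding that is not in the slides: each rule is a pair (left side, right side), an uppercase letter is a variable, and any other character is a terminal. The function names are hypothetical.

```python
def is_right_linear(rules):
    """Every rule has the shape V -> T*V or V -> T*."""
    for lhs, rhs in rules:
        if not (len(lhs) == 1 and lhs.isupper()):
            return False
        # Drop an optional trailing variable; the rest must be terminals.
        body = rhs[:-1] if rhs[-1:].isupper() else rhs
        if any(c.isupper() for c in body):
            return False
    return True


def is_left_linear(rules):
    """Every rule has the shape V -> VT* or V -> T*."""
    for lhs, rhs in rules:
        if not (len(lhs) == 1 and lhs.isupper()):
            return False
        # Drop an optional leading variable; the rest must be terminals.
        body = rhs[1:] if rhs[:1].isupper() else rhs
        if any(c.isupper() for c in body):
            return False
    return True


right_example = [("S", "abA"), ("A", "a")]
left_example = [("S", "Aa"), ("A", "ab")]
nested_example = [("S", "aSb"), ("S", "")]  # context free, but not linear on either side

print(is_right_linear(right_example), is_left_linear(right_example))
print(is_right_linear(left_example), is_left_linear(left_example))
print(is_right_linear(nested_example), is_left_linear(nested_example))
```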
Conversion of Left-linear Grammar into Right-Linear
Grammar
Algorithm
• If the left linear grammar has a rule with the start symbol S
on the right hand side, add this rule and make S0 the new
start symbol:
S0 → S
Symbols used by the algorithm
• Let S denote the start symbol
• Let A, B denote non-terminal symbols
• Let p denote zero or more terminal symbols
• Let ε denote the empty string
Algorithm
1) If the left linear grammar has a rule S → p, then make that a
rule in the right linear grammar
2) If the left linear grammar has a rule A → p, then add the
following rule to the right linear grammar: S → pA
3) If the left linear grammar has a rule B → Ap, add the following
rule to the right linear grammar: A → pB
4) If the left linear grammar has a rule S → Ap, then add the
following rule to the right linear grammar: A → p
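The four rules, together with the preliminary step that introduces S0, can be sketched in code. The encoding is an assumption made for illustration: a left-linear rule B → Ap is the tuple (B, A, p), a rule A → p is (A, None, p), and the output tuple (A, p, B) stands for the right-linear rule A → pB. The function name is hypothetical.

```python
def left_to_right_linear(rules, start="S"):
    """Convert left-linear rules to right-linear rules (illustrative sketch)."""
    rules = list(rules)
    # Preliminary step: if the start symbol occurs on a right hand side,
    # add S0 -> S and use S0 as the start symbol from here on.
    if any(var == start for _, var, _ in rules):
        new_start = start + "0"
        rules.insert(0, (new_start, start, ""))
        start = new_start

    result = []
    for lhs, var, p in rules:
        if var is None and lhs == start:
            result.append((start, p, None))   # 1) S -> p   stays S -> p
        elif var is None:
            result.append((start, p, lhs))    # 2) A -> p   gives S -> pA
        elif lhs != start:
            result.append((var, p, lhs))      # 3) B -> Ap  gives A -> pB
        else:
            result.append((var, p, None))     # 4) S -> Ap  gives A -> p
    return start, result


# S -> Aa, A -> ab
start1, right1 = left_to_right_linear([("S", "A", "a"), ("A", None, "ab")])

# S -> Ab, S -> Sb, A -> Aa, A -> a
start2, right2 = left_to_right_linear(
    [("S", "A", "b"), ("S", "S", "b"), ("A", "A", "a"), ("A", None, "a")]
)

for lhs, p, var in right2:
    print(f"{lhs} -> {p}{var or ''}" if (p or var) else f"{lhs} -> ε")
```

For the first grammar the sketch returns S → abA and A → a. For the second it introduces S0 and returns S → ε, A → bS, S → bS, A → aA and S0 → aA.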
Example
left linear        right linear
S → Aa             S → abA
A → ab             A → a
Rule 2 turns A → ab into S → abA.
Rule 4 turns S → Aa into A → a.
Both grammars generate this language: {aba}
Convert this left linear grammar:
S → Ab
S → Sb
A → Aa
A → a
S appears on a right hand side, so make a new start symbol S0:
S0 → S
S → Ab
S → Sb
A → Aa
A → a
Right hand side has terminals only (rule 2, with S0 as the start symbol):
A → a gives S0 → aA
Right hand side has a non-terminal (rule 3):
S → Ab gives A → bS
S → Sb gives S → bS
A → Aa gives A → aA
Start symbol on the left hand side (rule 4, with p = ε):
S0 → S gives S → ε
Result:
left linear        right linear
S0 → S             S0 → aA
S → Ab             A → bS
S → Sb             A → aA
A → Aa             S → bS
A → a              S → ε
Derivation Trees
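Def: Let G = (V, T, P, S) be a context-free grammar. An ordered tree is a derivation tree for G if and only if:
1 The root is labeled S.
2 Every leaf has a label from T ∪ {λ}.
3 Every interior vertex (a vertex that is not a leaf) has a label from V.
4 If a vertex has label A (a variable) and its children are labeled (from left to right) a1, a2, …, an, then P must contain the production A → a1 a2 … an.
5 A leaf labeled λ has no siblings.
• Yield: the string of leaf labels read from left to right, omitting any λ.
• Partial derivation tree: conditions 3, 4 and 5 hold, condition 1 need not hold, and 2 is replaced by: every leaf has a label from V ∪ T ∪ {λ}.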
Example: S → AB, A → aaA | λ, B → Bb | λ
Derivation: S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab
Derivation tree (children listed from left to right):
S has children A, B
A has children a, a, A, and the inner A has the single child λ
B has children B, b, and the inner B has the single child λ
Yield: a a λ λ b = aab
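The yield of a derivation tree can be computed by a short traversal. The sketch below assumes a tuple encoding that is not in the slides, with a node written as (label, children), and the helper names are hypothetical. It builds the tree for S → AB, A → aaA | λ, B → Bb | λ and also checks that every interior vertex uses a production of the grammar.

```python
LAMBDA = "λ"
PRODUCTIONS = {"S": ["AB"], "A": ["aaA", LAMBDA], "B": ["Bb", LAMBDA]}


def node(label, *children):
    """A tree vertex: its label and its children, ordered left to right."""
    return (label, children)


def tree_yield(tree):
    """Concatenate the leaf labels from left to right, dropping λ."""
    label, children = tree
    if not children:
        return "" if label == LAMBDA else label
    return "".join(tree_yield(child) for child in children)


def respects_productions(tree):
    """Each interior vertex A with children a1..an needs a rule A -> a1...an."""
    label, children = tree
    if not children:
        return True
    rhs = "".join(child[0] for child in children)
    return rhs in PRODUCTIONS.get(label, []) and all(
        respects_productions(child) for child in children
    )


TREE = node(
    "S",
    node("A", node("a"), node("a"), node("A", node(LAMBDA))),
    node("B", node("B", node(LAMBDA)), node("b")),
)

print(tree_yield(TREE))            # aab
print(respects_productions(TREE))  # True
```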
Partial Derivation Trees
Example: S → AB, A → aaA | λ, B → Bb | λ
Derivation so far: S ⇒ AB ⇒ aaAB
Partial derivation tree (children listed from left to right):
S has children A, B
A has children a, a, A
Yield: aaAB, which is a sentential form
Sometimes, derivation order doesn’t matter
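Leftmost:
S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab
Rightmost:
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab
Both derivations give the same derivation tree:
S has children A, B
A has children a, a, A, and the inner A has the single child λ
B has children B, b, and the inner B has the single child λ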
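Both derivation orders can be read off a single tree. The sketch below uses an assumed tuple encoding, (label, children), and a hypothetical function `derivation`. It repeatedly expands the leftmost or the rightmost variable in the current sentential form, using the children recorded in the tree.

```python
LAMBDA = "λ"


def node(label, *children):
    return (label, list(children))


# Derivation tree of aab for S -> AB, A -> aaA | λ, B -> Bb | λ
TREE = node(
    "S",
    node("A", node("a"), node("a"), node("A", node(LAMBDA))),
    node("B", node("B", node(LAMBDA)), node("b")),
)


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by expanding one vertex per step."""
    form = [tree]  # current sentential form, kept as a list of tree vertices
    steps = ["".join(label for label, _ in form)]
    while True:
        interior = [i for i, (_, kids) in enumerate(form) if kids]
        if not interior:
            return steps
        i = interior[0] if leftmost else interior[-1]
        # Replace the chosen variable by its children; a λ child vanishes.
        form[i:i + 1] = [kid for kid in form[i][1] if kid[0] != LAMBDA]
        steps.append("".join(label for label, _ in form))


print(" ⇒ ".join(derivation(TREE, leftmost=True)))
print(" ⇒ ".join(derivation(TREE, leftmost=False)))
```

The two printed sequences differ only in the order of the steps, since both are driven by the same tree.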
Ambiguity
Definition:
A context-free grammar G is ambiguous
if some string w ∈ L(G) has two or
more derivation trees.
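Example: E → E + E | E * E | (E) | a
String: a + a * a
Leftmost derivation 1:
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
Derivation tree 1: the root E has children E, +, E. The left E has child a. The right E has children E, *, E, each with child a.
Leftmost derivation 2:
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
Derivation tree 2: the root E has children E, *, E. The left E has children E, +, E, each with child a. The right E has child a.
The grammar E → E + E | E * E | (E) | a
is ambiguous: the string a + a * a has two derivation trees and,
equivalently, two leftmost derivations.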
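One way to see the ambiguity concretely is to count derivation trees. The following sketch is specific to the grammar E → E + E | E * E | (E) | a. The function name `count_trees` is hypothetical, and the input is assumed to be a string of single-character tokens with no spaces.

```python
from functools import lru_cache


def count_trees(tokens: str) -> int:
    """Count the derivation trees of `tokens` from E (illustrative sketch)."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of trees with root E whose yield is tokens[i:j].
        total = 0
        if j - i == 1 and tokens[i] == "a":                     # E -> a
            total += 1
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                        # E -> (E)
        for k in range(i + 1, j - 1):                           # E -> E + E | E * E
            if tokens[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_trees("a+a*a"))    # 2
print(count_trees("(a+a)*a"))  # 1
```

A count greater than 1 for any string is enough to show that the grammar is ambiguous.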
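Why do we care about ambiguity?
Take a = 2 in a + a * a, which gives 2 + 2 * 2.
The tree with root E → E + E evaluates 2 * 2 = 4 first, then 2 + 4 = 6.
The tree with root E → E * E evaluates 2 + 2 = 4 first, then 4 * 2 = 8.
So the two derivation trees give 2 + 2 * 2 = 6 and 2 + 2 * 2 = 8.
Correct result: 2 + 2 * 2 = 6, the value of the tree with root E → E + E.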
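The two results can be reproduced by evaluating each tree bottom-up. This is a minimal sketch with an assumed encoding: a tree is either an integer leaf or a triple (left, operator, right), and the names are hypothetical.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree):
    """Evaluate an expression tree bottom-up."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))


tree_plus_at_root = (2, "+", (2, "*", 2))   # root E -> E + E
tree_times_at_root = ((2, "+", 2), "*", 2)  # root E -> E * E

print(evaluate(tree_plus_at_root))   # 6
print(evaluate(tree_times_at_root))  # 8
```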
• Ambiguity is bad for programming languages
We fix the ambiguous grammar:
E → E + E | E * E | (E) | a
New non-ambiguous grammar:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
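Derivation of a + a * a in the new grammar:
E ⇒ E + T ⇒ T + T ⇒ F + T ⇒ a + T ⇒ a + T * F
⇒ a + F * F ⇒ a + a * F ⇒ a + a * a
Unique derivation tree for a + a * a:
The root E has children E, +, T.
The left E derives a through E → T, T → F, F → a.
The right T has children T, *, F, where T → F → a and F → a.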
The grammar G:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
is non-ambiguous:
Every string w ∈ L(G) has
a unique derivation tree
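The layering into E, T and F is what gives * a higher precedence than +. The sketch below is a small recursive-descent evaluator that follows this grammar, under two assumptions that are not in the slides: the terminal a stands for an integer literal, and the left-recursive rules E → E + T and T → T * F are implemented as loops, which accept the same strings with the same left-to-right grouping. All function names are hypothetical.

```python
import re


def tokenize(text):
    """Split the input into integer literals and the symbols + * ( )."""
    return re.findall(r"\d+|[+*()]", text)


def parse_expression(tokens):
    """Evaluate a token list according to E -> E + T | T, T -> T * F | F, F -> (E) | a."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():  # E -> E + T | T, written as T (+ T)*
        value = term()
        while peek() == "+":
            advance()
            value += term()
        return value

    def term():  # T -> T * F | F, written as F (* F)*
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def factor():  # F -> (E) | a
        token = peek()
        if token == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if token is not None and token.isdigit():
            return int(advance())
        raise SyntaxError(f"unexpected token: {token!r}")

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token: {tokens[pos]!r}")
    return result


print(parse_expression(tokenize("2 + 2 * 2")))    # 6
print(parse_expression(tokenize("(2 + 2) * 2")))  # 8
```

Because each string has only one derivation tree in this grammar, the evaluator has no choice to make about how to group the operators.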
Compiler
input (program) → lexical analyzer → parser → output (machine code)
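As an illustration of the first stage, a lexical analyzer turns the input characters into tokens for the parser. The sketch below is an assumption about what such a stage might look like for arithmetic input: the token kinds INT and OP and the function name `lex` are hypothetical and do not come from the slides.

```python
import re

# One token per match: optional whitespace, then an integer or an operator.
TOKEN_RE = re.compile(r"\s*(?:(?P<INT>\d+)|(?P<OP>[+*()]))")


def lex(text):
    """Return a list of (kind, text) tokens, or raise on an unknown character."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        kind = match.lastgroup  # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


print(lex("10 + 2 * 5"))
```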
A parser knows the grammar of the
programming language
The parser finds the derivation
of a particular input
Grammar known to the parser:
E -> E + E
   | E * E
   | INT
Input: 10 + 2 * 5
Derivation:
E => E + E
  => E + E * E
  => 10 + E * E
  => 10 + 2 * E
  => 10 + 2 * 5
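The derivation determines the derivation tree
Derivation tree for E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5:
The root E has children E, +, E.
The left E has child 10.
The right E has children E, *, E, with children 2 and 5.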