
Formal Languages and Automata Theory (CoSc3111)

Chapter 4

Context-Free Grammars (CFG)
Grammars
• Grammars are a means of describing languages.
– According to Noam Chomsky, there are four types of
grammars
– Type 0,
– Type 1,
– Type 2, and
– Type 3.
Type-3 Grammar

• Type-3 grammars generate regular languages.
• Type-3 grammars must have a single non-terminal on the left-hand side and a right-hand side consisting of a single terminal or a single terminal followed by a single non-terminal.
• The productions must be in the form X → a or X → aY, where X, Y ∈ N (non-terminals) and a ∈ T (terminals).
• The rule S → ε is allowed if S does not appear on the right side of any rule.
Type-2 Grammar

• Type-2 grammars generate context-free languages.
• The productions must be in the form A → γ, where A ∈ N (a non-terminal) and γ ∈ (T ∪ N)* (a string of terminals and non-terminals).
• The languages generated by these grammars are recognized by a non-deterministic pushdown automaton.
Type-1 Grammar

• Type-1 grammars generate context-sensitive languages.
• The productions must be in the form αAβ → αγβ, where A ∈ N (a non-terminal) and α, β, γ ∈ (T ∪ N)* (strings of terminals and non-terminals).
• The strings α and β may be empty, but γ must be non-empty.
• The rule S → ε is allowed if S does not appear on the right side of any rule.
• The languages generated by these grammars are recognized by a linear bounded automaton.
Type-0 Grammar

• Type-0 grammars generate recursively enumerable languages.
• The productions have no restrictions.
• They are the phrase-structure grammars, which include all formal grammars.
• They generate the languages that are recognized by a Turing machine.
• The productions can be in the form α → β, where α is a non-empty string of terminals and non-terminals containing at least one non-terminal, and β is a string of terminals and non-terminals.
Example
S → ACaB
Bc → acB
CB → DB
aD → Db
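Generally: regular languages (Type 3) ⊂ context-free languages (Type 2) ⊂ context-sensitive languages (Type 1) ⊂ recursively enumerable languages (Type 0).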
What are Context Free Grammars?

• A context-free grammar (CFG) is a formal grammar in which every production rule is of the form
V → w,
where V is a single non-terminal symbol and w is a string of terminals and non-terminals (w can be empty).
• The languages generated by context-free grammars are known as the context-free languages.
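Formal Definition of CFG

• A context-free grammar G is a 4-tuple (N, Σ, R, S), where:
– N is a finite set of non-terminal symbols,
– Σ is a finite set of terminal symbols, disjoint from N,
– R is a finite set of production rules of the form A → w, with A ∈ N and w ∈ (N ∪ Σ)*, and
– S ∈ N is the start symbol.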
What does CFG do?
• A CFG provides a simple and mathematically precise mechanism for describing the smaller blocks (phrases) of a natural language, and
• for capturing the “block structure” of sentences in a natural way.

Example (a single noun phrase built from smaller blocks):
All the morning flights from Denver to Tampa leaving before 10
Example
• Given a grammar G = ({S}, {a, b}, R, S).
The set of rules R is
S → aSb
S → SS
S → ε
• This grammar generates strings such as abab, aaabbb, and aababb.
• If we read a as the left parenthesis ‘(’ and b as the right parenthesis ‘)’, then L(G) is the language of all strings of properly nested parentheses.

With a = ( and b = ), the three strings above become ()(), ((())), and (()()).
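
The claim that L(G) is the language of properly nested parentheses can be checked on short strings. The following is a minimal sketch, not part of the original slides: the helper names `generate` and `balanced` are hypothetical, and the length bound 6 is an arbitrary choice. It closes the set {ε} under the rules S → aSb and S → SS and compares the result with a direct nesting check.

```python
from itertools import product

def generate(max_len):
    """Strings of length <= max_len derivable from S -> aSb | SS | ε."""
    lang = {""}                               # S -> ε
    while True:
        new = set(lang)
        for w in lang:                        # S -> aSb
            if len(w) + 2 <= max_len:
                new.add("a" + w + "b")
        for u in lang:                        # S -> SS
            for v in lang:
                if len(u) + len(v) <= max_len:
                    new.add(u + v)
        if new == lang:                       # nothing new: fixed point reached
            return lang
        lang = new

def balanced(w):
    """Direct check: a opens, b closes, depth never negative and ends at 0."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0

words = generate(6)
expected = {"".join(p)
            for n in range(7)
            for p in product("ab", repeat=n)
            if balanced("".join(p))}
print(words == expected)
print(sorted(w.translate(str.maketrans("ab", "()")) for w in words if len(w) == 6))
```

The comparison only covers strings up to the chosen bound, so it illustrates the claim rather than proving it.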


Example

Give some examples of context-free languages.

(a) The grammar G = ({S}, {a, b}, S, P) with productions
S → aSa
S → bSb
S → λ
is context free. A typical derivation is
S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabbaa
Thus we have L(G) = {ww^R : w ∈ {a, b}*}, where w^R denotes the reverse of w.
This language is context free.
• The grammar G, with production rules given by
S → abB
A → aaBb
B → bbAa
A → λ
is context free.
• The language is
L(G) = {ab (bbaa)^n bba (ba)^n : n ≥ 0}
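
The closed form for L(G) can be compared against the strings the grammar actually derives. The sketch below is an illustration under stated assumptions: the function name `language`, the dictionary encoding of the rules, and the bound of 17 symbols are all hypothetical choices. It explores leftmost derivations breadth-first and prunes any sentential form with too many terminals. That pruning is enough to terminate for this grammar, because every sentential form contains at most one non-terminal; it is not a general-purpose enumerator.

```python
from collections import deque

# S -> abB, A -> aaBb | λ, B -> bbAa  (λ is written as the empty string)
RULES = {"S": ["abB"], "A": ["aaBb", ""], "B": ["bbAa"]}

def language(rules, start, max_len):
    """Terminal strings of length <= max_len reachable by leftmost derivations."""
    seen, words = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if any.
        idx = next((i for i, c in enumerate(form) if c in rules), None)
        if idx is None:
            words.add(form)                   # no non-terminal left: a sentence
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(c not in rules for c in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

derived = language(RULES, "S", 17)
formula = {"ab" + "bbaa" * n + "bba" + "ba" * n for n in range(3)}
print(derived == formula)
print(sorted(derived, key=len))
```

For n = 0, 1, 2 the formula gives strings of length 5, 11, and 17, which is why the bound 17 captures exactly three strings.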


Right-Linear Grammar

In general, productions have the form
(V ∪ T)+ → (V ∪ T)*.
In a right-linear grammar, all productions have one of the two forms
V → T*V or V → T*,
i.e., the left-hand side is a single variable and the right-hand side consists of any number of terminals (members of T), optionally followed by a single variable.
Left-Linear Grammar

• In a left-linear grammar, all productions have one of the two forms
V → VT* or V → T*.
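
The two rule shapes can be tested mechanically. The following is a small sketch, assuming single-character symbols and rules written as (left-hand side, right-hand side) pairs; the function names and the two sample grammars are illustrative choices, not notation from the slides.

```python
def is_right_linear(rules, variables):
    """True if every rule has the form V -> T*V or V -> T*."""
    for lhs, rhs in rules:
        if lhs not in variables:
            return False
        # Drop one trailing variable, if present; the rest must be terminals.
        body = rhs[:-1] if rhs and rhs[-1] in variables else rhs
        if any(sym in variables for sym in body):
            return False
    return True

def is_left_linear(rules, variables):
    """True if every rule has the form V -> VT* or V -> T*."""
    # Reversing each right-hand side turns V -> VT* into V -> T*V.
    return is_right_linear([(lhs, rhs[::-1]) for lhs, rhs in rules], variables)

variables = {"S", "A"}
left = [("S", "Aa"), ("A", "ab")]       # S -> Aa, A -> ab
right = [("S", "abA"), ("A", "a")]      # S -> abA, A -> a

print(is_left_linear(left, variables), is_right_linear(left, variables))    # True False
print(is_right_linear(right, variables), is_left_linear(right, variables))  # True False
```

A rule with only terminals on the right, such as A → ab, satisfies both forms; it is the position of the single variable that separates the two kinds of grammar.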
Conversion of Left-linear Grammar into Right-Linear
Grammar

Algorithm (preliminary step)

• If the left-linear grammar has a rule with the start symbol S on the right-hand side, add the rule
S0 → S
and make S0 the new start symbol.
Symbols used by the algorithm
• Let S denote the start symbol
• Let A, B denote non-terminal symbols
• Let p denote zero or more terminal symbols
• Let ε denote the empty string

Algorithm
1) If the left-linear grammar has a rule S → p, then make that a rule in the right-linear grammar.
2) If the left-linear grammar has a rule A → p, then add the following rule to the right-linear grammar: S → pA
3) If the left-linear grammar has a rule B → Ap, then add the following rule to the right-linear grammar: A → pB
4) If the left-linear grammar has a rule S → Ap, then add the following rule to the right-linear grammar: A → p
5) If the left-linear grammar has a rule S → A, then add the following rule to the right-linear grammar: A → ε
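
The five rules can be sketched in a few lines of code. This is an illustration under stated assumptions, not a definitive implementation: the tuple encoding, the function name `left_to_right_linear`, and the choice to name the new start symbol by appending "0" (assumed not to clash with an existing symbol) are all hypothetical. A left-linear rule is encoded as (lhs, var, p), meaning lhs → var p with var possibly absent; a right-linear rule is returned as (lhs, p, var), meaning lhs → p var.

```python
def left_to_right_linear(rules, start):
    """Convert left-linear rules (lhs, var, p) into right-linear rules (lhs, p, var)."""
    # Preliminary step: if the start symbol occurs on a right-hand side,
    # add a fresh start symbol S0 with the rule S0 -> S.
    if any(var == start for _, var, _ in rules):
        new_start = start + "0"               # assumed to be an unused name
        rules = [(new_start, start, "")] + list(rules)
        start = new_start
    right = []
    for lhs, var, p in rules:
        if lhs == start and var is None:
            right.append((start, p, None))    # 1) S -> p    stays S -> p
        elif var is None:
            right.append((start, p, lhs))     # 2) A -> p    gives S -> pA
        elif lhs != start:
            right.append((var, p, lhs))       # 3) B -> Ap   gives A -> pB
        else:
            right.append((var, p, None))      # 4), 5) S -> Ap gives A -> p (p may be ε)
    return right, start

# Sample input: S -> Ab | Sb, A -> Aa | a
rules = [("S", "A", "b"), ("S", "S", "b"), ("A", "A", "a"), ("A", None, "a")]
right, start = left_to_right_linear(rules, "S")
print("start symbol:", start)
for lhs, p, var in right:
    print(lhs, "->", (p + (var or "")) or "ε")
```

Rule 5 needs no separate branch in this encoding because it is rule 4 with p equal to the empty string.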
Convert this left linear grammar

left linear
S → Aa
A → ab

Rule 2) applies to A → ab (the right-hand side has only terminals): add S → abA to the right-linear grammar.
Rule 4) applies to S → Aa (the right-hand side of S has a non-terminal): add A → a to the right-linear grammar.

left linear      right linear
S → Aa           S → abA
A → ab           A → a

Both grammars generate this language: {aba}
Convert this left linear grammar

original grammar
S → Ab
S → Sb
A → Aa
A → a

The start symbol S appears on a right-hand side (S → Sb), so make a new start symbol S0:

left linear
S0 → S
S → Ab
S → Sb
A → Aa
A → a

Right-hand side has only terminals: by rule 2), A → a gives S0 → aA.
Right-hand side has a non-terminal: by rule 3), S → Ab gives A → bS, A → Aa gives A → aA, and S → Sb gives S → bS.
Right-hand side of the start symbol has a non-terminal: by rule 4) with p empty, S0 → S gives S → ε.

Equivalent!

left linear      right linear
S0 → S           S0 → aA
S → Ab           A → bS
S → Sb           A → aA
A → Aa           S → bS
A → a            S → ε

Both grammars generate this language: a+b+ (one or more a's followed by one or more b's)
Derivation Trees
Def: Let G = (V, T, P, S) be a context-free grammar. An ordered tree is a derivation tree for G iff

1. The root is labeled S.
2. Every leaf has a label from T ∪ {ε}.
3. Every interior vertex has a label from V.
4. If a vertex has label A (a variable) and its children are labeled, from left to right, a1, a2, …, an, then P must contain a production A → a1 a2 … an.
5. A leaf labeled ε has no siblings.
• Yield:
• The string of terminals obtained by reading the leaves of the tree from left to right, omitting any ε's encountered, is called the yield of the tree.

Partial derivation tree

A tree that has properties 3, 4, and 5, but in which property 1 need not hold and property 2 is replaced by: every leaf has a label from V ∪ T ∪ {ε}.

Example: S → AB, A → aaA | ε, B → Bb | ε

Derivation (the tree grows by one production at each step):
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab

Derivation tree:

          S
        /   \
       A     B
     / | \   | \
    a  a  A  B  b
          |  |
          ε  ε

Yield: aab (the ε leaves are omitted)
Partial Derivation Trees
S → AB, A → aaA | ε, B → Bb | ε

S ⇒ AB
Partial derivation tree:

      S
     / \
    A   B

S ⇒ AB ⇒ aaAB
Partial derivation tree:

        S
      /   \
     A     B
   / | \
  a  a  A

The yield of this partial tree, aaAB, is a sentential form.
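
The yield of a derivation tree or of a partial derivation tree can be computed by a left-to-right traversal of the leaves. The following is a minimal sketch; the `Node` class and the function name `tree_yield` are hypothetical, and the two trees are those of the grammar S → AB, A → aaA | ε, B → Bb | ε for the derivations S ⇒ … ⇒ aab and S ⇒ AB ⇒ aaAB.

```python
class Node:
    """A tree vertex with a label and an ordered list of children."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def tree_yield(node):
    """Concatenate leaf labels from left to right, omitting ε leaves."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(tree_yield(child) for child in node.children)

# Complete derivation tree: every leaf is a terminal or ε.
tree = Node("S", [
    Node("A", [Node("a"), Node("a"), Node("A", [Node("ε")])]),
    Node("B", [Node("B", [Node("ε")]), Node("b")]),
])

# Partial derivation tree: leaves may still be variables.
partial = Node("S", [
    Node("A", [Node("a"), Node("a"), Node("A")]),
    Node("B"),
])

print(tree_yield(tree))     # aab
print(tree_yield(partial))  # aaAB
```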
Sometimes, derivation order doesn’t matter
Leftmost:
S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab

Rightmost:
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab

Same derivation tree:

          S
        /   \
       A     B
     / | \   | \
    a  a  A  B  b
          |  |
          ε  ε
Ambiguity
Definition:
A context-free grammar G is ambiguous if some string w ∈ L(G) has two or more derivation trees.
Example: E → E + E | E * E | (E) | a
String: a + a * a

First leftmost derivation:
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a

      E
    / | \
   E  +  E
   |   / | \
   a  E  *  E
      |     |
      a     a

Second leftmost derivation:
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a

        E
      / | \
     E  *  E
   / | \   |
  E  +  E  a
  |     |
  a     a

The grammar E → E + E | E * E | (E) | a is ambiguous: the string a + a * a has two derivation trees or, equivalently, two leftmost derivations.
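
Ambiguity of a particular string can be detected by counting its derivation trees. The sketch below is an illustration, assuming the grammar E → E + E | E * E | (E) | a and input written without spaces; the function name `count_trees` is hypothetical. It counts, for every substring, the number of trees rooted at E that derive it, trying each rule in turn.

```python
from functools import lru_cache

def count_trees(s):
    """Number of derivation trees for s under E -> E+E | E*E | (E) | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees rooted at E whose yield is s[i:j].
        total = 0
        if j - i == 1 and s[i] == "a":
            total += 1                                    # E -> a
        if j - i >= 3 and s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)                  # E -> (E)
        for k in range(i + 1, j - 1):                     # operator at position k
            if s[k] in "+*":
                total += count(i, k) * count(k + 1, j)    # E -> E+E or E -> E*E
        return total
    return count(0, len(s))

print(count_trees("a+a*a"))    # 2
print(count_trees("(a+a)*a"))  # 1
```

A count greater than 1 for some string shows that the grammar is ambiguous; a count of 1 for the strings tried does not by itself show that a grammar is unambiguous.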
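Why do we care about ambiguity?

Take a = 2 in a + a * a, which gives 2 + 2 * 2. The two derivation trees compute different values:

First tree (root +): the subtree 2 * 2 evaluates to 4, so the root evaluates to 2 + 4 = 6.

      E
    / | \
   E  +  E
   |   / | \
   2  E  *  E
      |     |
      2     2

Second tree (root *): the subtree 2 + 2 evaluates to 4, so the root evaluates to 4 * 2 = 8.

        E
      / | \
     E  *  E
   / | \   |
  E  +  E  2
  |     |
  2     2

Correct result: 2 + 2 * 2 = 6, which is the value of the first tree.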
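
The effect of the two trees on the computed value can be reproduced directly. This is a minimal sketch in which each tree is encoded as a nested tuple (left, operator, right); the encoding and the function name `evaluate` are assumptions made for illustration.

```python
def evaluate(tree):
    """Evaluate a tree given as an int leaf or a (left, op, right) tuple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    lhs, rhs = evaluate(left), evaluate(right)
    return lhs + rhs if op == "+" else lhs * rhs

tree_plus_root = (2, "+", (2, "*", 2))     # 2 + (2 * 2)
tree_times_root = ((2, "+", 2), "*", 2)    # (2 + 2) * 2

print(evaluate(tree_plus_root), evaluate(tree_times_root))  # 6 8
```

The same input string therefore has two meanings, depending only on which derivation tree the parser happens to build.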
• Ambiguity is bad for programming languages

• We want to remove ambiguity

We fix the ambiguous grammar:
E → E + E | E * E | (E) | a

New non-ambiguous grammar:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a

Derivation of a + a * a:
E ⇒ E + T ⇒ T + T ⇒ F + T ⇒ a + T ⇒ a + T * F ⇒ a + F * F ⇒ a + a * F ⇒ a + a * a

Unique derivation tree:

        E
      / | \
     E  +  T
     |   / | \
     T  T  *  F
     |  |     |
     F  F     a
     |  |
     a  a

The grammar G:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
is non-ambiguous: every string w ∈ L(G) has a unique derivation tree.
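
The layered grammar E → E + T | T, T → T * F | F, F → (E) | a maps naturally onto a hand-written parser. The sketch below is an illustration, not the method of the slides: it uses recursive descent, with each left-recursive rule replaced by a loop (a standard substitution, since recursive descent cannot follow left recursion directly), and it returns the tree as nested tuples. The function name `parse` and the tuple encoding are hypothetical, and the input is assumed to contain no spaces.

```python
def parse(s):
    """Parse s with E -> E+T | T, T -> T*F | F, F -> (E) | a; return a nested tuple."""
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else None

    def eat(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def expr():                     # E -> E + T | T, written as T (+ T)*
        node = term()
        while peek() == "+":
            eat("+")
            node = (node, "+", term())
        return node

    def term():                     # T -> T * F | F, written as F (* F)*
        node = factor()
        while peek() == "*":
            eat("*")
            node = (node, "*", factor())
        return node

    def factor():                   # F -> (E) | a
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("a")
        return "a"

    tree = expr()
    if pos != len(s):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree

print(parse("a+a*a"))      # ('a', '+', ('a', '*', 'a'))
print(parse("(a+a)*a"))    # (('a', '+', 'a'), '*', 'a')
```

Because * is handled one level below +, the multiplication is always grouped first, which matches the single derivation tree of the non-ambiguous grammar.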
Compiler

[Figure: the input program passes through the lexical analyzer and then the parser inside the compiler; the output is machine code.]
A parser knows the grammar of the
programming language

The parser finds the derivation of a particular input

Grammar known to the parser:
E -> E + E
   | E * E
   | INT

Input: 10 + 2 * 5

Derivation:
E => E + E
  => E + E * E
  => 10 + E * E
  => 10 + 2 * E
  => 10 + 2 * 5

Derivation tree:

      E
    / | \
   E  +  E
   |   / | \
  10  E  *  E
      |     |
      2     5
