
Grammars

Grammars express languages

Example: the English language


sentence → noun_phrase predicate
noun_phrase → article noun
predicate → verb

article → a
article → the

noun → boy
noun → dog

verb → runs
verb → walks
A derivation of “the boy walks”:

sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ the noun verb
⇒ the boy verb
⇒ the boy walks
A derivation of “a dog runs”:

sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ a noun verb
⇒ a dog verb
⇒ a dog runs
Language of the grammar:

L = { “a boy runs”,
“a boy walks”,
“the boy runs”,
“the boy walks”,
“a dog runs”,
“a dog walks”,
“the dog runs”,
“the dog walks” }
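The language of this small grammar can be enumerated mechanically. The following is a minimal sketch, assuming the grammar is stored in a dictionary named `GRAMMAR` (a name chosen here for illustration) and that the grammar is not recursive, which holds for this example only.

```python
from itertools import product

# Productions of the example grammar. Keys are variables;
# any symbol that is not a key is treated as a terminal.
GRAMMAR = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["boy"], ["dog"]],
    "verb": [["runs"], ["walks"]],
}

def expand(symbol):
    """Return every sequence of terminal words derivable from `symbol`."""
    if symbol not in GRAMMAR:            # a terminal derives only itself
        return [[symbol]]
    results = []
    for body in GRAMMAR[symbol]:
        # Combine every choice for each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in body)):
            results.append([word for part in parts for word in part])
    return results

language = sorted(" ".join(words) for words in expand("sentence"))
for sentence in language:
    print(sentence)
```

Two articles, two nouns and two verbs give eight sentences in total.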
Notation
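noun → boy
noun → dog

Each line is a production (or rule).
The left-hand symbol (noun) is a variable, also called a non-terminal.
The right-hand symbols (boy, dog) are terminals.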
Another Example

Grammar:
S → aSb
S → λ

Derivation of sentence ab:

S ⇒ aSb ⇒ ab

(first apply S → aSb, then S → λ)
Grammar:
S → aSb
S → λ

Derivation of sentence aabb:

S ⇒ aSb ⇒ aaSbb ⇒ aabb

(apply S → aSb twice, then S → λ)
Other derivations:

S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb

S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ aaaabbbb
Language of the grammar

S → aSb
S → λ

L = { a^n b^n : n ≥ 0 }
More Notation

Grammar G = (V, T, P, S)

V: a finite set of variables
T: a finite set of terminal symbols
S: start variable
P: set of production rules

V and T are assumed to be disjoint


Example
Grammar G:
S → aSb
S → λ

G = (V, T, P, S)

V = {S}
T = {a, b}
P = {S → aSb, S → λ}
More Notation

Sentential form:
a string that may contain both variables and terminals

Example:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb

S, aSb, aaSbb and aaaSbbb are sentential forms; aaabbb is a sentence.
We write:
S ⇒* aaabbb

Instead of:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
In general we write:
w_1 ⇒* w_n

If:
w_1 ⇒ w_2 ⇒ w_3 ⇒ ... ⇒ w_n
By default:
w ⇒* w
Example
Grammar:
S → aSb
S → λ

Derivations:
S ⇒* λ
S ⇒* ab
S ⇒* aabb
S ⇒* aaabbb
Example
Grammar:
S → aSb
S → λ

Derivations:
S ⇒* aaSbb
aaSbb ⇒* aaaaaSbbbbb
Another Grammar Example
Grammar G:
S → Ab
A → aAb
A → λ

Derivations:

S ⇒ Ab ⇒ aAbb ⇒ abb

S ⇒ Ab ⇒ aAbb ⇒ aaAbbb ⇒ aabbb
More Derivations
S ⇒ Ab ⇒ aAbb ⇒ aaAbbb ⇒ aaaAbbbb
⇒ aaaaAbbbbb ⇒ aaaabbbbb

S ⇒* aaaabbbbb

S ⇒* aaaaaabbbbbbb

S ⇒* a^n b^n b
Language of a Grammar

For a grammar G with start variable S:

L(G) = { w : S ⇒* w }

where w is a string of terminals
Example
For grammar G:
S → Ab
A → aAb
A → λ

L(G) = { a^n b^n b : n ≥ 0 }

Since: S ⇒* a^n b^n b
A Convenient Notation
A → aAb
A → λ
is written as: A → aAb | λ

article → a
article → the
is written as: article → a | the
Linear Grammars


Grammars with at most one variable on the right side of a production

Examples:

S → aSb
S → λ

S → Ab
A → aAb
A → λ
A Non-Linear Grammar
Grammar G:
S → SS
S → λ
S → aSb
S → bSa

Why S → SS? What will happen if S → S is used instead?

L(G) = { w : n_a(w) = n_b(w) }

n_a(w): number of a in string w
Another Linear Grammar
Grammar G:
S → A
A → aB | λ
B → Ab

L(G) = { a^n b^n : n ≥ 0 }
Right-Linear Grammars

All productions have form:
A → xB
or
A → x
where x is a string of terminals

Example:
S → abS
S → a
Left-Linear Grammars

All productions have form:
A → Bx
or
A → x
where x is a string of terminals

Example:
S → Aab
A → Aab | B
B → a
Regular Grammars


A regular grammar is any right-linear or left-linear grammar

Examples:

G1:
S → abS
S → a

G2:
S → Aab
A → Aab | B
B → a
Observation

Regular grammars generate regular languages

Examples:

G1:
S → abS
S → a
L(G1) = (ab)*a

G2:
S → Aab
A → Aab | B
B → a
L(G2) = aab(ab)*
Regular Grammars Generate Regular Languages
Theorem
{Languages generated by regular grammars} = {Regular languages}
Theorem - Part 1

{Languages generated by regular grammars} ⊆ {Regular languages}

Any regular grammar generates a regular language
Theorem - Part 2

{Languages generated by regular grammars} ⊇ {Regular languages}

Any regular language is generated by a regular grammar
Proof – Part 1
{Languages generated by regular grammars} ⊆ {Regular languages}

The language L(G) generated by any regular grammar G is regular
The case of Right-Linear Grammars

Let G be a right-linear grammar

We will prove: L(G) is regular

Proof idea: We will construct an NFA M with L(M) = L(G)
Grammar G is right-linear

Example:
S → aA | B
A → aaB
B → bB | a
Construct NFA M such that every state is a grammar variable, plus a special final state V_F:

States: S, A, B, V_F

S → aA | B
A → aaB
B → bB | a
Add edges for each production:

S → aA adds the edge S --a--> A
S → B adds the edge S --λ--> B
A → aaB adds the edges A --a--> (intermediate state) --a--> B
B → bB adds the loop B --b--> B
B → a adds the edge B --a--> V_F
The derivation
S ⇒ aA ⇒ aaaB ⇒ aaabB ⇒ aaaba
corresponds to the path
S --a--> A --a--> (intermediate state) --a--> B --b--> B --a--> V_F


Grammar G:
S → aA | B
A → aaB
B → bB | a

NFA M:
S --a--> A
S --λ--> B
A --a--> (intermediate state) --a--> B
B --b--> B
B --a--> V_F

L(M) = L(G) = aaab*a + b*a
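The construction in this example can be sketched in code. This is an illustrative sketch, not a reference implementation: the tuple encoding `(head, terminals, next_variable)`, the state name `"VF"` and the helper names `grammar_to_nfa`, `closure` and `accepts` are all assumptions made here. A production without a trailing variable uses `None`, and the empty string stands for a λ move.

```python
def grammar_to_nfa(productions, final="VF"):
    """Turn right-linear productions into NFA edges (source, symbol, target)."""
    edges = []
    fresh = 0                          # counter for intermediate states
    for head, terminals, nxt in productions:
        target = nxt if nxt is not None else final
        if terminals == "":            # e.g. S → B becomes a lambda edge
            edges.append((head, "", target))
            continue
        state = head
        for i, ch in enumerate(terminals):
            if i == len(terminals) - 1:
                dest = target
            else:
                dest = f"_{fresh}"     # intermediate node for long bodies
                fresh += 1
            edges.append((state, ch, dest))
            state = dest
    return edges

def closure(states, edges):
    """All states reachable from `states` using lambda edges only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for src, sym, dst in edges:
            if src == s and sym == "" and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(edges, start, final, word):
    """Simulate the NFA on `word`."""
    current = closure({start}, edges)
    for ch in word:
        moved = {dst for src, sym, dst in edges if src in current and sym == ch}
        current = closure(moved, edges)
    return final in current

# S → aA | B,  A → aaB,  B → bB | a
G = [("S", "a", "A"), ("S", "", "B"), ("A", "aa", "B"),
     ("B", "b", "B"), ("B", "a", None)]
M = grammar_to_nfa(G)
print(accepts(M, "S", "VF", "aaaba"))
print(accepts(M, "S", "VF", "aaab"))
```

The first call checks a string of the form aaab*a, the second a string outside the language.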
In General

A right-linear grammar G

has variables: V_0, V_1, V_2, ...

and productions: V_i → a_1 a_2 ... a_m V_j

or

V_i → a_1 a_2 ... a_m
We construct the NFA M such that:

each variable V_i corresponds to a node (V_0, V_1, V_2, V_3, V_4, ...)

and there is one special final state V_F
For each production: V_i → a_1 a_2 ... a_m V_j

we add transitions and intermediate nodes:

V_i --a_1--> ○ --a_2--> ... --a_m--> V_j
For each production: V_i → a_1 a_2 ... a_m

we add transitions and intermediate nodes:

V_i --a_1--> ○ --a_2--> ... --a_m--> V_F
Resulting NFA M looks like this:

[Figure: NFA M with nodes V_0, V_1, V_2, V_3, V_4 and final state V_F, joined by transitions labeled a_1, a_2, a_3, a_4, a_5, a_8, a_9]

It holds that: L(G) = L(M)


The case of Left-Linear Grammars

Let G be a left-linear grammar

We will prove: L(G) is regular

Proof idea:
We will construct a right-linear grammar G' with L(G) = L(G')^R
Since G is a left-linear grammar, the productions look like:

A → B a_1 a_2 ... a_k

A → a_1 a_2 ... a_k
Construct right-linear grammar G'

Left-linear G:
A → B a_1 a_2 ... a_k
that is, A → Bv

Right-linear G':
A → a_k ... a_2 a_1 B
that is, A → v^R B
Construct right-linear grammar G'

Left-linear G:
A → a_1 a_2 ... a_k
that is, A → v

Right-linear G':
A → a_k ... a_2 a_1
that is, A → v^R
It is easy to see that: L(G) = L(G')^R

Since G' is right-linear, we have:

L(G') is a regular language
so L(G')^R is a regular language
so L(G) is a regular language
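The reversal step can be tried out on the left-linear example G2 from earlier (S → Aab, A → Aab | B, B → a). The sketch below is illustrative only; the tuple encodings and the helper names `reverse_left_linear` and `generate` are assumptions introduced here. It builds G', lists the short strings of L(G'), and reverses them to obtain strings of L(G).

```python
def reverse_left_linear(productions):
    """(head, variable_or_None, terminals) means head → variable terminals.
    The result uses (head, terminals, variable_or_None): head → terminals variable."""
    return [(head, terminals[::-1], var) for head, var, terminals in productions]

def generate(productions, start, max_len):
    """All terminal strings of length <= max_len derived by right-linear rules."""
    found, seen, frontier = set(), set(), [("", start)]
    while frontier:
        prefix, var = frontier.pop()
        for head, terminals, nxt in productions:
            if head != var:
                continue
            word = prefix + terminals
            if len(word) > max_len:
                continue
            if nxt is None:
                found.add(word)            # derivation finished
            elif (word, nxt) not in seen:
                seen.add((word, nxt))      # guards against unit-rule loops
                frontier.append((word, nxt))
    return found

# Left-linear G2: S → Aab, A → Aab | B, B → a
G2 = [("S", "A", "ab"), ("A", "A", "ab"), ("A", "B", ""), ("B", None, "a")]
G2_reversed = reverse_left_linear(G2)      # S → baA, A → baA | B, B → a

words = sorted(w[::-1] for w in generate(G2_reversed, "S", 7))
print(words)
```

The reversed strings should all have the form aab(ab)*, matching L(G2) as stated on the earlier slide.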
Proof - Part 2
{Languages generated by regular grammars} ⊇ {Regular languages}

Any regular language L is generated by some regular grammar G
Any regular language L is generated by some regular grammar G

Proof idea:
Let M be the NFA with L = L(M).

Construct from M a regular grammar G such that L(M) = L(G)
Since L is regular there is an NFA M such that L = L(M)

Example:
M has states q0 (initial), q1, q2, q3 (final) and transitions
q0 --a--> q1
q1 --b--> q1
q1 --a--> q2
q2 --b--> q3
q3 --λ--> q1

L = ab*ab(b*ab)*
L = L(M)
Convert M to a right-linear grammar, one transition at a time:

q0 --a--> q1 gives q0 → aq1
q1 --b--> q1 gives q1 → bq1
q1 --a--> q2 gives q1 → aq2
q2 --b--> q3 gives q2 → bq3
The λ-transition q3 --λ--> q1 gives q3 → q1, and the final state q3 gives q3 → λ.

G:
q0 → aq1
q1 → bq1
q1 → aq2
q2 → bq3
q3 → q1
q3 → λ

L(G) = L(M) = L
In General
For any transition: q --a--> p

Add production: q → ap

(q and p are variables, a is a terminal)
For any final state: q_f

Add production: q_f → λ
Since G is a right-linear grammar,

G is also a regular grammar

with L(G) = L(M) = L
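The two rules above are simple enough to sketch directly. The function names `nfa_to_grammar` and `show` and the tuple encoding are assumptions made for this illustration; the transitions are those of the example NFA M with states q0 to q3.

```python
def nfa_to_grammar(transitions, finals):
    """Build right-linear rules (head, terminal, next_variable_or_None).
    An empty terminal stands for a lambda transition."""
    rules = [(q, a, p) for q, a, p in transitions]   # q --a--> p gives q → ap
    rules += [(q, "", None) for q in finals]         # final q_f gives q_f → λ
    return rules

def show(rule):
    """Format one rule in the notation used on the slides."""
    head, a, nxt = rule
    body = a + (nxt or "")
    return f"{head} → {body or 'λ'}"

# Example NFA M: q0 is the initial state and q3 the only final state.
M = [("q0", "a", "q1"), ("q1", "b", "q1"), ("q1", "a", "q2"),
     ("q2", "b", "q3"), ("q3", "", "q1")]
rules = nfa_to_grammar(M, finals=["q3"])
for r in rules:
    print(show(r))
```

The start variable of the resulting grammar is the initial state q0.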
For any regular language one can construct a left-linear grammar as well as a right-linear grammar.
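NFA:
S --a--> A
S --λ--> B
A --a--> (intermediate state) --a--> B
B --b--> B
B --a--> V_F

Right-linear grammar:
S → aA | B
A → aaB
B → bB | a

Left-linear grammar (here F is the start symbol):
F → Ba
B → Bb
B → Aaa
B → S
A → Sa
S → λ (this is a new rule, as S is now a final state)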
Right-linear grammar:
A → aB
B → bB
B → aC
C → bD
D → B
D → λ

Left-linear grammar (now D is the start variable):
D → Cb
C → Ba
B → Bb
B → Aa
B → D
A → λ
Language generated by both grammars for the NFA with states S, A, B, V_F:

aaab*a + b*a
Context-Free Languages

[Figure: the regular languages are contained in the context-free languages, which also include { a^n b^n } and { ww^R }]
Context-free languages are described by:

Context-free grammars
Pushdown automata (an automaton with a stack)
Context-Free Grammars

Example
A context-free grammar G:
S → aSb
S → λ

A derivation:

S ⇒ aSb ⇒ aaSbb ⇒ aabb
S → aSb
S → λ

L(G) = { a^n b^n : n ≥ 0 }

Reading a as "(" and b as ")", this describes nested parentheses: (((( ))))
A context-free grammar G:
S → aSa
S → bSb
S → λ

Another derivation:

S ⇒ aSa ⇒ abSba ⇒ abaSaba ⇒ abaaba
S → aSa
S → bSb
S → λ

L(G) = { ww^R : w ∈ {a, b}* }
Example

A context-free grammar G:
S → aSb
S → SS
S → λ

A derivation:

S ⇒ SS ⇒ aSbS ⇒ abS ⇒ ab
A context-free grammar G:
S → aSb
S → SS
S → λ

A derivation:

S ⇒ SS ⇒ aSbS ⇒ abS ⇒ abaSb ⇒ abab
S → aSb
S → SS
S → λ

L(G) = { w : n_a(w) = n_b(w), and n_a(v) ≥ n_b(v) in any prefix v }

Reading a as "(" and b as ")", these are the balanced strings: () ((( ))) (( ))
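The description of L(G) above translates into a direct membership check. This is a small sketch that tests the two stated conditions (equal counts overall, and no prefix with more b than a); it does not parse with the grammar itself, and the function name `in_language` is chosen here for illustration.

```python
def in_language(w):
    """True if w has as many a's as b's and no prefix has more b's than a's."""
    depth = 0                      # n_a(v) - n_b(v) for the prefix read so far
    for ch in w:
        if ch == "a":
            depth += 1
        elif ch == "b":
            depth -= 1
        else:
            return False           # symbol outside the alphabet {a, b}
        if depth < 0:              # some prefix has more b's than a's
            return False
    return depth == 0              # counts must match at the end

for w in ["", "ab", "abab", "aabb", "ba", "aab"]:
    print(repr(w), in_language(w))
```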
Definition: Context-Free
Grammars
Grammar G = (V, T, P, S)

V: variables
T: terminal symbols
S: start variable

Productions P of the form:

A → x

where A is a variable and x is a string of variables and terminals
Definition: Context-Free
Languages

A language L is context-free

if and only if

there is a context-free grammar G with L = L(G)
Derivation Order
1. S → AB
2. A → aaA
3. A → λ
4. B → Bb
5. B → λ

Leftmost derivation (production numbers in parentheses):
S ⇒(1) AB ⇒(2) aaAB ⇒(3) aaB ⇒(4) aaBb ⇒(5) aab

Rightmost derivation:
S ⇒(1) AB ⇒(4) ABb ⇒(5) Ab ⇒(2) aaAb ⇒(3) aab
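The two derivation orders can be replayed step by step. The sketch below is an illustration under stated assumptions: variables are single uppercase letters, the table `RULES` mirrors productions 1 to 5 above, and the function name `derive` is made up for this example.

```python
# Productions 1-5 of the grammar above; "" stands for lambda.
RULES = {1: ("S", "AB"), 2: ("A", "aaA"), 3: ("A", ""),
         4: ("B", "Bb"), 5: ("B", "")}

def derive(order, leftmost=True):
    """Apply the numbered productions in `order`, always rewriting the
    leftmost (or rightmost) variable of the current sentential form."""
    form, steps = "S", ["S"]
    for n in order:
        head, body = RULES[n]
        positions = [i for i, c in enumerate(form) if c.isupper()]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"production {n} does not apply to {form!r}")
        form = form[:i] + body + form[i + 1:]
        steps.append(form or "λ")
    return steps

print(" ⇒ ".join(derive([1, 2, 3, 4, 5], leftmost=True)))
print(" ⇒ ".join(derive([1, 4, 5, 2, 3], leftmost=False)))
```

Both orders use each production once and end in the same string.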
S → aAB
A → bBb
B → A | λ

Leftmost derivation:
S ⇒ aAB ⇒ abBbB ⇒ abAbB ⇒ abbBbbB
⇒ abbbbB ⇒ abbbb

Rightmost derivation:
S ⇒ aAB ⇒ aA ⇒ abBb ⇒ abAb
⇒ abbBbb ⇒ abbbb
Derivation Trees

Def:
G = (V, T, P, S)
An ordered tree is a derivation tree for G iff
1. The root is labeled S.
2. Every leaf has a label from T ∪ {λ}.
3. Every interior vertex has a label from V.
4. If a vertex has label A (a variable) and its children are labeled, from left to right, a_1, a_2, ..., a_n, then P must contain the production A → a_1 a_2 ... a_n.
5. A leaf labeled λ has no sibling.
Yield:

The string of terminals obtained by reading the leaves of the tree from left to right, omitting any λ's encountered, is called the yield of the tree.
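Reading off a yield is a short recursive walk. In this sketch a tree is encoded as nested `(label, children)` tuples with plain strings for leaves; that encoding and the name `tree_yield` are assumptions for illustration. The sample tree is a derivation tree for aab in the grammar S → AB, A → aaA | λ, B → Bb | λ.

```python
# Interior nodes are (label, children); leaves are plain strings.
TREE = ("S", [
    ("A", ["a", "a", ("A", ["λ"])]),
    ("B", [("B", ["λ"]), "b"]),
])

def tree_yield(node):
    """Concatenate the leaves from left to right, skipping lambda leaves."""
    if isinstance(node, str):
        return "" if node == "λ" else node
    _label, children = node
    return "".join(tree_yield(child) for child in children)

print(tree_yield(TREE))
```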
Partial derivation tree

A tree that has properties 3, 4 and 5, but in which property 1 need not hold and property 2 is replaced by:

every leaf has a label from V ∪ T ∪ {λ}
S → AB
A → aaA | λ
B → Bb | λ

The derivation tree grows with each step of the derivation:

S ⇒ AB (S gets children A, B)
⇒ aaAB (the left A gets children a, a, A)
⇒ aaABb (B gets children B, b)
⇒ aaBb (the lower A gets child λ)
⇒ aab (the lower B gets child λ)

Derivation tree:

        S
      /   \
     A     B
   / | \   | \
  a  a  A  B  b
        |  |
        λ  λ
S → AB
A → aaA | λ
B → Bb | λ

S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab

Derivation tree:

        S
      /   \
     A     B
   / | \   | \
  a  a  A  B  b
        |  |
        λ  λ

yield: a a λ λ b = aab
Partial Derivation Trees
S → AB
A → aaA | λ
B → Bb | λ

S ⇒ AB
Partial derivation tree:

     S
    / \
   A   B

S ⇒ AB ⇒ aaAB
Partial derivation tree:

        S
      /   \
     A     B
   / | \
  a  a  A

yield: aaAB, which is a sentential form
Sometimes, derivation order doesn't matter

Leftmost:
S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab

Rightmost:
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab

Same derivation tree:

        S
      /   \
     A     B
   / | \   | \
  a  a  A  B  b
        |  |
        λ  λ
Ambiguity

E → E + E | E * E | (E) | a

String: a + a * a

Leftmost derivation:
E ⇒ E + E ⇒ a + E ⇒ a + E * E
⇒ a + a * E ⇒ a + a * a

      E
    / | \
   E  +  E
   |   / | \
   a  E  *  E
      |     |
      a     a
E → E + E | E * E | (E) | a

String: a + a * a

Another leftmost derivation:
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E
⇒ a + a * E ⇒ a + a * a

        E
      / | \
     E  *  E
   / | \   |
  E  +  E  a
  |     |
  a     a
The grammar E → E + E | E * E | (E) | a is ambiguous:

the string a + a * a has two derivation trees

      E                  E
    / | \              / | \
   E  +  E            E  *  E
   |   / | \        / | \   |
   a  E  *  E      E  +  E  a
      |     |      |     |
      a     a      a     a
The grammar E → E + E | E * E | (E) | a is ambiguous:

the string a + a * a has two leftmost derivations

E ⇒ E + E ⇒ a + E ⇒ a + E * E
⇒ a + a * E ⇒ a + a * a

E ⇒ E * E ⇒ E + E * E ⇒ a + E * E
⇒ a + a * E ⇒ a + a * a
Definition:
A context-free grammar G is ambiguous

if some string w ∈ L(G) has two or more derivation trees
In other words:

A context-free grammar G is ambiguous

if some string w ∈ L(G) has two or more leftmost derivations (or rightmost)
Why do we care about ambiguity?

Take a = 2 in the string a + a * a, so the string is 2 + 2 * 2.

The two derivation trees give two different values:

      E = 6              E = 8
    / | \              / | \
   E  +  E            E  *  E
   |   / | \        / | \   |
   2  E  *  E      E  +  E  2
      |     |      |     |
      2     2      2     2

Left tree: 2 + (2 * 2) = 6
Right tree: (2 + 2) * 2 = 8

Correct result: 2 + 2 * 2 = 6, which is the value of the left tree only.
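The effect of the two trees on the computed value can be checked with a tiny evaluator. The nested-tuple encoding `(left, operator, right)` and the names used below are assumptions for this illustration.

```python
def evaluate(tree):
    """Evaluate a tree given as an int leaf or a (left, op, right) tuple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l * r

# The two derivation trees of 2 + 2 * 2 under the ambiguous grammar.
plus_at_root = (2, "+", (2, "*", 2))     # multiplication is grouped first
times_at_root = ((2, "+", 2), "*", 2)    # addition is grouped first

print(evaluate(plus_at_root))   # 6
print(evaluate(times_at_root))  # 8
```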
• Ambiguity is bad for programming languages

• We want to remove ambiguity

We fix the ambiguous grammar:
E → E + E | E * E | (E) | a

New non-ambiguous grammar:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
E → E + T | T
T → T * F | F
F → (E) | a

Derivation of a + a * a:
E ⇒ E + T ⇒ T + T ⇒ F + T ⇒ a + T ⇒ a + T * F
⇒ a + F * F ⇒ a + a * F ⇒ a + a * a

Unique derivation tree:

        E
      / | \
     E  +  T
     |   / | \
     T  T  *  F
     |  |     |
     F  F     a
     |  |
     a  a
The grammar G:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
is non-ambiguous:
every string w ∈ L(G) has a unique derivation tree
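A hand-written parser for this grammar shows how the E/T/F layering fixes the grouping. The sketch below is a recursive-descent parser, which is a technique chosen here and not one named on the slides; the left-recursive rules E → E + T and T → T * F are handled with loops, since a recursive-descent procedure cannot call itself on the same input position. Trees are returned as nested tuples `(operator, left, right)`, an encoding assumed for this example.

```python
def parse(text):
    """Parse a string over a, +, *, ( and ) and return its tree."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def expr():                      # E → E + T | T
        node = term()
        while peek() == "+":
            take("+")
            node = ("+", node, term())
        return node

    def term():                      # T → T * F | F
        node = factor()
        while peek() == "*":
            take("*")
            node = ("*", node, factor())
        return node

    def factor():                    # F → (E) | a
        if peek() == "(":
            take("(")
            node = expr()
            take(")")
            return node
        take("a")
        return "a"

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree

print(parse("a + a * a"))
print(parse("(a + a) * a"))
```

For a + a * a the multiplication ends up below the addition, matching the unique derivation tree of the non-ambiguous grammar.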
Inherent Ambiguity

Some context-free languages have only ambiguous grammars

Example: L = { a^n b^n c^m } ∪ { a^n b^m c^m }

S → S1 | S2
S1 → S1 c | A
S2 → a S2 | B
A → aAb | λ
B → bBc | λ
The string a^n b^n c^n has two derivation trees:

Tree 1: S has child S1, whose children are S1 and c
Tree 2: S has child S2, whose children are a and S2
Compiler

input (program) → lexical analyzer → parser → output (machine code)
A parser knows the grammar
of the programming language

The parser finds the derivation of a particular input

Grammar known to the parser:
E -> E + E
| E * E
| INT

Input:
10 + 2 * 5

Derivation:
E => E + E
=> E + E * E
=> 10 + E * E
=> 10 + 2 * E
=> 10 + 2 * 5
Derivation:
E => E + E
=> E + E * E
=> 10 + E * E
=> 10 + 2 * E
=> 10 + 2 * 5

Derivation tree:

      E
    / | \
   E  +  E
   |   / | \
  10  E  *  E
      |     |
      2     5
