You are on page 1of 10

Eliminating left-recursion: three steps Consider the CFG below.

S → XX | Y
Recall: A CFG is left-recursive if it includes a variable A s.t. X → aXb | ǫ
+ Y → aY b | Z
A⇒ Aα .
Z → bZa | ǫ
Notice that
∗ ∗ ∗ ∗
S⇒ ǫ X⇒ ǫ Y ⇒ ǫ Z⇒ ǫ
We eliminate left-recursion in three steps.
Hence, all the variables in this grammar are what we will call
• eliminate ǫ-productions (impossible to generate ǫ!) “nullable.” So in order to eliminate the ǫ-productions in this
grammar, we must alter the grammar to take into account the
+
• eliminate cycles (A ⇒ A)
fact that instances of these variables in a derivation may
• eliminate left-recursion eventually be replaced by ǫ. So, for instance, we will replace

Z → bZa | ǫ
with
So we’ve got some constructions to learn.
Z → bZa | ba .
After elimination of ǫ-productions, we obtain
Let’s try an example of eliminating ǫ-productions before we
specify a construction. . . S → XX | X | Y
X → aXb | ab
Y → aY b | ab | Z
Z → bZa | ba

1 2
Eliminating ǫ-productions ∗
A is nullable if A ⇒ ǫ.

Pǫ is the set of all productions


Given a CFG G = (V, Σ, S, P ), a variable A ∈ V is nullable if

A→β
A⇒ ǫ.
s.t. A 6= β, β 6= ǫ, and P includes a production
The main step in the ǫ-production elimination algorithm then
A→α
is that the set P of productions is replaced with the set Pǫ of
all productions s.t. β can be obtained from α by deleting zero or more
A→β occurrences of nullable variables.

s.t. A 6= β, β 6= ǫ, and P includes a production


Example Let’s apply ǫ-production elimination to
A→α
S → XZ
s.t. β can be obtained from α by deleting zero or more X → aXb | ǫ
occurrences of nullable variables. Z → aZ | ZX | ǫ

What are the nullable variables?


Example Applying ǫ-production elimination to the CFG
What are the new productions?
G = ({S}, {a, b}, S, { S → aSb | ǫ })

yields the CFG

Gǫ = ({S}, {a, b}, S, { S → aSb | ab }) .

3 4
Eliminating ǫ-productions can greatly increase the size of a Eliminating cycles
grammar.
A grammar has a cycle if there is a variable A s.t.
Example Eliminating ǫ-productions from
+
A⇒ A.
S → A1A2 · · · An
A1 → ǫ We’ll call such variables cyclic.
A2 → ǫ
... If a grammar has no ǫ-productions, then all cycles can be
An → ǫ eliminated from G without affecting the language generated,
using a construction we will specify in a moment.
increases the number of productions from n + 1 to 2n − 1.
Let’s try an example first. Consider the grammar with
productions
S → X | Xb | SS
What about ambiguity?
X → S|a
Claim If G is unambiguous, so is Gǫ . + +
We have S ⇒ S and X ⇒ X.

We can eliminate occurrences of cyclic variables as rhs’s of


productions, thus eliminating cycles.

S → a | Xb | SS
X → Xb | SS | a

5 6
Given a CFG G = (V, Σ, S, P ), the main step in the cycle Replace the set P of productions with the set Pc of
elimination algorithm is to replace the set P of productions productions obtained from P by replacing
with the set Pc of productions obtained from P by replacing
• each production A → B where B is cyclic
• each production A → B where B is cyclic • with new productions A → α s.t. α is not a cyclic variable

• with new productions A → α s.t. α is not a cyclic variable and there is a production C → α s.t. B ⇒ C.

and there is a production C → α s.t. B ⇒ C.
Let’s consider an example illustrating the importance of
The resulting CFG is Gc = (V, Σ, S, Pc).
requiring that there be no ǫ-productions.

Note: You can probably convince yourself that Pc has no S → a | SS | ǫ


cycles, and that Gc generates the same languages as G.
Notice that S is a cyclic variable, since
Let’s try the construction. . . S ⇒ SS ⇒ S .
S → X | Xb | Y a
X → Y |b But S never appears as the rhs of a production, so the
Y → X|a construction does nothing.

Which variables are cyclic?

Which productions will be replaced? (BTW What does the ǫ-elimination algorithm do to this
grammar?)
With what?

7 8
Eliminating “immediate” left recursion If
A → Aα1 | Aα2 | · · · | Aαm | β1 | β2| · · · | βn
Let’s begin with an easy example, already considered: represents all the A-productions of the grammar, and no βi
begins with A, then we can replace these A-productions by
A → Ab | b
A → β1A′ | β2 A′| · · · | βnA′
This grammar is left-recursive, since A′ → α1A′ | α2 A′ | · · · | αm A′ | ǫ
+
A⇒ Ab .
If our grammar has S-productions
In this case, we can eliminate left recursion as follows: S → SX | SSb | XS | a

A → bA′ we can replace them with


A′ → bA′ | ǫ S → XSS ′ | aS ′
S ′ → XS ′ | SbS ′ | ǫ

More generally, we can eliminate “immediate” left recursion as Notice that this construction can fail to eliminate
follows. If left-recursion if we have the production
A → Aα1 | Aα2 | · · · | Aαm | β1 | β2| · · · | βn A→A!
represents all the A-productions of the grammar, and no βi For instance,
begins with A, then we can replace these A-productions by A → A | Ab | b
becomes
A → β1A′ | β2 A′| · · · | βnA′
A → bA′
A′ → α1A′ | α2 A′ | · · · | αm A′ | ǫ
A′ → A′ | bA′ | ǫ

9 10
If If
A → Aα1 | Aα2 | · · · | Aαm | β1 | β2| · · · | βn A → Aα1 | Aα2 | · · · | Aαm | β1 | β2| · · · | βn
represents all the A-productions of the grammar, and no βi represents all the A-productions of the grammar, and no βi
begins with A, then we can replace these A-productions by begins with A, then we can replace these A-productions by

A → β1A′ | β2 A′| · · · | βnA′ A → β1A′ | β2 A′| · · · | βnA′


A′ → α1A′ | α2 A′ | · · · | αm A′ | ǫ A′ → α1A′ | α2 A′ | · · · | αm A′ | ǫ

Another interesting special case. What if there are no βi’s? Notice also that this construction works only “locally”:

For example, That is, indirect recursion is not eliminated.


A → AA | Ab .
For example, if we apply this construction to both variables in
Then everything that can be derived from A has a variable in S → SX | SSb | XS | a
it, so A cannot appear in a derivation of a sentence. X → Sa | Xb
we obtain
And the construction handles this in an interesting way, S → XSS ′ | aS ′
yielding S ′ → XS ′ | SbS ′ | ǫ
′ ′ ′
A → AA | bA | ǫ X → SaX ′
but no A-productions. X ′ → bX ′ | ǫ
+
and so have S ⇒ SaX ′ SS ′ , for instance.

11 12
Here’s an algorithm that eliminates all left-recursion for any Arrange the variables in some order A1 , A2, . . . , An .
CFG without ǫ-productions and without cycles. for i := 1 to n do begin
for j := 1 to i − 1 do begin
Arrange the variables in some order A1 , A2, . . . , An . replace each production of the form Ai → Aj γ
by the productions Ai → δ1 γ | δ2 γ | · · · | δk γ
for i := 1 to n do begin where Aj → δ1 | δ2 | · · · | δk are all the current Aj -productions;
for j := 1 to i − 1 do begin end
replace each production of the form Ai → Aj γ eliminate the immediate left recursion among the Ai -productions;
by the productions Ai → δ1 γ | δ2 γ | · · · | δk γ
end
where Aj → δ1 | δ2 | · · · | δk are all the current Aj -productions;
end So at this point we have grammar
eliminate the immediate left recursion among the Ai -productions;
S → XSS ′ | aS ′
end
S ′ → XS ′ | SbS ′ | ǫ
Consider the grammar X → Xb | Sa | b
S → SX | SSb | XS | a and the next obligation is to replace the production
X → Xb | Sa | b
X → Sa
Let’s order the variables S, X:
with the productions
The first time through we simply eliminate immediate left X → XSS ′ a | aS ′ a .
recursion in S-productions, yielding
We then eliminate immediate left recursion among
S → XSS ′ | aS ′
S ′ → XS ′ | SbS ′ | ǫ X → XSS ′ a | aS ′ a | Xb | b .
X → Xb | Sa | b

13 14
Eliminating immediate left recursion among Let’s look at examples showing that this algorithm can fail if
the grammar has ǫ-productions or cycles.
X → XSS ′ a | Xb | b | aS ′ a
In the simplest case, when there is only one variable, call it X,
yields
the presence of a cycle implies that the grammar includes the
X → bX ′ | aS ′ aX ′
production
X ′ → SS ′ aX ′ | bX ′ | ǫ
X →X.
So the final result is Moreover, the whole left recursion elimination algorithm
S → XSS ′ | aS ′ reduces to elimination of immediate left recursion among
′ ′ ′
S → XS | SbS | ǫ X-productions.
′ ′ ′
X → bX | aS aX
And we have previously observed that our construction for
X ′ → SS ′ aX ′ | bX ′ | ǫ
immediate left recursion elimination is no good in the presence
of X → X.

For example, if the grammar is

X→X |a

the construction for eliminating immediate left recursion yields


X → aX ′
X′ → X′

What about more complex cycles?

15 16
S → X|b Here’s an example with an ǫ-production and no cycles:
X → S|a
S → XSa | b
Try ordering S, X. X → ǫ

First step: eliminate immediate left recursion in Try order S, X.


S-productions.
First step: eliminate immediate left recursion in
There is none. S-productions.

Next: replace production There is none.

X→S Next: replace any X-productions whose rhs begins with S.


with productions There are none.
X → X | b.
Last: eliminate immediate left recursion in the current
It remains only to eliminate immediate left recursion in the X-productions.
current X-productions, which are
There is none.
X → X | b | a.
So our left-recursion elimination algorithm leaves the grammar
As before, the presence of production X → X breaks our unchanged.
construction — which yields
+
Yet S ⇒ Sa, so the grammar is left-recursive.
X → bX ′ | aX ′
X′ → X′

17 18
So now we can take any grammar and eliminate left-recursion For next time
(in three steps), making it suitable for top-down parsing (with
backtracking!). Read 4.4.

Notice that this works even for ambiguous grammars.

Next time we’ll define the main component of a top-down


parser — the parsing table.

But practically speaking, we would also like to avoid


backtracking.

Next time we’ll see how this can be done for top-down parsing.

We’ll define the class of LL(1) grammars, suitable for


predictive parsing.

19 20

You might also like