Professional Documents
Culture Documents
1 / 23
More CFLs
i j k
• A = {a b c ∣ i ≤ j or i = k}
• B = {w ∣ w ∈ {a, b, c} contains the same number of as as bs and cs combined}
∗
m n m+n
• C = {1 +1 =1 ∣ m, n ≥ 1}; Σ = {1, +, =}
• D = (abb ∣ bbaa)
∗ ∗
R
• E = {w ∣ w ∈ {0, 1} and w
∗
is a binary number not divisible by 5}
2 / 23
Another proof that regular languages are context-free
We can encode the computation of a DFA on a string using a CFG
3 / 23
Another proof that regular languages are context-free
We can encode the computation of a DFA on a string using a CFG
3 / 23
Another proof that regular languages are context-free
We can encode the computation of a DFA on a string using a CFG
3 / 23
Another proof that regular languages are context-free
We can encode the computation of a DFA on a string using a CFG
3 / 23
Another proof that regular languages are context-free
We can encode the computation of a DFA on a string using a CFG
r0 ⇒ w1 r1 ⇒ w1 w2 r2 ⇒ ⋯ ⇒ w1 w2 ⋯wn rn
3 / 23
Another proof that regular languages are context-free
We can encode the computation of a DFA on a string using a CFG
r0 ⇒ w1 r1 ⇒ w1 w2 r2 ⇒ ⋯ ⇒ w1 w2 ⋯wn rn
So G has derived the string wrn but this still has a variable
3 / 23
Formally
Proof.
Given a DFA M = (Q, Σ, δ, q0 , F ), we can construct an equivalent CFG
G = (V, Σ, R, S) where
V =Q
S = q0
R = {q → tr ∶ δ(q, t) = r} ∪ {q → ε ∶ q ∈ F }
4 / 23
Formally
Proof.
Given a DFA M = (Q, Σ, δ, q0 , F ), we can construct an equivalent CFG
G = (V, Σ, R, S) where
V =Q
S = q0
R = {q → tr ∶ δ(q, t) = r} ∪ {q → ε ∶ q ∈ F }
∗
By construction r0 ⇒ w1 r1 ⇒ w1 w2 r2 ⇒ w1 w2 ⋯wn rn
∗
Therefore, w ∈ L(M ) iff rn ∈ F iff rn ⇒ ε iff q0 ⇒ w iff w ∈ L(G)
4 / 23
Returning to our language
R
E = {w ∣ w ∈ {0,1} and w
∗
is a binary number not divisible by 5}
0 Q1
1 0
Q0 1 Q2
0
1
1
Q4 Q3
1 0
5 / 23
Returning to our language
R
E = {w ∣ w ∈ {0,1} and w
∗
is a binary number not divisible by 5}
Q1
0
1 0 Q0 → 0Q0 ∣ 1Q2
Q1 → 0Q3 ∣ 1Q0 ∣ ε
Q0 1 Q2 Q2 → 0Q1 ∣ 1Q3 ∣ ε
0 Q3 → 0Q4 ∣ 1Q1 ∣ ε
1
Q4 → 0Q2 ∣ 1Q4 ∣ ε
1
Q4 Q3
1 0
5 / 23
Chomsky Normal Form (CNF)
A CFG G = (V, Σ, R, S) is in Chomsky Normal Form if all rules have one of these
forms
• S→ε where S is the start variable
• A → BC where A ∈ V and B, C ∈ V ∖ {S}
• A→t where A ∈ V and t ∈ Σ
Note
• The only rule with ε on the right has the start variable on the left
• The start variable doesn’t appear on the right hand side of any rule
6 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S
T → AU ∣ BV ∣ a ∣ b
U → TA
V → TB
A→a
B→b
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b
U → TA
V → TB
A→a
B→b
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA
V → TB
A→a
B→b
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA ⇒ bT B
V → TB
A→a
B→b
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA ⇒ bT B
V → TB ⇒ bAU B
A→a
B→b
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA ⇒ bT B
V → TB ⇒ bAU B
A→a ⇒ baU B
B→b
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA ⇒ bT B
V → TB ⇒ bAU B
A→a ⇒ baU B
B→b ⇒ baT AB
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA ⇒ bT B
V → TB ⇒ bAU B
A→a ⇒ baU B
B→b ⇒ baT AB
⇒ baaAB
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA ⇒ bT B
V → TB ⇒ bAU B
A→a ⇒ baU B
B→b ⇒ baT AB
⇒ baaAB
⇒ baaaB
7 / 23
CNF example
R
Let A = {w ∣ w ∈ {a, b} and w = w }.
∗
S → AU ∣ BV ∣ a ∣ b ∣ ε S ⇒ BV
T → AU ∣ BV ∣ a ∣ b ⇒ bV
U → TA ⇒ bT B
V → TB ⇒ bAU B
A→a ⇒ baU B
B→b ⇒ baT AB
⇒ baaAB
⇒ baaaB
⇒ baaab
7 / 23
Converting to CNF
Theorem
Every context-free language A is generated by some CFG in CNF.
Proof.
Given a CFG G = (V, Σ, R, S) generating A, we construct a new CFG
G = (V , Σ, R , S ) in CNF generating A.
′ ′ ′ ′
8 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
′
DEL-ε For each rule of the form A → ε other than S → ε remove A → ε and
update all rules with A in the RHS
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
′
DEL-ε For each rule of the form A → ε other than S → ε remove A → ε and
update all rules with A in the RHS
• B → A. Add rule B → ε unless B → ε has already been removed
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
′
DEL-ε For each rule of the form A → ε other than S → ε remove A → ε and
update all rules with A in the RHS
• B → A. Add rule B → ε unless B → ε has already been removed
• B → AA. Add rule B → A and if B → ε has not already been
removed, add it
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
′
DEL-ε For each rule of the form A → ε other than S → ε remove A → ε and
update all rules with A in the RHS
• B → A. Add rule B → ε unless B → ε has already been removed
• B → AA. Add rule B → A and if B → ε has not already been
removed, add it
• B → xA or B → Ax. Add rule B → x
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
′
DEL-ε For each rule of the form A → ε other than S → ε remove A → ε and
update all rules with A in the RHS
• B → A. Add rule B → ε unless B → ε has already been removed
• B → AA. Add rule B → A and if B → ε has not already been
removed, add it
• B → xA or B → Ax. Add rule B → x
UNIT For each rule A → B, remove it and add rules A → u for each B → u
unless A → u is a unit rule already removed
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
′
DEL-ε For each rule of the form A → ε other than S → ε remove A → ε and
update all rules with A in the RHS
• B → A. Add rule B → ε unless B → ε has already been removed
• B → AA. Add rule B → A and if B → ε has not already been
removed, add it
• B → xA or B → Ax. Add rule B → x
UNIT For each rule A → B, remove it and add rules A → u for each B → u
unless A → u is a unit rule already removed
TERM For each t ∈ Σ, add a new variable T and a rule T → t; replace each t in
the RHS of nonunit rules with T
9 / 23
Proof continued
In the following x ∈ V ∪ Σ and u ∈ (Σ ∪ V )
+
′ ′
START Add a new start variable S and a rule S → S
BIN Replace each rule A → xu with the rules A → xA1 and A1 → u and
repeat until the RHS of every rule has length at most two
′
DEL-ε For each rule of the form A → ε other than S → ε remove A → ε and
update all rules with A in the RHS
• B → A. Add rule B → ε unless B → ε has already been removed
• B → AA. Add rule B → A and if B → ε has not already been
removed, add it
• B → xA or B → Ax. Add rule B → x
UNIT For each rule A → B, remove it and add rules A → u for each B → u
unless A → u is a unit rule already removed
TERM For each t ∈ Σ, add a new variable T and a rule T → t; replace each t in
the RHS of nonunit rules with T
Each of the five steps preserves the language generated by the grammar so
L(G ) = A.
′
9 / 23
Example
Convert to CNF
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
START:
10 / 23
Example
Convert to CNF
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
START:
S→A
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
10 / 23
Example
Convert to CNF
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
START:
S→A
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
10 / 23
Example
Convert to CNF
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
START:
S→A
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε:
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
START:
S→A
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε:
A → BAB ∣ B ∣ ε S→A∣ε
B → 00 ∣ ε A → BA1 ∣ B
B → 00 ∣ ε
START:
S→A A1 → AB ∣ B
A → BAB ∣ B ∣ ε
B → 00 ∣ ε
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε:
A → BAB ∣ B ∣ ε S→A∣ε
B → 00 ∣ ε A → BA1 ∣ B
B → 00 ∣ ε
START:
S→A A1 → AB ∣ B
A → BAB ∣ B ∣ ε Remove B → ε:
B → 00 ∣ ε
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε:
A → BAB ∣ B ∣ ε S→A∣ε
B → 00 ∣ ε A → BA1 ∣ B
B → 00 ∣ ε
START:
S→A A1 → AB ∣ B
A → BAB ∣ B ∣ ε Remove B → ε:
B → 00 ∣ ε S→A∣ε
A → BA1 ∣ B ∣ A1
BIN: Replace A → BAB:
S→A B → 00
A → BA1 ∣ B ∣ ε A1 → AB ∣ B ∣ A ∣ ε
B → 00 ∣ ε Don’t add A → ε because we
A1 → AB already removed it
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε: Remove A1 → ε:
A → BAB ∣ B ∣ ε S→A∣ε
B → 00 ∣ ε A → BA1 ∣ B
B → 00 ∣ ε
START:
S→A A1 → AB ∣ B
A → BAB ∣ B ∣ ε Remove B → ε:
B → 00 ∣ ε S→A∣ε
A → BA1 ∣ B ∣ A1
BIN: Replace A → BAB:
S→A B → 00
A → BA1 ∣ B ∣ ε A1 → AB ∣ B ∣ A ∣ ε
B → 00 ∣ ε Don’t add A → ε because we
A1 → AB already removed it
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε: Remove A1 → ε:
A → BAB ∣ B ∣ ε S→A∣ε S→A∣ε
B → 00 ∣ ε A → BA1 ∣ B A → BA1 ∣ B ∣ A1
B → 00 ∣ ε B → 00
START:
S→A A1 → AB ∣ B A1 → AB ∣ B ∣ A
A → BAB ∣ B ∣ ε Remove B → ε: Don’t add A → ε because we
B → 00 ∣ ε S→A∣ε already removed it
A → BA1 ∣ B ∣ A1
BIN: Replace A → BAB:
S→A B → 00
A → BA1 ∣ B ∣ ε A1 → AB ∣ B ∣ A ∣ ε
B → 00 ∣ ε Don’t add A → ε because we
A1 → AB already removed it
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε: Remove A1 → ε:
A → BAB ∣ B ∣ ε S→A∣ε S→A∣ε
B → 00 ∣ ε A → BA1 ∣ B A → BA1 ∣ B ∣ A1
B → 00 ∣ ε B → 00
START:
S→A A1 → AB ∣ B A1 → AB ∣ B ∣ A
A → BAB ∣ B ∣ ε Remove B → ε: Don’t add A → ε because we
B → 00 ∣ ε S→A∣ε already removed it
A → BA1 ∣ B ∣ A1
BIN: Replace A → BAB: UNIT: Remove S → A
S→A B → 00
A → BA1 ∣ B ∣ ε A1 → AB ∣ B ∣ A ∣ ε
B → 00 ∣ ε Don’t add A → ε because we
A1 → AB already removed it
10 / 23
Example
Convert to CNF DEL-ε: Remove A → ε: Remove A1 → ε:
A → BAB ∣ B ∣ ε S→A∣ε S→A∣ε
B → 00 ∣ ε A → BA1 ∣ B A → BA1 ∣ B ∣ A1
B → 00 ∣ ε B → 00
START:
S→A A1 → AB ∣ B A1 → AB ∣ B ∣ A
A → BAB ∣ B ∣ ε Remove B → ε: Don’t add A → ε because we
B → 00 ∣ ε S→A∣ε already removed it
A → BA1 ∣ B ∣ A1
BIN: Replace A → BAB: UNIT: Remove S → A
B → 00
S→A S → BA1 ∣ B ∣ A1 ∣ ε
A → BA1 ∣ B ∣ ε A1 → AB ∣ B ∣ A ∣ ε
A → BA1 ∣ B ∣ A1
B → 00 ∣ ε Don’t add A → ε because we B → 00
A1 → AB already removed it A1 → AB ∣ B ∣ A
10 / 23
Example continued
From previous slide
S → BA1 ∣ B ∣ A1 ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide
S → BA1 ∣ B ∣ A1 ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A
Remove S → B
11 / 23
Example continued
From previous slide
S → BA1 ∣ B ∣ A1 ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A
Remove S → B
S → BA1 ∣ A1 ∣ ε ∣ 00
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1
S → BA1 ∣ B ∣ A1 ∣ ε
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A
Remove S → B
S → BA1 ∣ A1 ∣ ε ∣ 00
A → BA1 ∣ B ∣ A1
B → 00
A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1
S → BA1 ∣ B ∣ A1 ∣ ε S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1 A → BA1 ∣ B ∣ A1
B → 00 B → 00
A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1
S → BA1 ∣ B ∣ A1 ∣ ε S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1 A → BA1 ∣ B ∣ A1
B → 00 B → 00
A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1
S → BA1 ∣ B ∣ A1 ∣ ε S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1 A → BA1 ∣ B ∣ A1
B → 00 B → 00
A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1 Remove A → A1
S → BA1 ∣ B ∣ A1 ∣ ε S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1 A → BA1 ∣ B ∣ A1
B → 00 B → 00
A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1 Remove A → A1
S → BA1 ∣ B ∣ A1 ∣ ε S → BA1 ∣ ε ∣ 00 ∣ AB S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1 A → BA1 ∣ B ∣ A1 A → BA1 ∣ 00 ∣ AB
B → 00 B → 00 B → 00
A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1 Remove A → A1
S → BA1 ∣ B ∣ A1 ∣ ε S → BA1 ∣ ε ∣ 00 ∣ AB S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1 A → BA1 ∣ B ∣ A1 A → BA1 ∣ 00 ∣ AB
B → 00 B → 00 B → 00
A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A
11 / 23
Example continued
From previous slide Remove S → A1 Remove A → A1
S → BA1 ∣ B ∣ A1 ∣ ε S → BA1 ∣ ε ∣ 00 ∣ AB S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ B ∣ A1 A → BA1 ∣ B ∣ A1 A → BA1 ∣ 00 ∣ AB
B → 00 B → 00 B → 00
A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A A1 → AB ∣ B ∣ A
Remove A1 → A
12 / 23
Example continued
Copied from the previous slide
S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ A ∣ 00
Remove A1 → A
S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ 00 ∣ BA1
12 / 23
Example continued
Copied from the previous slide TERM: Add Z → 0
S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ A ∣ 00
Remove A1 → A
S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ 00 ∣ BA1
12 / 23
Example continued
Copied from the previous slide TERM: Add Z → 0
S → BA1 ∣ ε ∣ 00 ∣ AB S → BA1 ∣ ε ∣ ZZ ∣ AB
A → BA1 ∣ 00 ∣ AB A → BA1 ∣ ZZ ∣ AB
B → 00 B → ZZ
A1 → AB ∣ A ∣ 00 A1 → AB ∣ ZZ ∣ BA1
Z→0
Remove A1 → A
S → BA1 ∣ ε ∣ 00 ∣ AB
A → BA1 ∣ 00 ∣ AB
B → 00
A1 → AB ∣ 00 ∣ BA1
12 / 23
Caution
Sipser gives a different procedure
1 START
2 DEL-ε
3 UNIT
4 BIN
5 TERM
This procedure works but can lead to an exponential blow up in the number of rules!
2∣G∣
In general, if DEL-ε comes before BIN, then ∣G ∣ is O(2 );
′
2
if BIN comes before DEL-ε, then ∣G ∣ is O(∣G∣ )
′
So use whichever procedure you’d like, but Sipser’s can be very bad
(Sipser’s is bad if you have long rules with lots of variables with ε-rules)
13 / 23
Example blow up
A → BCDEEDCB ∣ CBEDDEBC
B→0∣ε
C→1∣ε
D→2∣ε
E→3∣ε
Converting using START, BIN, DEL-ε, UNIT, TERM gives a CFG with 18 variables
and 125 rules
14 / 23
Example blow up
A → BCDEEDCB ∣ CBEDDEBC
B→0∣ε
C→1∣ε
D→2∣ε
E→3∣ε
Converting using START, BIN, DEL-ε, UNIT, TERM gives a CFG with 18 variables
and 125 rules
Converting using START, DEL-ε, UNIT, BIN, TERM gives a CFG with 1394 variables
and 1953 rules
14 / 23
Prefix
Recall Prefix(L) = {w ∣ for some x ∈ Σ , wx ∈ L}
∗
Theorem
The class of context-free languages is closed under Prefix.
15 / 23
Prefix
Recall Prefix(L) = {w ∣ for some x ∈ Σ , wx ∈ L}
∗
Theorem
The class of context-free languages is closed under Prefix.
Proof idea
R
Consider the language {w#w ∣ w ∈ {a, b} } generated by
∗
T → aT a ∣ bT b ∣ #
15 / 23
Prefix
Recall Prefix(L) = {w ∣ for some x ∈ Σ , wx ∈ L}
∗
Theorem
The class of context-free languages is closed under Prefix.
Proof idea
R
Consider the language {w#w ∣ w ∈ {a, b} } generated by
∗
T → aT a ∣ bT b ∣ #
Let’s convert to CNF
15 / 23
Prefix
Recall Prefix(L) = {w ∣ for some x ∈ Σ , wx ∈ L}
∗
Theorem
The class of context-free languages is closed under Prefix.
Proof idea
R
Consider the language {w#w ∣ w ∈ {a, b} } generated by
∗
T → aT a ∣ bT b ∣ #
Let’s convert to CNF
S → AU ∣ BV ∣ #
T → AU ∣ BV ∣ #
U → TA
V → TB
A→a
B→b
15 / 23
Derivation of ab#ba
S
S ⇒ AU
⇒ aU
⇒ aT A A U
⇒ aBV A
⇒ abV A a T A
⇒ abT BA
⇒ ab#BA
⇒ ab#bA B V a
⇒ ab#ba
We want to construct a CFG that keeps track of whether a given variable in the
derivation is
L left of the split,
S part of the split, or
R right of the split
We can construct a new CFG whose variables are ⟨A, L⟩, ⟨A, S⟩, or ⟨A, R⟩ where A is
a variable in the original CFG
′ ′
Now we just need to specify R . We’ll start with R = ∅ and add rules to it
19 / 23
Proof continued
Since L is nonempty, ε ∈ Prefix(L) so add the rule ⟨S, S⟩ → ε to R
′
′
For each rule of the form A → BC in R, add the following rules to R
⟨A, L⟩ → ⟨B, L⟩⟨C, L⟩ left of the split
⟨A, S⟩ → ⟨B, L⟩⟨C, S⟩ ∣ ⟨B, S⟩⟨C, R⟩ one of B or C is on the split
⟨A, R⟩ → ⟨B, R⟩⟨C, R⟩ right of the split
′
For each rule of the form A → t in R, add the following rules to R
⟨A, L⟩ → t
⟨A, S⟩ → t
⟨A, R⟩ → ε
20 / 23
Proof continued
∗
For each w = w1 w2 ⋯wn ∈ L, S ⇒ A1 A2 ⋯An where Ai ⇒ wi
By construction,
∗
⟨S, S⟩ ⇒ ⟨A1 , L⟩⋯⟨Ai−1 , L⟩⟨Ai , S⟩⟨Ai+1 , R⟩⋯⟨An , R⟩
∗
⇒ w1 w2 ⋯wi
for each 1 ≤ i ≤ n
′
I.e., G derives the prefix of every string in L
′
A similar argument works to show that if G derives a string then it’s a prefix of some
string in L
21 / 23
Applying the construction
Deriving ab# ⟨S, S⟩
⟨S, S⟩ ⇒ ⟨A, L⟩⟨U, S⟩
⇒ a⟨U, S⟩
⇒ a⟨T, S⟩⟨A, R⟩ ⟨A, L⟩ ⟨U, S⟩
⇒ a⟨B, L⟩⟨V, S⟩⟨A, R⟩
⇒ ab⟨V, S⟩⟨A, R⟩
a ⟨T, S⟩ ⟨A, R⟩
⇒ ab⟨T, S⟩⟨BA, R⟩
⇒ ab#⟨B, R⟩⟨A, R⟩
⇒ ab#⟨A, R⟩ ⟨B, L⟩ ⟨V, S⟩ ε
⇒ ab#
b ⟨T, S⟩⟨B, R⟩
# ε
22 / 23
Similarities with regular expression
Proving things about
• Regular languages. Assume there exists a regular expression that generates the
language and consider the six cases
• Context-free languages. Assume there exists a CFG that generates the language
and consider the three types of rules
23 / 23