You are on page 1of 43

Hierarchy of the Models

1
12. Hierarchy of the Models: Proper Containment
In Chapter 7 we saw the Chomsky hierarchy among languages, automata, and other models and proved the horizontal
relations (i.e., the characterization) at the lowest level of the hierarchy. Figures (a) and (b), respectively, are the copies of the
relations among the classes of languages and automata presented in Chapter 7 (see next slide). Recall that if a language is a
member of a class of languages at a lower level, then it is also a member of an upper class. If a language is recognizable by an
automaton at a lower level, then it is also recognizable by an automaton at an upper level. But the reverse of these relations does
not hold. An upper lever language class has a member that does not belong to a lower level class, and there is an automaton
belonging to an upper level whose language cannot be recognized by any automaton belonging to a lower level of the hierarchy.

TM LBA
Type-0L Type-3L
PDA
Type-1L Type-2L
FA

(a) Containment relations of (b) Containment relations of


the language classes automata capability

2
Hierarchy: Proper Containment
Review
The Chomsky Hierarchy
Languages (grammars) Machines Other Models

Recursively Turing Post System,


Machines
Enumerable (TM) Markov Algorithms,
Sets (type 0) µ -recursive Functions

Linear-bounded
Context-sensitive Automata .
Languages(type 1) (LBA) .
.
.
Context-free Pushdown ↓ : Containment
.
Languages(type 2) Automata
(PDA) ↔ : Characterization

Regular Finite State Regular


Languages(type3) Automata Expression
(FA)
3
Hierarchy: Proper Containment

To prove the proper containment relations of the hierarchy it requires in-depth logical arguments that we have developed in
Chapter 1. In this chapter we will prove the containment relations up to the class of context-sensitive languages. In Chapter 15
we will complete the proof of the hierarchy by proving the characterizations of context-free, context-sensitive and phrase
structured languages, together with the proof of the last proper containment relation between type 0 and type 1. The logics
involved in these proofs are so elegant that it worth a challenge.

12.1 Relationship of the language classes: proper containment 312


12.2 The pumping lemma 316
The pumping lemma and proof
12.3 Applications of the pumping lemma 323
Examples
12.4 The pumping lemma for context-free languages 328
12.5 Application of the pumping lemma for context-free languages 331
12.6 Ogden’s lemma 336
The lemma and an application
12.7 A proof of the pumping lemma for context-free languages 339
Rumination 346
Exercises 350

4
Hierarchy: Proper Containment
12.1 The containment relations
Before the proof of the proper containment relations (⊃) between two levels of the
hierarchy, we will prove the containment relations (⊇) by the following theorem.

Theorem 12.1 Let Type-iL denote type i (i = 0, 1, 2, 3) language class. Then for all i
< 3, the following relation holds.
Type-iL ⊇ Type-(i+1)L

Proof. Let α →β be a rule of a grammar G = (VT, VN, P, S). We know that by


definition, type 3 (regular) grammar rules should have the form (i.e., |α | = 1) of type
2 (context-free) grammar rules with additional restriction that either β = xB or β = x
, where x ∈ (VT)* and B ∈ VN. It follows that every type 3 grammar is a type 2
grammar, and hence, all type 3 languages belong to the class of type 2 languages, i.e.,
Type-2L ⊇ Type-3L.

5
Containment Relations Hierarchy: Proper Containment

Proof (continued). By definition, type 1 grammar (CSG) rules are type 0 grammar
rules with the additional restriction that the left side of each rule cannot be longer
than the right side. It follows that every type 1 (context-sensitive) grammar is also a
type 0 (phrase structured) grammar. That is,
Type-0L ⊇ Type-1L
Now, we prove that type 1 language class contains type 2 (context-free)
language class. Recall that type 1 grammar rules are non-contracting. In other
words, type 1 grammar rules are type 0 grammar rules with the restriction that for
any rule α →β , the left side cannot be longer than the right side, i.e., |α | ≤ |β |.
Type 2 grammar rules are type 0 grammar rules having one nonterminal symbol
on their left side, i.e., |α | = 1.
Both type 1 and type 2 grammars are defined by adding a restriction on type 0
grammar rules. So for the proof of the containment relation between type 1 and
type 2 language classes, we cannot apply the simple logic that we used for the
other relations between other language classes.

6
Containment Relations Hierarchy: Proper Containment

Clearly, since every CFG rule has exactly one nonterminal symbol on its
left side, as far as the left side of grammar rules are concerned, CFG rules
belong to CSG rules. The only problem is the ε -production rules allowed in
CFG’s, which violate the requirement that the right side of CSG rules cannot be
shorter than the left side, except for S → ε . All the other CFG rules, which do not
produce ε , are CSG rules. Fortunately, we have a way to overcome this problem.
Recall Theorem 10.1 repeated below.

Theorem 10.1 Given a CFG G, we can construct a CFG G' that satisfies the following
conditions.
• L(G) = L(G').
• If ε ∉ L(G), G' has no ε -production rule. Otherwise, S → ε is the only ε -
production rule in G'.

Grammar G' is a CSG because every rule in G' is non-contracting except for the
rule S → ε , if any. This implies that every CFL can be generated by a CSG. It
follows that Type-1L ⊇ Type-2L.

7
Containment Relations Hierarchy: Proper Containment

In summary, we showed the following relations.


Type-0L ⊇ Type-1L ⊇ Type-2L ⊇ Type-3L
Now, we will show that all these containment relations are proper, i.e.,
Type-0L ⊃ Type-1L ⊃ Type-2L ⊃ Type-3L
For the proof it is enough to show that for each i = 0, 1, 2, there is a language in class
Type-iL that does not belong to class Type-(i+1)L. In this chapter we will first show
that the familiar type 2 language {aibi | i ≥ 0} does not belong to Type-3L, i.e., the class
of regular languages. Then we will show that the language {aibici | i ≥ 0 }, which was
presented as a typical type 1 language in Chapter 2, does not belong to Type-2L. These
two results prove the following relations.
Type-1L ⊃ Type-2L ⊃ Type-3L
We will defer the proof of the last part Type-0L ⊃ Type-1L until Chapter 15.

8
Hierarchy: Proper Containment
12.2 The Pumping Lemma
To prove the relation Type-2L ⊃ Type-3L, it is enough to find a context-free
language that is not regular. Our approach is to first find a common property of all
regular languages, and then show a context-free language that does not satisfy this
property. Since every finite language is regular (we leave the proof of this claim for
the reader), the property we are looking for must be found among infinite regular
languages. The following lemma presents such property.

Pumping Lemma: For every infinite regular language L, there is a constant integer
n that satisfies the following conditions:

Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length is
greater than or equal to n, there are u, v, w ∈ ∑* such that
• z = uvw, |uv| ≤ n, |v| ≥ 1, and
• for all i ≥ 0, uviw ∈ L

9
Pumping Lemma Hierarchy: Proper Containment

Proof: For the proof we need the following two terminologies; a transition path is
a directed path on a state transition graph and a path label is the string constructed
by picking the input symbols along a transition path. For example, in the state
transition graph shown below, aabba is the path label on the state transition path 1
→ 2 → 4 → 3 → 2 → 4. Given the string aabba as an input, the DFA will take a
sequence of transitions along the transition path 1 → 2 → 4 → 3 → 2 → 4.
To help the reader understand the argument, we will use the following state
transition graph throughout the proof, together with other figures when needed.

a 2 a
a b 4
start 11
33
b
b

10
Pumping Lemma Hierarchy: Proper Containment

Given an infinite regular language L, let M be a DFA that recognizes L and let n
be the number of states of this automaton. For an input string z ∈ L, whose length is
greater than or equal to n (i.e., |z| ≥ n), examine the state transition path with path
label z.
Notice that because M is deterministic, there is only one transition path with path
label z. Since L is infinite and the number of states n is finite, there must be a string
z in L such that |z| ≥ n. Otherwise, L cannot be infinite.
Let m be the length of z (i.e., |z| = m ≥ n ) and let z = a1a2 …am.

a 2 a
a b 4
start 11
33
b
b

11
Pumping Lemma Hierarchy: Proper Containment

Since z ∈ L, the transition path with label z must end in an accepting state.
Because |z| ≥ n, the path involves a cycle and hence, there should be a state,
say q, visited a second time by reading some j-th symbol aj (j ≤ n) in z as
illustrated in figure (b) below. (This can be proved by the pigeonhole
principle. Refer the application example of the pigeonhole principle in
Chapter 1.)
|z| = m ≥ n
a 2 a
a b aj+1 .
4 ai .
start 11 . q . a
33
a1 .. m
b a2 ai+1 aj
start
b
M

(a) Transition path with path (b) Transition path with path label
label aabba z = a1a2 …am, m ≥ n

12
Pumping Lemma Hierarchy: Proper Containment

As shown in figure (b), let v be the path label of the cyclic transition path, and let
u and w be, respectively, the prefix and the suffix of z partitioned by v. (If q is the
start state, then u = ε , and if q is the last state in the path, w = ε .) Since the cycle
involves at least one edge, |v| ≥ 1. Since j ≤ n, we have |uv| ≤ n.

z = a1a2 …ai ai+1 . . . . aj aj+1 . . . .am


u v w
u = a, v = abb, w = a
|z| = m ≥ n |uv| ≤ n |v| ≥ 1

a 2 a
aj+1 .
a b 4 ai .
11 . q . a
start a1 .. aj m
33 a2 ai+1 w
b start
u
b v
M
(a) Transition path with path
label aabba (b) Transition path with path label
z = a1a2 …am, m ≥ n
13
Pumping Lemma Hierarchy: Proper Containment
Sine uvw is in L, M accepts uvw. It follows that all the following strings
corresponding to the path labels of the transition paths with the cycle repeated
zero or more times are also accepted by M.
uw = uv0w, uvvw = uv2w, uvvvw = uv3w, . . . .
That is, for all i ≥ 0, uviw ∈ L. (Refer to figure (a) for an example.)
z = a1a2 …ai ai+1 . . . . aj aj+1 . . . .am
u v w
u = a, v = abb, w = a
|z| = m ≥ n |uv| ≤ n |v| ≥ 1
For all i ≥ 0, a(abb)ia ∈ L
For all i ≥ 0, uviw ∈ L
a 2 a aj+1 .
ai .
a b . q . a
4 a1 .. aj m
start 11 a2 ai+1 w
start
33 u
b v
M
b
(a) Transition path with path (b) Transition path with path label
label aabba z = a1a2 …am, m ≥ n
14
Pumping Lemma Hierarchy: Proper Containment

Summarizing the observations, we get the pumping lemma repeated here.

Pumping Lemma: For every infinite regular language L, there is a constant integer
n which satisfies the following conditions:

Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length
is greater than or equal to n, there are u, v, w ∈ ∑* such that

• z = uvw, |uv| ≤ n, |v| ≥ 1, and


• for all i ≥ 0, uviw ∈ L

Job Applicant Bloopers Break Time

Interviewer: “Do you think you can handle a variety of tasks?”


Applicant: “I should say so. I’ve had nine totally different jobs in the past five months.”
- Jim -

15
Hierarchy: Proper Containment
12.3 Application of the Pumping Lemma
Now, as an application of the pumping lemma we will show that there is a CFL
which is not regular. Specifically, using the lemma, we will prove that the language
{aibi | i ≥ 1} does not satisfy the pumping lemma to show that Type-2L ⊃ Type-3L,
the proper containment relation between the classes of type 2 and type 3 in the
Chomsky hierarchy.
For the convenience of the application, the lemma is rewritten below to expose its
logical structure. (Notice the quantified parts highlighted.)

(1) For every infinite regular language L, // ∀ infinite regular language L,


(2) there exists a constant n such that // ∃ a constant n such that . . .
(3) for every string z ∈ L, such that | z | ≥ n, // ∀ string z ∈ L, | z | ≥ n
(4) there exist strings u, v, w ∈ ∑* such that // ∃ u, v, w that satisfy . . .
(i) z = uvw, (ii) |uv| ≤ n, (iii) |v| ≥ 1,
(iv) for all i ≥ 0, uviw ∈ L. // ∀ i ≥ 0, uviw ∈ L.

16
Application of the Pumping Lemma Hierarchy: Proper Containment

Before using the pumping lemma for the proof, recall the techniques (in Section
1.3) for dealing with existential quantification (∃ ) and universal quantification (∀) in
a statement to prove. In this example, we are going to prove that for the given
context-free language, the pumping lemma is not true. Hence, for the existentially
quantified parts of the lemma, i.e., the constant n and strings u, v, and w, we should
consider all possible cases. For the universally quantified parts, i.e., regular language
L, string z ∈ L, and integer i ≥ 0, it is enough to consider a single case, which can be
chosen for our convenience of the proof.
The argument of the proof will run as a game between the proponent of the lemma
and us who are going to show that the lemma does not hold for the given language.

(1) For every infinite regular language L, // ∀ infinite regular language


L,
(2) there exists a constant n such that // ∃ a constant n such that . . .
(3) for every string z ∈ L, such that | z | ≥ n, // ∀ string z ∈ L, | z | ≥
n
(4) there exist strings u, v, w ∈ ∑* such that // ∃ u, v, w that satisfy . . .
(i) z = uvw, (ii) |uv| ≤ n, (iii) |v| ≥ 1,
(iv) for all i ≥ 0, uviw ∈ L. // ∀ i ≥ 0, uviw ∈ L. 17
Application of the Pumping Lemma Hierarchy: Proper Containment

Now, we show that our familiar CFL L = {aibi | i ≥ 1} does not satisfy the pumping
lemma, and hence, it is not regular. Suppose that L is a regular language. (By this
sentence the reader will promptly recognize that we are going to use the proof-by-
contradiction technique.) We will carefully analyze the pumping lemma to find a
contradicting part against our supposition. (The numbers in the following arguments
match the ones that we put in the lemma.)

Pumping lemma says Our argument

(1) For every infinite regular L is infinite regular language. Hence,


language L, there should be a constant n such that
(2) there exists a constant n s.t. conditions (3) and (4) are satisfied.
(3) . . . . . Let n be just that constant. Using n as a
(4) . . . . variable, we are going to consider all
possible constant values of n in our
argument.

18
Application of the Pumping Lemma Hierarchy: Proper Containment

Pumping lemma says Our argument


(3) For all strings z ∈ L such that We choose string z = anbn in L, and examine
|z|≥ n, whether for all strings u, v, w satisfying
(4) there exist strings u, v and w conditions (i) – (iii), the condition (iv) is
that satisfy the following . satisfied.
(i) z = uvw, Let z = anbn = uvw according to (i). Then
(ii) |uv| ≤ n, by (ii) and (iii), v contains only (and at least
(iii) |v| ≥ 1, one) a’s. (See the figure at the bottom left.)
(iv) for all i ≥ 0, uviw ∈ L. It follows that string uv2w contains more a’s
than b’s. That is, for i = 2, we claim that
n n uv2w ∉ L.
The language L does not satisfy the
aa . . . . . . . .a bb . . . . . . . . .b
pumping lemma. This contradicts the
u v w
assumption that L is regular. Therefore, L is
|uv| ≤ n
not regular.

19
Application of the Pumping Lemma Hierarchy: Proper Containment

We know that L = {aibi | i ≥ 1 } is a CFL. Since we have just shown that L


is not regular, the containment relation Type-2L ⊇ Type-3L that we have
established in the previous section can be refined to the proper containment
relation, i.e., Type-2L ⊃ Type-3L.

Secrets of Success Break Time

The other day I had the opportunity to drop by my department head's office. He's a friendly guy and on the rare
opportunities that I have to pay him a visit, we have had enjoyable conversations. While I was in his office
yesterday I asked him, "Sir, What is the secret of your success?"
He said, "Two words."
"And, Sir, what are they?"
"Right decisions."
"But how do you make right decisions?"
"One word." He responded.
"And, sir, What is that?"
"Experience."
"And how do you get Experience?"
"Two words."
"And, Sir, what are they?"
"Wrong decisions."
- Rubin -

20
Hierarchy: Proper Containment
12.4 The pumping lemma for context-free
languages
Now, we prove the proper containment relation between the next upper two
levels of the language classes, i.e., Type-1L ⊃ Type-2L. We use the same approach
that we took for proving the containment relation Type-2L ⊃ Type-3L. With no
proof we first present a common property of infinite context-free languages, called
the pumping lemma for context-free languages (see below). Then, as an application
of the lemma, we will show that the context-sensitive language {aibici | i ≥ 1 },
which we introduced in Section 2.3, does not satisfy the lemma. We shall present
the proof of the lemma after showing the application.
The pumping lemma for context-free languages. For every infinite CFL L, there is
a constant integer p that satisfies the following conditions:

Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length is
greater than or equal to p, there are u, v, w ∈ ∑* such that
• z = uvwxy, |vwx| ≤ p, |vx| ≥ 1
• for all i ≥ 0, uviwxiy ∈ L

21
Pumping Lemma for CFL’s Hierarchy: Proper Containment

The pumping lemma for CFL’s looks very similar to the one for regular languages.
The figure below shows the difference. Both lemmas claim the existence of a constant
(i.e, n for regular languages and p for CFL’s). The lemma for regular languages divides
string z into three segments (i.e., z = uvw), while the lemma for CFL’s divides it into
five segments (i.e., z = uvwxy). The lemma for regular languages provides one site,
named v, to pump (see the up arrow), while the lemma for CFL’s provides two sites v
and x to pump simultaneously.

The lemma for CFL’s The lemma for regular language

• z = uvwxy, |vwx| ≤ p, |vx| ≥ 1 • z = uvw, |uv| ≤ n, |v| ≥ 1


• for all i ≥ 0, uviwxiy ∈ L • for all i ≥ 0, uviw ∈ L

u v w x y u v w
z

vwx| ≤ p |uv| ≤ n

22
Pumping Lemma for CFL’s Hierarchy: Proper Containment

Recall that when we applied the pumping lemma for regular languages, we took
advantage of the range of the pump site v that is restricted within the prefix of length
n of string z, given by the condition |uv| ≤ n.
Unfortunately, for the pumping lemma for CFL’s the segment vwx, which contains
the two pump sites, can be flanked by u to the left and by y to the right. So the pump
sites can be located anywhere in z in the window of width |vwx| ≤ p. When we apply
the lemma to prove a language is not context-free, we must consider all possible
locations of the window vwx, from the left end of z up to the right end.
The lemma for CFL’s The lemma for regular language

• z = uvwxy, |vwx| ≤ p, |vx| ≥ 1 • z = uvw, |uv| ≤ n, |v| ≥ 1


• for all i ≥ 0, uviwxiy ∈ L • for all i ≥ 0, uviw ∈ L

u v w x y u v w
z

vwx| ≤ p |uv| ≤ n

23
Hierarchy: Proper Containment

12.5 Application of the pumping lemma for CFL’s


Again, for the clarity of the argument, we rewrite the lemma as follows.

(1) For every infinite CFL L, // ∀ infinite CFL L,


(2) there exists a constant p such that // ∃ a constant p such that . . .
(3) for every string z ∈ L, such that | z | ≥ p, // ∀ string z ∈ L, | z | ≥ p
(4) there exist strings u, v, w, x and y that // ∃ u, v, w, x, y that . . .
satisfy the following conditions.
(i) z = uvwxy,
(ii) |vwx| ≤ p,
(iii) |vx| ≥ 1,
(iv) for all i ≥ 0, uviwxiy ∈ L. // ∀ i ≥ 0, uviwxiy ∈ L.

24
Application of CFL Pumping Lemma Hierarchy: Proper Containment

Now, we show that the CSL L = {aibici | i ≥ 1}, that we have introduced in
Section 2.3, does not satisfy the pumping lemma for CFL’s. Our argument is similar
to the one for proving that language {aibi | i ≥ 1} is not regular.
Suppose L is a CFL. It should satisfy the lemma. With L we walk along the lemma
and come out with a contradiction.

Pumping Lemma’s claim Our claim

(1) For every infinite CFL L, L is a infinite CFL. Hence, there should
(2) there exists a constant p s.t. be a constant p such that conditions (3)
and (4) are satisfied.
(3) . . . . .
(4) . . . . Let p be the constant. Using p as a
variable, we are going to consider all
possible constant values of p in our
argument.

25
Application of CFL Pumping Lemma Hierarchy: Proper Containment

Pumping Lemma’ claim Our claim

(3) For all strings z ∈ L such that We choose string z = apbpcp in L and
|z|≥ p, examine whether, for all strings u, v, w
(4) there exist strings u, v, w, x, y satisfying conditions (i) – (iii), it satisfies
that satisfy the following. condition (iv).
(i) z = uvw, By (i) let Let z = apbpbp = uvwxy, and
examine what kinds of symbols can be in
(ii) |uv| ≤ n,
the two pump sites v and x.
(iii) |v| ≥ 1, Recall that vwx, whose length ≤ p, can
(iv) for all i ≥ 0, uviw ∈ L. be anywhere in z.

p p p

aa . . . . . . . . .abb . . . . . . . .bcc. . . . . . . . . c
u vwx y

26
Application of CFL Pumping Lemma Hierarchy: Proper Containment

As we can see in the figure below, by condition (ii) |vwx| ≤ p, the pump sites v and
x together can contain no more than two different symbols (a’s and b’s or b’s and c’s).
By condition (iii) |vx| ≤ 1, strings v and x together should contain at least one symbol.
It follows that in uv2wx2y the numbers of a’s, b’s, and c’s cannot be the same, which
implies that uv2wx2y ∉ L. The language does not satisfy condition (iv) of the lemma
when i = 2. Hence, L is not context-free.

(i) z = uvwxy, p p p
(ii) |vwx| ≤ p,
(iii) |vx| ≥ 1, aa . . . . . . . . .abb . . . . . . . .bcc. . . . . . . . . c
(iv) for all i ≥ 0, u vwx y
uviwxiy ∈ L. |vwx| ≤ p

27
Application of CFL Pumping Lemma Hierarchy: Proper Containment

For the language {aibici | i ≥ 1 }, it was fairly straightforward to apply the lemma and
show that the language is not context-free. However, this lemma is so weak that there
are many non-context-free languages for which it is either hard or impossible to apply
it. For example, consider the following language L, which is not context-free.
Whichever string z (of length greater than or equal to p) we pick from this language, it
is impossible to pump and come to a contradiction.

L = {xaix | i > 0, x ∈ {0,1}+} ∪ {x | x ∈ {0,1}+}

p p p p
For example, suppose that we picked string z = 1 0 aaa1 0 to see if z = uvwxy can
be pumped to come up with a contradiction. For the case of v = a, w = a, and x = a, it
p p p p
satisfies that uviwxiy ∈ L, for all i ≥ 0. Even when there is one a (i.e., z = 1 0 a1 0 ),
if we set v = a, and w = x = ε , we get uviwxiy ∈ L, for all i ≥ 0. Notice that picking
a binary string for z from the second part of the language doesn’t work either. Ogden
has introduced a lemma to strengthen this weakness.

28
Hierarchy: Proper Containment
12.6 Ogden’s Lemma
Ogden’s lemma, shown below, is very similar to the original lemma except
for the additional (highlighted) parts restricting the locations of the pump sites.

(1) For every infinite CFL L,


(2) there exists a constant p such that
(3) for every string z ∈ L such that |z| ≥ p, if we mark p symbols in z
(4) there exist strings u, v, w, x and y that
satisfy the following conditions.
(i) z = uvwxy,
(ii) |vwx| ≤ p,
(iii) |vx| ≥ 1, and string vx contains at least one marked symbol,
(iv) for all i ≥ 0, uviwxiy ∈ L.

We omit the proof of this lemma. Interested readers are referred to the book
“Introduction to Formal Languages, Automata Theory, Languages and Computation”
by Hopcroft, Motwani, and Ullman (2001. Addison Wedley).

29
Hierarchy: Proper Containment
Ogden’s Lemma

Now, we can apply Ogden’s lemma and show that the following language is not
CFL, which we failed to prove with the original lemma.

L = {xaix | i > 0, x ∈ {0,1}+} ∪ {x | x ∈ {0,1}+}

Suppose L is a CFL. Then it should satisfy Ogden’s lemma. Let p be the constant.
p p p p p
Pick string z = 1 0 a1 0 , let z = uvwxy, and mark the prefix 1 . According to the
lemma, |vwx| ≤ p, and string vx should contain at least one marked symbol. It
follows that string vx can never include the symbol a in the middle. We see that
string z' = uv2wx2y ∉ L, because in z' the binary string on the left side of a must be
different from the string on the right side of a. We have a contradiction. It follows
that L is not context-free.
By the two application examples of the pumping lemmas for regular languages and
context-free languages, we have proven the relations of proper containment between
the lower two levels of Chomsky hierarchy (repeated below). We will complete the
remaining proofs of other relations in Chapter 15.

30
Review Chomsky Hierarchy
Languages (grammars) Machines Other Models

Recursively Turing Post System,


Machines
Enumerable (TM) Markov Algorithms,
Sets (type 0) µ -recursive Functions

Linear-bounded
Context-sensitive Automata .
Languages(type 1) (LBA) .
.
↓ : Containment
.
↔ : Characterization
.
Context-free Pushdown
Languages(type 2) Automata * Colored arrows indicate
(PDA)
proven relations.

Regular Finite State Regular


Languages(type3) Automata Expression
(FA)
31
Hierarchy: Proper Containment

12.7 Proof of the pumping lemma for CFL’s


In this section, we will present a proof of the lemma with a CFG to support the
argument with some illustrations. (Detailed formal proof is omitted.)
Consider the CFG below, which is in Chomsky normal form (CNF) and has no
ε -production rule. (Recall that every CFL can be generated by a CFG in CNF
with no ε -production rules, except for S → ε in the case when the language
contains ε .)
S → AB A → DE B → FD E → FC
C → DB | c F → HC D →d H →h

Let’s examine the parse tree of the grammar yielding the following string.
d(hd)3hc(d)3dchdchd

32
Proof of the CFL Pumping Lemma Hierarchy: Proper Containment

S
Notice that on the longest trunk, A
there appears a repeating D E
F B
subsequence of nonteminals, d C F
H C D
S-A-E-F-C-B-F-C-B-F-C-B-F-C h D B
D H C
(see the highlighted part). Also B F d
d D
notice that in the yield of this tree F D d H h c
C
d(hd) hc(d) dchdchd,
3 3
H C d d
the highlighted substrings that h D B h c
are produced by the repeating d
F D
nonterminals also repeat. H C d G:
Since every nonterminal has h
D B S AB A DE
two children, it can produce
d D
C DB | c B → FD
“fruits” (i.e., a terminal string) F F HC E FC
either on its left side branches or H C
d
on its right side branches. In the D d H h
h c
example, F, C and B,
respectively, produce h, d and d. yield: d(hd)3hc(d)3dchdchd

33
Proof of the CFL Pumping Lemma Hierarchy: Proper Containment

S
On the longest trunk, F-C-B A
repeats. This repeating segment D E
F B
can be inserted (or deleted), d C F
H C D
proportionally increasing (or h D D B
decreasing) the “fruits” on both B F H C d
d d D
sides of the trunk. F D H h c
C
H C d d
Notice that each repeating
h D B h c
segment F-C-B bears hb to the left
d
and d to the right. F D
H C d G:
For CFG G, it satisfies the h S AB A DE
D B
condition that for all i ≥ 0, C DB | c B → FD
d D
z= F F HC E FC
d
d(hd)ihc(d)idhcdhcd is in the H C D d H h
h c
language L(G).
z = d ( h d )3 h c (d)3 d h c d h c d
u v w x y

34
Proof of the CFL Pumping Lemma Hierarchy: Proper Containment

We can make the trunk of the suffix tree grow by inserting more repeating
subsequences of F-C-B, or cut it short by deleting them. Since the number of repeats
can be arbitrary, there can be infinite number of parse trees. Every repeating trunk
has a repeating pattern of branches to its left or right and every branch should bear
“fruit” (i.e., a terminal symbol). (Recall that there is no ε -production rule except S
→ ε .) Hence, the parse tree can yield a repeating substring either to the left (bh in
the example) or to the right (b in the example) of the trunk. For CFG G , we can
claim that for all i ≥ 0, string d(hd)ihc(d)idchdchd is a yield of a parse tree of G, i.e.,
d(hd)ihc(d)idchdchd ∈ L(G).

We can generalize the above observation as follows. Let G be a CFG in Chomsky


normal form whose language is infinite. Consider a string z in L(G) whose length is
long enough (i.e., some constant which is proportional to the size of the nonterminal
alphabet) such that the parse tree yielding z has a trunk where a nonterminal symbol
A repeats.

35
Proof of the CFL Pumping Lemma Hierarchy: Proper Containment

Let Aα Aβ be a pattern appearing on a trunk of a parse tree, where A is a


repeating nonterminal symbol, and α and β are strings of nonterminals. In
general this parse tree will have the structure shown below.
Let v and x be the yields, respectively, from the left branches and the right
branches of Aα as illustrated in the figure below. Let w be the yield
generated by the branches from the “tip” Aβ , and let u and y be,
respectively, the yields from the left subtrees and the right subtrees, if any,
of the trunk. S

A
y
u α
v A x

β
w
36
Proof of the CFL Pumping Lemma Hierarchy: Proper Containment
We see that string uvwxy ∈ L(G), and for all i ≥ 0, uviwxiy ∈ L(G), because if
Aα Aβ is a legal pattern on a trunk, then for all i ≥ 0, (Aα )iAβ is also a legal
pattern. Let VN be the nonterminal alphabet of the grammar. The length of string z
for which every parse tree has a trunk with a repeating pattern is proportional to |
VN|, i.e., for some constant p, |z| ≤ p = c|VN|. It follows that |vwx| ≤ p and |vx| ≥ 1.
S
S

A
A y
y α
u
u α v x
A
v A x
α
β v x
A
w
β
w
37
Proof of the CFL Pumping Lemma Hierarchy: Proper Containment

Summarizing the above observations, we have the following pumping lemma


for CFL’s.

The pumping lemma for context-free languages. For every infinite CFL L, there
is a constant integer p that satisfies the following conditions:

Let ∑ be the alphabet of the language. Then for every string z ∈ L whose length is
greater than or equal to p, there are u, v, w ∈ ∑* such that
• z = uvwxy, |vwx| ≤ p, |vx| ≥ 1
• for all i ≥ 0, uviwxiy ∈ L

38
Hierarchy: Proper Containment

Rumination (1): Application of the Pumping Lemma


• To prove that {aibi | i ≥ 1} is not regular, we showed that for this language the pumping lemma does not hold. Let’s review
how we dealt with the parts of the lemma quantified existentially (i.e., prefixed with “there is,” “there exist” or “there are”)
or universally (i.e., prefixed with “for all” or “for every”). To show that the lemma does not hold for the language, we must
consider all possible cases for each part existentially quantified. To contrary, for each part universally quantified, it is enough
for us to pick just one case (for our convenience) that will lead to a contradiction.
The lemma uses an existential quantification in the following two parts;
(2) There is a constant n …
(4) There are strings u, v and w . . .
So, in the proof, by using n as a variable, we took into consideration of all possible constant values of n, and examined all
possible strings u, v, and w that satisfy the conditions given in (i) – (iii) of the lemma.
In the following three parts, the lemma uses a universal quantification;
(1) For every infinite regular language,
(3) For every string z ∈ L,
(4)-(iv) For all i ≥ 0,
For part (1) we picked language {aibi | i ≥ 1} that we supposed to be a regular language, for part (3) we picked z = anbn, and
for part (4)-(iv) we picked i =2. (If we wanted, we could have chosen others than z = anbn and i = 2.)

39
Rumination (1): Application of the Pumping Lemma Hierarchy

• Here is an application example of the pumping lemma with a logical bug that we often come across while grading
homework assignments.

Question: Is the language L = {abcaiabc | i ≥ 0} regular? Justify your answer.


Answer: L is not regular. Suppose L is regular. Then L should satisfy the pumping lemma. Let n be the constant of the lemma,
and choose string z = abcanabc, which is in L. Since |z| ≥ n, string z satisfies condition (2) of the lemma. Let z = uvw, and
consider the case where u = a, v = b, and w = canabc. Then we get uv2w = abbcanabc, which is not in L. It follows that L does
not satisfy the pumping lemma. L is not regular language!!

We can show that L is regular by showing an FA which recognizes L (see the figure below). What is wrong in the above
argument? It is in the argument dealing with the existential quantification for strings u, v, and w in part (4) of the lemma. The
argument took u, v, and w only for the specific values of u = a, v = b, and w = canabc, not considering other cases satisfying
conditions (i)-(iii). We should show that uviw ∉ L, for all possible locations of the “pump site” v in the prefix of length n of
string z and for all possible lengths of v (i.e., 1≤ |v| ≤ n). Actually, for the case where v contains some a’s in the substring an, it
satisfies that for all i ≥ 0, uviw ∈ L. The proof fails.

(4) There exist strings u, v and w that satisfy the


following conditions. a b c a b c
(i) z = uvw,
(ii) |uv| ≤ n,
(iii) |v| ≥ 1,
(iv) for all i ≥ 0, uviw ∈ L. a

40
Rumination (1): Application of the Pumping Lemma Hierarchy

• The pumping lemma presents a common property of all infinite regular languages. Recall the last claim of the lemma, i.e.,
“for all i ≥ 0, uviw ∈ L”, which is based on the existence of a cyclic path in the state transition graph of every DFA which
recognizes an infinite regular language. Notice that every finite language is regular. Let L = {w1, w2, . . . , wk} be a finite
language. Then we can simply construct a regular grammar with rules S → w1 | w2 | . . .| wk. Therefore, every non-regular
language is infinite.

• We know that every infinite regular language satisfies the pumping lemma. Is the converse of this statement true? In other
words, can we say that if an infinite language satisfies the pumping lemma, then the language is regular? We didn’t prove that
every infinite language satisfies the pumping lemma. We proved it only for infinite regular languages. Actually, Appendix E
shows a non-regular CFL which satisfies the lemma. Thus, it is wrong to prove the regularity of a language by showing that it
satisfies the pumping lemma.

Application of the pumping lemma for CFL


• When we apply the pumping lemma for CFL, we should keep in mind the difference between the lemmas for regular
languages and CFL’s. Recall that for regular languages, the region x, where we can pump, is limited in the prefix of length n of z,
while for the lemma for CFL’s, the two regions v and x, where we can pump, can be anywhere in z within the window of width |
vwx| ≤ p. Thus, for CFL’s it requires more cases to analyze.

41
Hierarchy: Proper Containment

Rumination (2): Application of Ogden’s Lemma


• Often we hear that programming languages are context-free. However, they are not clean context-free languages, and carry
some context-dependency in their syntax. For example, in C it is not allowed to declare a variable more than once in a block,
and in FORTRAN there are statements requiring a matching label (e.g., 10 in the following DO-loop).

....
DO 10 I = 1, 100, 1
SUM = SUM + I
10 PRO = PRO * I
....

The formal language L below has a similar context-dependency of programming languages. String x corresponds to the label
(i.e., a string of decimal digits) and ai corresponds to the loop body. Using Ogden’s lemma we showed that L is not a context-free
language.

L = {xaix | i > 0, x ∈ {0, 1}+} ∪ {x | x ∈ {0, 1}+}

42
Hierarchy: Proper Containment
Exercises
12.1 Using the pumping lemma for regular languages, we are going to prove that L = {xxR | x ∈{a, b, c}* }is not regular as
follows. Complete the proof by answering each of the following questions.

Proof. Suppose that L is regular, and let n be the constant of the lemma. All the following strings are in L.
(a) abccba (b) a100 b100 b100 a100 (c ) an/2 bn/2 b n/2 an/2 (d) anbnbnan

Question 1: Which of the strings above are you going to pick as the string z for your proof? Why are other strings not your
choice? If you don’t like any, choose another string in L and explain why.

Question 2: Consider strings u, v, w ∈{a, b, c}* that satisfy the conditions (i) z = uvw, (ii) |uv| ≤ n, and (iii) |v| ≥ 1. What
will be in v? Briefly explain all possible contents of string v.

Question 3: To show that L is not regular, how are you going to use the last part of the lemma: “For all i ≥ 0, string uviw ∈
L.?” Write your argument.

12.2 Which of the following languages are regular? Justify your answer.
L1 = {aibjck | i, j, k > 0 } L2 = {aaaaibjci | i, j > 0 }

12.3 For the symbols a, b and c, and a string x, let #a(x), #b(x) and #c(x), respectively, be the number of a’s, b’s and c’s in
string x. Which of the following languages are context-free? Justify your answer.
L3 = { x | x ∈{a,b,c}+, and #a(x) > #b(x) }
L4 = { x | x ∈{a,b,c}+, and #a(x) = #b(x) = #c(x) }
L5 = {x | x ∈{a,b,c}+, and #a(x) < #b(x) < #c(x) }

12.4 We learned that language { w#wR | w ∈ {a, b}* } is context-free. The following languages are not context-free. Prove
why.
L6 = {w#w | w ∈ {a, b}* } L7 = {xyx | x, y ∈ {a, b}* }

43

You might also like