
Theorem: If G is in CNF, w ∈ L(G) and |w| > 0, then
any derivation of w in G has length 2|w| - 1

Proof by induction on |w|:


Base case: If |w| = 1, then any derivation of w must
have length 1 (S → a), and 2|w| - 1 = 1
Inductive step: Assume the claim holds for every string of length at
most k, where k ≥ 1, and let |w| = k+1
Since |w| > 1, the derivation starts with S → AB
So w = xy where A ⇒* x, |x| > 0 and B ⇒* y, |y| > 0
Since |x| ≤ k and |y| ≤ k, the inductive hypothesis (applied to the
derivations starting from A and B) gives them lengths 2|x|-1 and 2|y|-1,
so any derivation of w has length
1 + (2|x|-1) + (2|y|-1) = 2(|x|+|y|) - 1 = 2|w| - 1
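
The theorem can be checked by brute force on a small grammar. The sketch below is only an illustration: the CNF grammar for { a^n b^n : n ≥ 1 } and the names BINARY, TERMINAL and derivation_lengths are assumptions, not part of the slides. It collects the lengths of all derivations of a string and compares them with 2|w| - 1.

```python
from functools import lru_cache

# Hypothetical CNF grammar for { a^n b^n : n >= 1 } (not from the slides):
#   S -> AB | AC,  C -> SB,  A -> a,  B -> b
BINARY = {"S": [("A", "B"), ("A", "C")], "C": [("S", "B")]}
TERMINAL = {"A": ["a"], "B": ["b"]}

@lru_cache(maxsize=None)
def derivation_lengths(symbol, w):
    """Return the set of lengths of all derivations symbol =>* w."""
    lengths = set()
    if len(w) == 1 and w in TERMINAL.get(symbol, []):
        lengths.add(1)  # a single step: symbol -> terminal
    for left, right in BINARY.get(symbol, []):
        # Try every way to split w into two nonempty parts x and y.
        for cut in range(1, len(w)):
            for lx in derivation_lengths(left, w[:cut]):
                for ly in derivation_lengths(right, w[cut:]):
                    lengths.add(1 + lx + ly)  # 1 step for symbol -> left right
    return lengths

for w in ["ab", "aabb", "aaabbb"]:
    print(w, derivation_lengths("S", w), 2 * len(w) - 1)
```

For each string in the language the set contains the single value 2|w| - 1; for a string outside the language the set is empty.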
Lemma: If the longest path in a CNF parse tree has length n, then
the yield w satisfies |w| ≤ 2^{n-1}
Proof
• Base case: n = 1
– If the longest path is of length 1, the tree uses a single rule A → t, so |w| = 1
and 2^{1-1} = 1
• Induction
– The longest path has length n, where n > 1. The root uses a
production of the form A → BC (no terminal hangs directly off the root)
– By induction, the subtrees rooted at B and C have yields of length
at most 2^{n-2}, since one edge of every path is used to get from the root to these
subtrees
– The yield of the entire tree is the concatenation of these two
yields, so its length is at most 2^{n-2} + 2^{n-2} = 2·2^{n-2} = 2^{n-1}
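
As a quick sanity check of the bound, the following sketch builds two families of CNF-shaped trees and measures them. The tuple encoding of trees, the placeholder variable "X", and the helper names are assumptions made for this illustration only.

```python
# A parse tree is ("A", "a") for a rule A -> a,
# or ("A", left, right) for a rule A -> BC.

def longest_path(tree):
    """Number of edges on the longest root-to-terminal path."""
    if len(tree) == 2:
        return 1
    return 1 + max(longest_path(tree[1]), longest_path(tree[2]))

def yield_of(tree):
    """Concatenate the terminals at the leaves, left to right."""
    if len(tree) == 2:
        return tree[1]
    return yield_of(tree[1]) + yield_of(tree[2])

def full_tree(n):
    """Tree in which every path has length n ('X' is a placeholder variable)."""
    if n == 1:
        return ("X", "a")
    sub = full_tree(n - 1)
    return ("X", sub, sub)

def skewed_tree(n):
    """Tree with longest path n that only keeps branching on its left side."""
    if n == 1:
        return ("X", "a")
    return ("X", skewed_tree(n - 1), ("X", "a"))

for n in range(1, 6):
    for tree in (full_tree(n), skewed_tree(n)):
        assert longest_path(tree) == n
        assert len(yield_of(tree)) <= 2 ** (n - 1)
    print(n, len(yield_of(full_tree(n))), len(yield_of(skewed_tree(n))))
```

The full tree meets the bound 2^{n-1} exactly, while the skewed tree has a yield of length only n, so the lemma gives an upper bound and not an exact value.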
Pumping lemma for context-free languages
• It states that every context-free language has a
special value called the pumping length such
that all strings in the language at least that long can be
"pumped."
• Such a string can be divided into five parts so that
the second and the fourth parts may be
repeated together any number of times and the
resulting string still remains in the language.

Formal definition of the Pumping Lemma for CFL’s
Let L be a CFL. Then there exists a constant p (the pumping length)
such that if z is any string in L where |z| ≥ p, then
we can write z = uvwxy subject to the following
conditions:
1. |vwx| ≤ p.
2. vx ≠ ε: v and x cannot both be empty.
3. For all i ≥ 0, u v^i w x^i y is also in L:
v and x are pumped together.
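
The three conditions translate directly into a few lines of code. The sketch below is an illustration under stated assumptions: the example language { a^n b^n : n ≥ 1 }, the value p = 2, and the function names are all chosen here and do not come from the slides.

```python
def pump(u, v, w, x, y, i):
    """Return the string u v^i w x^i y."""
    return u + v * i + w + x * i + y

def satisfies_conditions(v, w, x, p):
    """Conditions 1 and 2: |vwx| <= p and vx is not empty."""
    return len(v + w + x) <= p and len(v + x) > 0

def in_anbn(s):
    """Membership test for the example language { a^n b^n : n >= 1 }."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n

p = 2  # assumed pumping length, chosen only for this illustration
u, v, w, x, y = "aa", "a", "", "b", "bb"  # one way to split z = "aaabbb"
assert satisfies_conditions(v, w, x, p)
for i in range(4):
    s = pump(u, v, w, x, y, i)
    print(i, s, in_anbn(s))  # condition 3: every pumped string stays in L
```

Because v and x are repeated together, the numbers of a's and b's grow in step, which is why this split keeps every pumped string in the language.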
Proof idea
Use a CNF grammar G for L. Every parse tree is then a binary tree.
Let G have m variables.
– Let the constant p = 2^m.
– Suppose a string z with |z| ≥ p is in L(G)
– By the lemma, a parse tree whose longest path has length m or less has a
yield of length 2^{m-1} or less
• Since p = 2^m, 2^{m-1} is equal to p/2.
• This means that z is too long to be the yield of a parse tree whose
longest path has length m or less.
– So any parse tree that yields z must have a path of length at least
m+1
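Parse Tree
• z = uvwxy where |z| ≥ p
[Figure: a longest path in the parse tree, running from the root A_0 through A_1, A_2, … down to A_k]
• Variables A_0, A_1, …, A_k
• If k ≥ m then the path carries at least m+1 variables, so at least two of them must be
the same, since there are only m distinct variables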
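Parse Tree (cont.)
Suppose the repeated variables are A_i = A_j,
where k-m ≤ i < j ≤ k
• A_i = A_j as variables, although we
may follow different production rules for
each occurrence
[Figure: parse tree with root A_0 and A_i above A_j on the path; the subtree at A_j yields w, the subtree at A_i yields vwx, and the whole tree yields u v w x y]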
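
The pigeonhole step can be sketched as a search over the bottom m+1 variables of the path. The function name find_repeat and the sample path are assumptions made for this illustration.

```python
def find_repeat(path, m):
    """path = [A_0, ..., A_k]; return (i, j) with k-m <= i < j <= k and
    path[i] == path[j], or None if the window holds no repeat."""
    k = len(path) - 1
    start = max(0, k - m)  # only look at the last m+1 variables
    last_seen = {}
    for j in range(start, k + 1):
        var = path[j]
        if var in last_seen:
            return last_seen[var], j
        last_seen[var] = j
    return None

# Hypothetical grammar with m = 3 variables {S, A, B} and a path with k = 4.
print(find_repeat(["S", "A", "B", "S", "A"], 3))  # (1, 4)
```

When k ≥ m the window holds m+1 variables drawn from only m distinct ones, so the search cannot come back empty.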
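Condition 2: vx ≠ ε
• The production used at A_i leads down to A_j, so it
can’t produce a terminal or there would
be no A_j.
• Therefore its body has two variables; one
of these must lead to A_j and the
other must yield part of v or part of x.
• Every variable in a CNF grammar yields a nonempty string, so v and x cannot both be
empty, but one of them might be empty.
[Figure: the same parse tree, with root A_0, A_i above A_j, and yield u v w x y]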
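Condition 1: |vwx| ≤ p
• This says the yield of the
subtree rooted at A_i has length ≤ p
• Since i ≥ k-m, the longest path in that subtree has
length at most m+1, so by the lemma it
follows that
|vwx| ≤ 2^{(m+1)-1} = 2^m = p
(A_i could be A_0, so vwx could be the
entire string z)
[Figure: the same parse tree, with root A_0, A_i above A_j, and yield u v w x y]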
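Condition 3: for all i ≥ 0, u v^i w x^i y is
also in L
• Note that A_i and A_j are the same variable
• It means that the subtrees rooted at A_i and A_j
can substitute
for each other
• Substituting the subtree at A_j for the one at A_i, the
resulting string uwy (the case i = 0) must be in L
[Figure: parse tree with root A_0 in which the subtree at A_j has replaced the subtree at A_i; the yield is u w y]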
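Condition 3: (cont.)
• Substituting the subtree at A_i for the one at A_j
• Result:
u v^1 w x^1 y (the original tree), u v^2 w x^2 y, etc.
[Figure: parse tree with root A_0 in which a copy of the subtree at A_i has replaced the subtree at A_j; the yield is u v v w x x y]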
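
The substitution argument can be played out on a concrete tree. In the sketch below the grammar (S → AB | AC, C → SB, A → a, B → b), the tuple encoding of trees, and the helper names are assumptions made for illustration. The root S plays the role of A_i and the inner S the role of A_j, so u and y are empty, v = "a", w = "ab" and x = "b".

```python
def yield_of(tree):
    """Concatenate the terminals at the leaves, left to right."""
    return tree[1] if len(tree) == 2 else yield_of(tree[1]) + yield_of(tree[2])

def replace_at(tree, path, new):
    """Copy of tree with the subtree reached by `path` (child indices 1 or 2)
    replaced by `new`."""
    if not path:
        return new
    children = list(tree)
    children[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(children)

lower = ("S", ("A", "a"), ("B", "b"))                # subtree at A_j, yield w
upper = ("S", ("A", "a"), ("C", lower, ("B", "b")))  # subtree at A_i, yield vwx
PATH = [2, 1]  # from A_i: right child C, then its left child S (= A_j)

def pumped_tree(i):
    """Tree whose yield is v^i w x^i for this example."""
    tree = lower                      # i = 0: A_j's subtree replaces A_i's
    for _ in range(i):
        tree = replace_at(upper, PATH, tree)  # wrap once more in A_i's subtree
    return tree

for i in range(4):
    print(i, yield_of(pumped_tree(i)))
```

Each pass through the loop wraps the current tree in one more copy of the A_i layer, which adds one v on the left and one x on the right of the yield.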
Example: Language L = {a^i b^i c^i : i > 0} is not
context-free
• Use the Pumping Lemma and proof by contradiction
Suppose L is context-free, with pumping length p. If a string X ∈ L and |X| ≥ p,
then X = uvwxy, where |vwx| ≤ p.
Choose i greater than p and let X = a^i b^i c^i. Then, wherever vwx
occurs in the string a^i b^i c^i, it cannot contain more than
two distinct letters: it can be all a's, all b's, all c's, or it
can be a's and b's, or it can be b's and c's.
Thus the string vx cannot contain more than two distinct
letters; but by the pumping lemma, it cannot be
empty, either, so it must contain at least one letter.

L = {a^i b^i c^i : i > 0} is not context-free (cont.)

• Since uvwxy is in L, u v^2 w x^2 y must also be in L.
Since v and x can't both be empty, |u v^2 w x^2 y| >
|uvwxy|, so we have added letters. But since vx does
not contain all three distinct letters, we cannot have
added the same number of each letter. Thus u v^2 w x^2 y
cannot be in L.
• This is a contradiction. Therefore the original
assumption, that L is context-free, must be false.
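
The argument can be confirmed by exhaustive search for one small case. The sketch below is an illustration, not a proof: p = 3 is an assumed pumping length (the real one is unknown), the string is a^4 b^4 c^4 so that i > p as in the slides, and the function names are hypothetical. It tries every split with |vwx| ≤ p and vx nonempty and keeps those that survive pumping with i = 0 and i = 2.

```python
def in_L(s):
    """Membership test for L = { a^i b^i c^i : i > 0 }."""
    n = len(s) // 3
    return n > 0 and s == "a" * n + "b" * n + "c" * n

def pumpable_splits(z, p):
    """All splits z = uvwxy with |vwx| <= p and vx nonempty for which
    u v^i w x^i y is in L for both i = 0 and i = 2."""
    found = []
    n = len(z)
    for a in range(n + 1):               # u = z[:a]
        end = min(a + p, n)              # vwx must end by here
        for b in range(a, end + 1):      # v = z[a:b]
            for c in range(b, end + 1):  # w = z[b:c]
                for d in range(c, end + 1):  # x = z[c:d], y = z[d:]
                    u, v, w, x, y = z[:a], z[a:b], z[b:c], z[c:d], z[d:]
                    if not (v or x):
                        continue         # condition 2 fails
                    if all(in_L(u + v * i + w + x * i + y) for i in (0, 2)):
                        found.append((u, v, w, x, y))
    return found

z = "a" * 4 + "b" * 4 + "c" * 4
print(pumpable_splits(z, 3))  # []
```

Testing only i = 0 and i = 2 is enough here, because a split that fails for either value already violates condition 3.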
