Closure properties of context-free languages
Theorem (Closure properties of the context-free languages)
The class of context-free languages is closed under union, concatenation, and Kleene closure, i.e.,
(i) if L1 and L2 are context-free, then L1 ∪ L2 is context-free,
(ii) if L1 and L2 are context-free, then L1 L2 is context-free,
(iii) if L is context-free, then L∗ is context-free.
Subsequently, we give a general form of the theorem and its proof in terms of substitutions of languages.

Proof. (i), (ii) For given context-free languages L1 and L2, let G1 = (N1, T1, P1, S1) and G2 = (N2, T2, P2, S2) be context-free grammars where L1 = L(G1) and L2 = L(G2).
Then the languages L1 ∪ L2 and L1 L2 are generated by the context-free grammars
(N1 ∪ N2 ∪ {S}, T1 ∪ T2, P1 ∪ P2 ∪ {S → S1 | S2}, S) and
(N1 ∪ N2 ∪ {S}, T1 ∪ T2, P1 ∪ P2 ∪ {S → S1 S2}, S),
respectively, where S is a new variable symbol and one has to assume that N1 and N2 are disjoint.
(iii) For a given context-free language L, let G = (N, T, P, S) be a context-free grammar where L = L(G).
Then the language L∗ is generated by the context-free grammar
(N ∪ {S′}, T, P ∪ {S′ → S S′ | λ}, S′)
where S′ is a new variable symbol. □
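The three grammar constructions from the proof can be tried out on small examples. The following is a minimal sketch, assuming a grammar is stored as a tuple (N, T, P, S) with rules given as pairs (left side, tuple of right-side symbols). The function names, the example grammars G1 and G2, and the bounded enumerator words_up_to are illustrative assumptions and not part of the lecture; in particular, the enumerator is a bounded search that suffices for these small grammars, not a general decision procedure.

```python
def union_grammar(g1, g2, start="S"):
    """Grammar for L(g1) ∪ L(g2): new start symbol with rules S -> S1 | S2."""
    (n1, t1, p1, s1), (n2, t2, p2, s2) = g1, g2
    assert not (n1 & n2) and start not in (n1 | n2), "variables must be disjoint"
    rules = p1 | p2 | {(start, (s1,)), (start, (s2,))}
    return (n1 | n2 | {start}, t1 | t2, rules, start)


def concat_grammar(g1, g2, start="S"):
    """Grammar for L(g1) L(g2): new start symbol with the rule S -> S1 S2."""
    (n1, t1, p1, s1), (n2, t2, p2, s2) = g1, g2
    assert not (n1 & n2) and start not in (n1 | n2), "variables must be disjoint"
    rules = p1 | p2 | {(start, (s1, s2))}
    return (n1 | n2 | {start}, t1 | t2, rules, start)


def star_grammar(g, start="S'"):
    """Grammar for L(g)*: new start symbol with rules S' -> S S' | λ."""
    n, t, p, s = g
    assert start not in n, "the new start symbol must be fresh"
    rules = p | {(start, (s, start)), (start, ())}
    return (n | {start}, t, rules, start)


def words_up_to(grammar, k, slack=2):
    """Terminal words of length <= k found by leftmost derivations.

    Assumption of this sketch: sentential forms with more than k + slack
    symbols are discarded, which is enough for the examples below.
    """
    n, _, p, s = grammar
    seen, todo, words = {(s,)}, [(s,)], set()
    while todo:
        form = todo.pop()
        idx = next((i for i, x in enumerate(form) if x in n), None)
        if idx is None:  # no variable left: the form is a terminal word
            words.add("".join(form))
            continue
        for lhs, rhs in p:
            if lhs != form[idx]:
                continue
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for x in new if x not in n)
            if terminals <= k and len(new) <= k + slack and new not in seen:
                seen.add(new)
                todo.append(new)
    return words


# Hypothetical example grammars: G1 generates {0^n 1^n : n >= 1}, G2 generates {1^n : n >= 1}.
G1 = ({"A"}, {"0", "1"}, {("A", ("0", "A", "1")), ("A", ("0", "1"))}, "A")
G2 = ({"B"}, {"1"}, {("B", ("1", "B")), ("B", ("1",))}, "B")
```

For instance, words_up_to(star_grammar(G1), 4) lists the words of length at most 4 in the Kleene closure of the first example language.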
Theorem (Closure properties of the context-free languages)
The class of context-free languages is neither closed under intersection nor under complement, i.e.,
(i) L1 and L2 context-free does not imply L1 ∩ L2 context-free,
(ii) L context-free does not imply that the complement L̄ of L is context-free.

Proof. (i) The languages
L1 = {0^m 1^m 0^n : m, n > 0} and L2 = {0^m 1^n 0^n : m, n > 0}
are both context-free but have the non-context-free intersection L1 ∩ L2 = {0^m 1^m 0^m : m > 0}.
(ii) The context-free languages are closed under union, hence closure under complement would imply closure under intersection by de Morgan's law: L1 ∩ L2 is the complement of L̄1 ∪ L̄2. □

It can be shown by specifying an appropriate grammar that the complement of the language {0^m 1^m 0^m : m > 0} is context-free.

Definition (Substitution with context-free languages)
Let T be an alphabet and for every a ∈ T let L_a be a language. Then given a word w = a1 ··· an where a1, ..., an ∈ T, n > 0, let
L_w = L_{a1···an} = L_{a1} ··· L_{an}, and L_λ = {λ},
i.e., the language L_w is obtained by concatenating languages of the form L_a according to the symbols of w.
For a language L over T, we refer to ⋃_{w∈L} L_w as the language obtained from L by substitution of L_a for a.

Theorem (Closure under substitution)
The class of context-free languages is closed under substitution by context-free languages. I.e., given context-free languages L_a for every a in some alphabet T and some context-free language L over T, the language ⋃_{w∈L} L_w is context-free.

Sketch of proof. Let a context-free grammar G = (N, T, P, S) and context-free languages L_a for every a ∈ T be given. Let L = L(G). We construct a context-free grammar (N′, T′, P′, S) that generates the language ⋃_{w∈L} L_w.
For each a ∈ T, pick a context-free grammar G_a = (N_a, T_a, P_a, S_a) where L_a = L(G_a) such that the sets N_a are mutually disjoint and are all disjoint from N. Replace in the rules of P each a ∈ T by S_a and, with P now denoting the resulting set of rules, let
P′ = P ∪ ⋃_{a∈T} P_a, N′ = N ∪ ⋃_{a∈T} N_a, and T′ = ⋃_{a∈T} T_a.
By convention L_λ = {λ}, hence the new grammar must generate λ in case G does. The latter requirement is satisfied because rules of G that may be used to generate λ remain unchanged. □

Recall that a homomorphism is a mapping h : Σ1∗ → Σ2∗ that satisfies h(uv) = h(u)h(v) and is hence already determined by the values h(a) for all a ∈ Σ1.

Corollary (Closure properties of the context-free languages)
The class of context-free languages is closed under union, concatenation, Kleene closure, and homomorphisms.

Proof. The corollary follows easily from the theorem on closure under substitution, as we show for the last two assertions. Let L be a context-free language over some alphabet Σ.
The Kleene closure of L can be expressed via the substitution of L for a in {a}∗, i.e., we let L_a = L and get L∗ = ⋃_{k≥0} L_{a^k}.
Moreover, given a homomorphism h, the language h[L] = {h(w) : w ∈ L} is equal to ⋃_{w∈L} L_w if we let L_a = {h(a)} for all a ∈ Σ. □
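The definition of substitution and the two reductions used in the proof of the corollary can be illustrated on finite languages. The following is a minimal sketch at the level of languages (sets of strings), not of grammars; the function names and the example languages are assumptions chosen for illustration only.

```python
def substitute_word(word, lang_of):
    """Concatenate the languages assigned to the symbols of word; the empty word yields {''}."""
    result = {""}
    for symbol in word:
        result = {u + v for u in result for v in lang_of[symbol]}
    return result


def substitute(language, lang_of):
    """Union of substitute_word(w, lang_of) over all w in the given finite language."""
    return set().union(*(substitute_word(w, lang_of) for w in language))


# Homomorphism as a substitution: assign to each symbol a the singleton {h(a)}.
h = {"a": "01", "b": ""}  # hypothetical homomorphism on the alphabet {a, b}
L = {"ab", "ba", "aab"}   # hypothetical finite language over {a, b}
image = {"".join(h[c] for c in w) for w in L}
via_substitution = substitute(L, {a: {h[a]} for a in h})

# Kleene closure as a substitution into {a}*, truncated here to a^0, a^1, a^2.
base = {"0", "11"}
star_prefix = substitute({"a" * k for k in range(3)}, {"a": base})
```

Here via_substitution coincides with image, and star_prefix contains exactly the concatenations of at most two words from base.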
Recall that given a homomorphism h : Σ1∗ → Σ2∗, the inverse homomorphism h⁻¹ maps any language L over Σ2 to the language
h⁻¹[L] = {w ∈ Σ1∗ : h(w) ∈ L}.

Theorem (Closure under inverse homomorphisms)
The class of context-free languages is closed under inverse homomorphisms.

Proof. Let h : Σ1∗ → Σ2∗ be a homomorphism and let L2 be a context-free language over Σ2, which is recognized by some PDA A2 = (Q2, Σ2, Γ, ∆2, s2, Z, F2) via final state.
We construct a PDA A1 that recognizes L1 = h⁻¹[L2] via final state by simulating on input w the PDA A2 on input h(w).
For the PDA A1 = (Q1, Σ1, Γ, ∆1, s1, Z, F1) meant to recognize the language L1 = h⁻¹[L2] we let
Q1 = Q2 × E where E = {v ∈ Σ2∗ : h(a) = uv for some a ∈ Σ1 and u ∈ Σ2∗},
s1 = (s2, λ), F1 = {(q, λ) : q ∈ F2}.
The transition relation ∆1 satisfies for all q ∈ Q2, a ∈ Σ1, and X ∈ Γ
((q, λ), a, X) ∆1 ((q, h(a)), X).
This way, during the simulation of A2, the next symbol a of the input can be read in order to provide the word h(a) as part of the input h(w) of the simulated computation.
Furthermore, for all q, q′ ∈ Q2, b ∈ Σ2, v ∈ Σ2∗, X ∈ Γ, and γ ∈ Γ∗ such that the states involved belong to Q1, we let
((q, v), λ, X) ∆1 ((q′, v), γ) ⇔ (q, λ, X) ∆2 (q′, γ),
((q, bv), λ, X) ∆1 ((q′, v), γ) ⇔ (q, b, X) ∆2 (q′, γ).
This way, the computation of A2 is simulated by reading the symbols of the virtual input h(w) from the second component of the current state of A1.
It can be shown by induction over the word length that for all w ∈ Σ1∗, all q ∈ Q2, and all γ ∈ Γ∗ the following are equivalent:
(i) ((s2, λ), w, Z) ⇒* ((q, λ), λ, γ) in A1,
(ii) (s2, h(w), Z) ⇒* (q, λ, γ) in A2,
hence A1 recognizes h⁻¹[L2]. □

Theorem (Closure under intersection with a regular language)
The class of context-free languages is closed under intersection with regular languages, that is, for every context-free language L and regular language R, the language L ∩ R is context-free.

Proof. Let the language L be context-free and the language R be regular, where
L = L_f(M) for some PDA M = (QM, Σ, Γ, ∆, sM, Z, FM),
R = L(A) for some DFA A = (QA, Σ, δ, sA, FA).
Then the PDA M⊗ = (QM × QA, Σ, Γ, ∆⊗, (sM, sA), Z, FM × FA) accepts L ∩ R by final state if ∆⊗ is defined such that for all q, q′ ∈ QM, p, p′ ∈ QA, a ∈ Σ, X ∈ Γ, and γ ∈ Γ∗, we let
((q, p), λ, X) ∆⊗ ((q′, p), γ) ⇔ (q, λ, X) ∆ (q′, γ),
((q, p), a, X) ∆⊗ ((q′, p′), γ) ⇔ (q, a, X) ∆ (q′, γ) and δ(p, a) = p′.
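The definition of the product transition relation can be written out directly as a set construction. The following is a minimal sketch, assuming that PDA transitions are stored as pairs ((q, a, X), (q′, γ)) with the empty string standing for a move that reads no input, and that the DFA transition function is a total dictionary. The encoding, the function name, and the toy automata are assumptions for illustration, not part of the lecture.

```python
def product_transitions(pda_delta, dfa_states, dfa_delta):
    """Transition relation of the product of a PDA and a DFA.

    pda_delta: set of pairs ((q, a, X), (q2, gamma)); a == "" encodes a move
               that reads no input symbol.
    dfa_delta: dict mapping (p, a) to the successor state (assumed total).
    """
    result = set()
    for (q, a, top), (q2, gamma) in pda_delta:
        for p in dfa_states:
            if a == "":
                # The PDA moves without reading input, so the DFA component stays put.
                result.add((((q, p), "", top), ((q2, p), gamma)))
            else:
                # Both machines read the symbol a in parallel.
                result.add((((q, p), a, top), ((q2, dfa_delta[(p, a)]), gamma)))
    return result


# Hypothetical toy automata: two PDA transitions and a DFA tracking the parity of 0s.
pda_delta = {(("s", "0", "Z"), ("s", "AZ")), (("s", "", "Z"), ("f", "Z"))}
dfa_states = {"even", "odd"}
dfa_delta = {("even", "0"): "odd", ("odd", "0"): "even"}
delta_product = product_transitions(pda_delta, dfa_states, dfa_delta)
```

Each PDA transition gives rise to one product transition per DFA state, so delta_product has four elements in this toy case.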
Proof, cont. On any given input w, the PDA M⊗ simulates the PDA M and the DFA A in parallel. The simulated computation of A proceeds only when the simulated computation of M, hence also M⊗, reads a symbol from the input.
For all w ∈ Σ∗, q ∈ QM, p ∈ QA, γ ∈ Γ∗, and n ≥ 0 the two following assertions are equivalent:
(i) ((sM, sA), w, Z) ⇒^n ((q, p), λ, γ) in M⊗,
(ii) (sM, w, Z) ⇒^n (q, λ, γ) in M and δ(sA, w) = p, where δ is extended to words in the usual way.
The equivalence holds by construction of M⊗ and can be shown by induction on n.
From the equivalence it is immediate that M⊗ accepts exactly the inputs accepted by both M and A, i.e., the inputs in L ∩ R. □

Corollary (Difference of a context-free and a regular language)
Let the language L be context-free and the language R be regular. Then the language L \ R = {w ∈ L : w ∉ R} is context-free.
Proof. We have L \ R = L ∩ R̄, where the complement R̄ of R is regular. □

Example (Another language that is not context-free)
The language L = {ww : w ∈ {0, 1}∗} is not context-free.
For a proof, consider the regular language
R = {0^t1 1^t2 0^t3 1^t4 : t1, t2, t3, t4 ≥ 0}.
Using the pumping lemma for context-free languages it can be shown that
{0^m 1^n 0^m 1^n : m, n ≥ 0} = L ∩ R
is not context-free. So if L were context-free, this would contradict the theorem on closure under intersection with a regular language. □
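The set identity used in the example, namely that intersecting the language of all words ww with the regular language R yields exactly the words 0^m 1^n 0^m 1^n, can be checked by brute force for short words. The following is a small sanity check under the assumption of a length bound MAX_LEN chosen here for illustration; it does not replace the pumping-lemma argument, which concerns the non-context-freeness of the intersection.

```python
import re
from itertools import product

MAX_LEN = 10  # assumed bound on the word length for this finite check
R = re.compile(r"0*1*0*1*")  # the regular language R from the example


def is_square(word):
    """True if word has the form ww."""
    half, rest = divmod(len(word), 2)
    return rest == 0 and word[:half] == word[half:]


# All binary words of length at most MAX_LEN.
all_words = (
    "".join(symbols)
    for length in range(MAX_LEN + 1)
    for symbols in product("01", repeat=length)
)

# Words of length at most MAX_LEN that are squares and belong to R.
intersection = {w for w in all_words if is_square(w) and R.fullmatch(w)}

# Words 0^m 1^n 0^m 1^n of length at most MAX_LEN.
expected = {
    "0" * m + "1" * n + "0" * m + "1" * n
    for m in range(MAX_LEN + 1)
    for n in range(MAX_LEN + 1)
    if 2 * (m + n) <= MAX_LEN
}
```

Comparing intersection with expected confirms the identity for all words up to the chosen bound.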
Remark (Closure under mirror languages)
The class of context-free languages is closed under taking mirror languages, i.e., if a language L is context-free, then the mirror language L^R = {w^R : w ∈ L} is context-free, too. For a proof, let the language L be context-free. Choose a context-free grammar G = (N, T, P, S) where L = L(G). Then the language L^R is generated by the context-free grammar G^R = (N, T, P^R, S) where
P^R = {X → w^R : X → w ∈ P}.
It can be shown by induction over n that for all words w ∈ (N ∪ T)∗ and all n we have S ⇒^n w in G if and only if S ⇒^n w^R in G^R.
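The construction of the mirrored grammar amounts to reversing every right-hand side. The following is a minimal sketch, assuming a grammar is stored as a tuple (N, T, P, S) with rules given as pairs (left side, tuple of right-side symbols); the function name and the example grammar are assumptions for illustration.

```python
def mirror_grammar(grammar):
    """Return the grammar whose rules are X -> reversed(w) for every rule X -> w."""
    variables, terminals, rules, start = grammar
    mirrored = {(lhs, tuple(reversed(rhs))) for lhs, rhs in rules}
    return (variables, terminals, mirrored, start)


# Hypothetical example: S -> 0 S 1 | 0 1 generates {0^n 1^n : n >= 1}.
G = ({"S"}, {"0", "1"}, {("S", ("0", "S", "1")), ("S", ("0", "1"))}, "S")
G_mirror = mirror_grammar(G)
```

In this example the mirrored rules are S → 1 S 0 | 1 0, and mirroring twice returns the original grammar.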