
For the course of “Automi e Linguaggi” 2014/2015

Chen Tiantian
18/feb/2015

Solution to the exam of 12th February 2015


• PARSER
S → aBSBa | ε          B → bBb | ε

LEMMA 1. L(B) = (bb)*


We must show:   i) L(B) ⊆ (bb)*
                ii) (bb)* ⊆ L(B)

PROOF (i) By rule induction on L(B).

The rules for the language L(B) are as follows:

(B1) ε ∈ L(B)          (B2) ∀w ∈ L(B), bwb ∈ L(B)

By rule B1 we have to show that ε ∈ (bb)*. This is obvious.

By rule B2 we have to show that bwb ∈ (bb)*, assuming w ∈ (bb)*.
Indeed bwb ∈ { by inductive hypothesis } ∈ b(bb)*b ⊆ (bb)*, because b(bb)^n b = (bb)^(n+1).

PROOF (ii) By induction on n, we show that (bb)^n ∈ L(B) for every n ≥ 0.

(Basis) n = 0. To show: ε ∈ L(B). This holds by rule B1.
(Step) Assume (bb)^n ∈ L(B). To show: (bb)^(n+1) ∈ L(B).
Indeed: (bb)^(n+1) = (bb)^n bb = b(bb)^n b ∈ { by inductive hypothesis } ∈ bL(B)b
⊆ { because B → bBb } ⊆ L(B)

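Now, in order to find L(S), let’s use the fixed point method. Each column shows the words
added by one more application of S → aBSBa:

S | ∅ | ε | a(bb)*(bb)*a | a(bb)*(a(bb)*(bb)*a)(bb)*a | ...

The fixed point should be the set of words of (a(bb)*)*a + ε with an even number of a’s.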

Automi e linguaggi !1
Let’s prove that L(S) is the set of words of (a(bb)*)*a + ε with an even number of a’s.

We must prove:  i) L(S) ⊆ (a(bb)*)*a + ε with even number of a’s

                ii) (a(bb)*)*a + ε with even number of a’s ⊆ L(S)

The rules for the language L(S) are as follows:

(S1) ε ∈ L(S)          (S2) ∀w ∈ L(S), aL(B)wL(B)a ⊆ L(S)

Now let’s define the property p(w) =: ∃k ∈ N. |w|_a = 2k, where |w|_a is the number of a’s in w.

PROOF (i) By rule induction on L(S).

By rule (S1) we have to show that:

1) ε ∈ (a(bb)*)*a + ε. This is obvious.
2) p(ε).
   p(ε) holds with k = 0, because |ε|_a = 0 = 2*0.

By rule (S2) we have to show, for every w which satisfies the inductive hypothesis, that:
1) aL(B)wL(B)a ⊆ (a(bb)*)*a + ε
   Indeed aL(B)wL(B)a ⊆ { by Lemma 1 } ⊆ a(bb)*w(bb)*a
   ⊆ { by inductive hypothesis } ⊆ a(bb)*((a(bb)*)*a + ε)(bb)*a ⊆
   { because a(bb)* ⊆ (a(bb)*)* } ⊆ (a(bb)*)*a
2) p(u) for every u ∈ a(bb)*w(bb)*a
   |u|_a = |a|_a + |(bb)*|_a + |w|_a + |(bb)*|_a + |a|_a =
   { by inductive hypothesis p(w) holds, say |w|_a = 2k_w } = 1 + 0 + 2k_w + 0 + 1 = 2k_w + 2 = 2(k_w + 1)

PROOF (ii) By induction on k, half of the number of a’s.

(Basis) k = 0. To show: ε ∈ L(S). This holds by rule S1.

(Step) Assume: every word of (a(bb)*)*a + ε with 2k a’s belongs to L(S).

To show: every word of (a(bb)*)*a with 2(k+1) a’s belongs to L(S).

OBSERVATION: for k ≥ 1, every word with 2k a’s which belongs to (a(bb)*)*a is of the form
(a(bb)*)^(2k-1) a. So a word with 2(k+1) a’s has to be of the form (a(bb)*)^(2(k+1)-1) a.

So we assume: (a(bb)*)^(2k-1) a ⊆ L(S) (for k = 0 the assumption is ε ∈ L(S)).

Let’s show that (a(bb)*)^(2(k+1)-1) a ⊆ L(S).
Indeed (a(bb)*)^(2(k+1)-1) a = (a(bb)*)^(2k+2-1) a = a(bb)*(a(bb)*)^(2k-1) a(bb)*a ⊆
{ by hypothesis } ⊆ a(bb)* L(S) (bb)*a ⊆ { because S → aBSBa and by Lemma 1 } ⊆ L(S).
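
The equality just proved can also be checked mechanically on short words. The following is only a sketch: the class and method names are hypothetical, `inS` reads the production S → aBSBa | ε directly (using L(B) = (bb)* from Lemma 1), and `inClaimed` encodes the claimed characterization with a `java.util.regex` pattern.

```java
import java.util.regex.Pattern;

class LanguageOfS {
    // Claimed characterization: (a(bb)*)*a + ε, with an even number of a's.
    static final Pattern SHAPE = Pattern.compile("((a(bb)*)*a)?");

    static boolean inClaimed(String w) {
        long as = w.chars().filter(c -> c == 'a').count();
        return SHAPE.matcher(w).matches() && as % 2 == 0;
    }

    // Direct reading of S → aBSBa | ε, with L(B) = (bb)* (Lemma 1).
    static boolean inS(String w) {
        if (w.isEmpty()) return true;
        int n = w.length();
        if (n < 2 || w.charAt(0) != 'a' || w.charAt(n - 1) != 'a') return false;
        String mid = w.substring(1, n - 1);
        // Try every even number i of leading b's and j of trailing b's.
        for (int i = 0; i <= mid.length() && mid.startsWith("b".repeat(i)); i += 2) {
            for (int j = 0; i + j <= mid.length() && mid.endsWith("b".repeat(j)); j += 2) {
                if (inS(mid.substring(i, mid.length() - j))) return true;
            }
        }
        return false;
    }

    // Compares the two definitions on every word over {a, b} up to maxLen.
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int k = 0; k < len; k++) sb.append(((bits >> k) & 1) == 0 ? 'a' : 'b');
                String w = sb.toString();
                if (inS(w) != inClaimed(w)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(10));
    }
}
```

Such a check is no substitute for the two inductions above; it only guards against slips in the characterization.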

To build an automaton for the words of (a(bb)*)*a + ε with an even number of a’s we build two
automata separately:

A1, which accepts the strings of (a+b)* with an even number of a’s

A2, which accepts (a(bb)*)*a + ε

A1 and A2 are the following: [figures: state diagrams of A1 and A2]

Now we build the automaton A3 as the product of A1 and A2:

         a      b
0A      1B      -
1B      1A     2B
2B       -     3B
3B      1A     2B
1A      1B     2A
2A       -     3A
3A      1B     2A

So we obtain A3:

By renaming the states (and adding a sink state for the missing transitions) we get:

               a    b
0A   = 0       1    7
1B   = 1       4    2
2B   = 2       7    3
3B   = 3       4    2
1A   = 4       1    5
2A   = 5       7    6
3A   = 6       1    5
sink = 7       7    7

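The parser, translated into Java, is encapsulated in the following class:

public class FiniteAutomatonParser
{
    // Row = state 0..7 (7 is the sink); column 0 = 'a', column 1 = 'b'.
    private static final int[][] transitionFunction = { { 1, 7 }, { 4, 2 },
        { 7, 3 }, { 4, 2 }, { 1, 5 }, { 7, 6 }, { 1, 5 }, { 7, 7 } };

    // Final states: 0 (= 0A, which accepts ε) and 4 (= 1A). In both the
    // number of a's read is even. State 1 (= 1B) is reached after an odd
    // number of a's, so it is not final.
    private static boolean isFinal(int state)
    {
        return (state == 0 || state == 4);
    }

    public static void main(String[] args)
    {
        // A missing argument is treated as the empty string.
        String stp = (args.length > 0) ? args[0] : "";
        int state = 0;
        for (int ps = 0; ps < stp.length(); ps++)
        {
            char c = stp.charAt(ps);
            // Any symbol other than 'a' or 'b' sends the automaton to the sink.
            state = (c == 'a' || c == 'b') ? transitionFunction[state][c - 'a'] : 7;
        }
        if (isFinal(state))
            System.out.println("Accepted by the automaton");
        else
            System.out.println("Not accepted");
    }
}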

ii) A regular expression that denotes the language L(S) can be obtained, by Kleene’s theorem,
by transforming the automaton A3 into a regular expression.

The algorithm to get a regular expression from A3 is the one from “Automata Theory and
Formal Languages”, Third Edition, by Alberto Pettorossi, on page 47.

The resulting regular expression is:

a(a + b(bb)*ba)((a + bb(bb)*a)(a + b(bb)*ba))* + ε
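
As a sanity check, the regular expression can be compared with the renamed transition table of A3 on all short words. This is a sketch under two assumptions: the final states of A3 are taken to be 0 (= 0A) and 4 (= 1A), and the expression is rewritten in `java.util.regex` syntax ("+" becomes "|", and "+ ε" becomes an optional group). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class A3Check {
    // Renamed transition table of A3: row = state, column 0 = 'a', column 1 = 'b'.
    static final int[][] DELTA = {
        { 1, 7 }, { 4, 2 }, { 7, 3 }, { 4, 2 }, { 1, 5 }, { 7, 6 }, { 1, 5 }, { 7, 7 }
    };

    // Assumption: the final states are 0 (= 0A) and 4 (= 1A).
    static boolean dfaAccepts(String w) {
        int state = 0;
        for (char c : w.toCharArray()) state = DELTA[state][c - 'a'];
        return state == 0 || state == 4;
    }

    // a(a + b(bb)*ba)((a + bb(bb)*a)(a + b(bb)*ba))* + ε in java.util.regex syntax.
    static final Pattern REGEX =
        Pattern.compile("(a(a|b(bb)*ba)((a|bb(bb)*a)(a|b(bb)*ba))*)?");

    static boolean regexAccepts(String w) {
        return REGEX.matcher(w).matches();
    }

    // Compares automaton and expression on every word over {a, b} up to maxLen.
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int k = 0; k < len; k++) sb.append(((bits >> k) & 1) == 0 ? 'a' : 'b');
                String w = sb.toString();
                if (dfaAccepts(w) != regexAccepts(w)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(10));
    }
}
```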

• Reasoning

By unfolding A in B we get the following grammar, which is equivalent to the given one:

A → BB          B → aBBa | BBb | ε

The rules for the language L(B) are as follows:

(B1) ε ∈ L(B)
(B2) ∀w1, w2 ∈ L(B), a w1 w2 a ∈ L(B)
(B3) ∀w1, w2 ∈ L(B), w1 w2 b ∈ L(B)

LEMMA 1) L(B)L(B)b* ⊆ L(B)L(B)

PROOF) By induction on n, the number of b’s, we show that L(B)L(B)b^n ⊆ L(B)L(B).

(Basis) n = 0: L(B)L(B) ⊆ L(B)L(B). This is obvious.

(Step) Assume: L(B)L(B)b^n ⊆ L(B)L(B)

To show: L(B)L(B)b^(n+1) ⊆ L(B)L(B)

L(B)L(B)b^(n+1) = L(B)L(B)b^n b ⊆ { by hypothesis } ⊆ L(B)L(B)b ⊆
⊆ { by B → ε } ⊆ L(B)L(B)L(B)b ⊆ { by B → BBb } ⊆ L(B)L(B)

LEMMA 2) (a+b)* with even number of a’s ⊆ L(B)L(B)

PROOF) By induction on k, half of the number of a’s.
(Basis) k = 0: b* ⊆ L(B) holds by Lemma 3 (below), and L(B) ⊆ L(B)L(B) because ε ∈ L(B).
(Step) Assume ∀v ∈ (a+b)*. |v|_a = 2k ⇒ v ∈ L(B)L(B)
To show: ∀w ∈ (a+b)*. |w|_a = 2(k + 1) ⇒ w ∈ L(B)L(B)
Every word w ∈ (a+b)* with 2(k + 1) a’s is of the form
b^m a v a b^n for some m, n ≥ 0 and some v with |v|_a = 2k.
b^m a v a b^n ∈ { by hypothesis } ∈ b^m a L(B)L(B) a b^n ⊆ b*aL(B)L(B)ab* ⊆
⊆ { by B → aBBa } ⊆ b*L(B)b* ⊆ { by Lemma 3 } ⊆ L(B)L(B)b* ⊆
⊆ { by Lemma 1 } ⊆ L(B)L(B)

NOTE: w can be written as b^m a v a b^n because w has at least two a’s: we take the b’s that
precede the leftmost a as b^m, the b’s that follow the rightmost a as b^n, and everything
between the leftmost and the rightmost a as v. So v contains the remaining 2k a’s, and the
b’s right after the leftmost a and right before the rightmost a are collected into v, too.

LEMMA 3) b* ⊆ L(B)
PROOF) By induction on n, we show that b^n ∈ L(B).
(Basis) n = 0: ε ∈ L(B). This holds because B → ε.
(Step) Assume: b^n ∈ L(B)
To show: b^(n+1) ∈ L(B)
b^(n+1) = b^n b ∈ { by hypothesis } ∈ L(B)b ⊆
⊆ { because B → ε } ⊆ L(B)L(B)b ⊆
⊆ { because B → BBb } ⊆ L(B)

Now we have that L(A) = L(B)L(B), and by Lemma 2 we have that

(a+b)* with even number of a’s ⊆ L(B)L(B) = L(A)
So we have that ∀w ∈ (a+b)* with even number of a’s, A →+ w
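
The statement can be tested on short words with a memoized membership check for the grammar A → BB, B → aBBa | BBb | ε. The sketch below uses hypothetical names (`inB`, `inBB`, `evenIffDerivable`). The recursion terminates because each production is applied to strictly shorter substrings.

```java
import java.util.HashMap;
import java.util.Map;

class EvenAs {
    static final Map<String, Boolean> memoB = new HashMap<>();

    // w ∈ L(B) for B → aBBa | BBb | ε
    static boolean inB(String w) {
        if (w.isEmpty()) return true;                      // B → ε
        Boolean known = memoB.get(w);
        if (known != null) return known;
        int n = w.length();
        boolean result = false;
        if (w.charAt(n - 1) == 'b') {
            result = inBB(w.substring(0, n - 1));          // B → BBb
        } else if (n >= 2 && w.charAt(0) == 'a') {
            result = inBB(w.substring(1, n - 1));          // B → aBBa
        }
        memoB.put(w, result);
        return result;
    }

    // w ∈ L(B)L(B) = L(A): try every way of cutting w in two.
    static boolean inBB(String w) {
        for (int cut = 0; cut <= w.length(); cut++) {
            if (inB(w.substring(0, cut)) && inB(w.substring(cut))) return true;
        }
        return false;
    }

    // For every word up to maxLen: derivable from A iff the number of a's is even.
    static boolean evenIffDerivable(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int k = 0; k < len; k++) sb.append(((bits >> k) & 1) == 0 ? 'a' : 'b');
                String w = sb.toString();
                boolean even = w.chars().filter(c -> c == 'a').count() % 2 == 0;
                if (inBB(w) != even) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(evenIffDerivable(8));
    }
}
```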

• LANGUAGES

Show that a* ∪ {a^m b^n | m > n ≥ 0}* is not a regular language and that it satisfies the
pumping lemma for regular languages.

OBSERVATION: {a^m b^n | m > n ≥ 0} is a context-free language which is not regular: in
order to accept it, an automaton must compare the number of a’s with the number of b’s,
and this cannot be done with a finite number of states. A PDA can do it by using its stack.
Moreover we know that the context-free languages are closed under the Kleene star
operation. In fact, given a language L produced by a context-free grammar with axiom S, the
language L* is produced by adding a new axiom S’ and the productions S’ → SS’ | ε: the
resulting grammar is still context-free.

So we have that {a^m b^n | m > n ≥ 0}* is a context-free language.

The context-free languages are closed under union, and this is a case where we have a union
between a regular language (the regular languages are a proper subset of the context-free
languages) and a context-free language.

So the language a* ∪ {a^m b^n | m > n ≥ 0}* is a context-free language.

It is not regular: its intersection with the regular language a*b* is
{a^m b^n | m > n ≥ 0} ∪ {ε}, which is not regular, while the regular languages are closed
under intersection.


The pumping lemma may still be satisfied. We use it in the form that only requires y ≠ ε
(with the additional constraint |xy| ≤ p the decompositions below would not be allowed).
Let’s see that this is the case:

for a*:

We can take p = 1: all the strings of length ≥ 1 are concatenations of one or more ‘a’.

For any w ∈ a* with |w| ≥ p we can always take:

x = ε, y = w, z = ε
Pumping on y gives the strings w^i for i ≥ 0, which are concatenations of a’s and so still
belong to a*.

for {a^m b^n | m > n ≥ 0}*:

Let’s take the pumping length p = 1.

The strings of {a^m b^n | m > n ≥ 0}* of length > 1 are concatenations of words of
{a^m b^n | m > n ≥ 0}, while the only string of length 1 is ‘a’.
So we have two cases:
1) ‘a’ satisfies the pumping lemma by picking:
x = ε, y = a, z = ε
Pumping on y produces ε, which belongs to {a^m b^n | m > n ≥ 0}* due to the Kleene
star, or concatenations of a’s, which still belong to {a^m b^n | m > n ≥ 0}*
(it’s enough to concatenate strings with m = 1 and n = 0).

2) for strings w of length > 1 we can as well take:

x = ε, y = w, z = ε
Indeed pumping on w itself generates the strings w^i for i ≥ 0, which still belong to
{a^m b^n | m > n ≥ 0}* because this language contains ε and is closed under concatenation.
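
The choice x = ε, y = w, z = ε can be tried out on short words. The sketch below assumes a direct membership test for {a^m b^n | m > n ≥ 0}* (every maximal run of b’s must be preceded by a strictly longer run of a’s). Since ‘a’ itself is in {a^m b^n | m > n ≥ 0}, this language already contains a*. The names `inStar` and `pumpsUpTo` are hypothetical.

```java
class PumpCheck {
    // Membership in {a^m b^n | m > n >= 0}*: the word is split into blocks
    // a^as b^bs, and every block needs as > bs.
    static boolean inStar(String w) {
        int i = 0, n = w.length();
        while (i < n) {
            int as = 0, bs = 0;
            while (i < n && w.charAt(i) == 'a') { as++; i++; }
            while (i < n && w.charAt(i) == 'b') { bs++; i++; }
            if (as <= bs) return false;   // also rejects symbols other than a, b
        }
        return true;
    }

    // For every word w of the language with 1 <= |w| <= maxLen, checks that
    // w^i (x = ε, y = w, z = ε) stays in the language for 0 <= i <= maxI.
    static boolean pumpsUpTo(int maxLen, int maxI) {
        for (int len = 1; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int k = 0; k < len; k++) sb.append(((bits >> k) & 1) == 0 ? 'a' : 'b');
                String w = sb.toString();
                if (!inStar(w)) continue;
                for (int i = 0; i <= maxI; i++) {
                    if (!inStar(w.repeat(i))) return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(pumpsUpTo(8, 4));
    }
}
```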
• PDA’s and Context-free Grammars

i) To prove that the set of languages accepted by deterministic PDA’s by final state properly
includes the set of languages accepted by deterministic PDA’s by empty stack, we must
prove these two points:

1) For any deterministic PDA M which accepts a language L by empty stack there
exists an equivalent deterministic PDA which accepts L by final state.

2) There exists a language L accepted by a deterministic PDA M1 by final state such
that no deterministic PDA accepts L by empty stack.

PROOF 1) Let us consider the PDA M = < Q, Σ, Γ, δ, q0, Z0, ∅ > which accepts by
empty stack and the language L(M) it accepts.
We must build a PDA M’, equivalent to M, which accepts by final state.
We take M’ to be < Q ∪ {q’0, qf}, Σ, Γ ∪ {$}, δ’, q’0, $, {qf} >
where q’0 is a new initial state, qf is a new final state and $ is a new stack
symbol.
To get the new transition function δ’, we add the following instructions to δ:

• q’0 ε $ → push Z0$ goto q0

This instruction makes M’ put the special symbol $ at the bottom of the stack, below
Z0. This symbol is necessary for the acceptance in the end.
Note that if M erases its entire stack, M’ does it too, with the exception of $, which
stays at the bottom of the stack.

• for each q ∈ Q,
q ε $ → push ε goto qf
These instructions simulate the acceptance of M by empty stack. If in a
state the stack is empty, M accepts the input string. This must happen in M’
too: whenever M has emptied its stack, M’ sees $ on top and makes a transition to its
unique final state qf, so that M’ accepts by final state. M’ is still deterministic, because
$ is a new symbol and no other instruction applies when $ is on top of the stack.

PROOF 2) Let us consider the language

L = { a^i b^i | i ≥ 1 }+

that is, the concatenations of one or more words of the form a^i b^i.
This language is accepted by a DPDA M1 by final state: M1 pushes a symbol for every a,
pops a symbol for every b, enters its final state whenever the b’s have matched the a’s of
the current block, and from the final state starts a new block when it reads an a.
Let’s assume, by contradiction, that the DPDA’s which accept by empty stack are as
powerful as the DPDA’s which accept by final state.
L contains the strings ‘ab’ and ‘abab’.
By our assumption, there must exist a DPDA M2 which accepts L by empty stack, and in
particular accepts ‘ab’ and ‘abab’.
But M2 is deterministic, so on the input ‘abab’ it behaves on the prefix ‘ab’ exactly as on
the input ‘ab’: after reading ‘ab’ its stack is empty, and thus it cannot make any move. This
implies that it cannot accept ‘abab’. We have found a language that is accepted by a DPDA
by final state but cannot be accepted by any DPDA by empty stack.

ii)

We have that { a^i b^i | i ≥ 1 } is a deterministic context-free language.
Indeed this language can be accepted by a DPDA by final state.
[figure: DPDA accepting { a^i b^i | i ≥ 1 } by final state]

Given any languages A, B, C, D we know the following facts:

1) If A is regular, then so is ¬A.
2) If B is deterministic context-free, then so is ¬B.
3) If C is regular and D is deterministic context-free, then C ∩ D is deterministic context-free.
4) {a, ab}* - { a^i b^i | i ≥ 1 } = {a, ab}* ∩ ¬{ a^i b^i | i ≥ 1 },
   because A - B = A ∩ ¬B.
5) Deterministic context-free languages are a proper subset of context-free languages.

Now the language {a, ab}* is accepted by a finite state automaton.
[figure: finite automaton accepting {a, ab}*]

So by Kleene’s theorem we have that {a, ab}* is a regular language.

Indeed by (1), (2), (3), (4) we can say that {a, ab}* - { a^i b^i | i ≥ 1 } is a deterministic
context-free language and by (5) it is also a context-free language.
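
A DPDA that accepts { a^i b^i | i ≥ 1 } by final state can be simulated in a few lines. This is only a sketch: the state numbers, the stack symbols ‘Z’ and ‘A’ and the method names are assumptions made for the illustration, not the automaton of the original figure. The method `inDifference` combines it with a regular-expression test for {a, ab}*, following fact (4).

```java
import java.util.ArrayDeque;
import java.util.Deque;

class AnBnDpda {
    // Hypothetical states: 0 = reading a's, 1 = reading b's, 2 = final.
    static boolean accepts(String w) {
        Deque<Character> stack = new ArrayDeque<>();
        stack.push('Z');                       // bottom-of-stack symbol, never popped
        int state = 0;
        for (char c : w.toCharArray()) {
            if (state == 0 && c == 'a') {
                stack.push('A');               // one A for every a
            } else if ((state == 0 || state == 1) && c == 'b' && stack.peek() == 'A') {
                stack.pop();                   // one A for every b
                state = 1;
                if (stack.peek() == 'Z') state = 2;   // ε-move to the final state
            } else {
                return false;                  // no move is defined: reject
            }
        }
        return state == 2;
    }

    // {a, ab}* - { a^i b^i | i >= 1 } = {a, ab}* ∩ ¬{ a^i b^i | i >= 1 }
    static boolean inDifference(String w) {
        return w.matches("(a|ab)*") && !accepts(w);
    }

    public static void main(String[] args) {
        System.out.println(accepts("aabb") + " " + inDifference("aab"));
    }
}
```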

• LR(1)



S → BBb          B → Bc | a

Powerset construction:



Parsing table for the LR(1) parser

         a        b        c        $         B
q0     sh 6       -        -        -       goto 1
q1     sh 7       -      sh 5       -       goto 2
q2       -      sh 3     sh 4       -         -
q3       -        -        -      accept      -
q4       -      red 1    red 1      -         -
q5     red 1      -      red 1      -         -
q6     red 2      -      red 2      -         -
q7       -      red 2    red 2      -         -

with reduce productions:
0. S → BBb
1. B → Bc
2. B → a



There are states that contain the same items and differ only in the lookahead symbols.
Precisely the pairs (4, 5) and (6, 7).
So the parsing table of the LALR(1) parser, obtained from the LR(1) parser above by
merging q4 and q5 into q45 and q6 and q7 into q67, is:

         a        b        c        $         B
q0     sh 67      -        -        -       goto 1
q1     sh 67      -      sh 45      -       goto 2
q2       -      sh 3     sh 45      -         -
q3       -        -        -      accept      -
q45    red 1    red 1    red 1      -         -
q67    red 2    red 2    red 2      -         -

ii) A grammar which is LR(1) and not LL(1) is:

S → A | Aa          A → b | ε

It is not LL(1), as we show below by building the corresponding parsing table,
because it produces a conflict on at least one entry.

First1(A) = { b, ε }          Follow1(S) = { $ }
First1(Aa) = { a, b }         Follow1(A) = { a, $ }
First1(b) = { b }
First1(ε) = { ε }

         a            b                   $
S     S → Aa      S → A, S → Aa        S → A
A     A → ε       A → b                A → ε

The entry (S, b) contains two productions, so the grammar is not LL(1).

This grammar produces only the strings { ba, b, ε, a }. Each of these strings has
exactly one leftmost derivation:

S → A → b          S → A → ε          S → Aa → ba          S → Aa → a

So this grammar is unambiguous, and it is LR(1).



We have built the shift/reduce LR(1) parser for the grammar G and found that
there isn’t any conflict in its parsing table. This is enough to say that G is LR(1).

On the other hand, G has the nonterminal B, which is left recursive and has more than one
production: B → Bc | a.
This is not allowed in an LL(1) grammar, because it leads to a conflict in the parsing
table, as shown below.

S → BBb          B → Bc | a

First1(BBb) = { a }
First1(Bc) = { a }
First1(a) = { a }

         a                  b      $
S     S → BBb               -      -
B     B → Bc, B → a         -      -

We have a conflict in (B, a) ⇒ not LL(1).



• DECIDABILITY

Answer to (i):

a) Recursively enumerable languages are closed under intersection. This fact can be
shown as follows: given two R.E. languages, in this case L1 and L2, there
exist two Turing machines M1 and M2 such that:

• L(M1) = L1
• L(M2) = L2.

We build a machine M3 which runs M1 and M2 in parallel on the same input and puts the
respective results in AND. If one of M1 or M2 loops, M3 loops. M3 is the Turing machine
which accepts the language L3 = L1 ∩ L2, because we have that:

1) if M1 accepts and M2 accepts → M3 accepts
2) if M1 accepts and M2 loops → M3 loops
3) if M1 loops and M2 accepts → M3 loops
4) if M1 loops and M2 loops → M3 loops

So M3 accepts iff both M1 and M2 accept, and loops iff at least one of M1 and M2 loops.
This implies that L3 is still an R.E. language.



b) The language { a^i b^i c^i | i ≥ 0 } is the union of:

• { a^i b^i c^i | i > 0 }, let’s call it L4, which is known to be a context-sensitive language.
• { ε }, whose membership problem is simply: accept if the input is empty, reject otherwise.
  Hence it is a decidable language.

L4 is generated by the grammar G = <VT, VN, P, S> with the following productions in P:

S → aSBC        bB → bb
S → aBC         bC → bc
CB → BC         cC → cc
aB → ab

with VT = { a, b, c }, VN = { B, C, S } and axiom S.

The language L(G) generated by G is a recursive language. In order to prove this, we need to
show an algorithm which always halts and tells us, given any word w ∈ VT*, whether
w ∈ L(G) or not.
We can build a directed graph whose nodes are labeled by the strings s in (VT ∪ VN)* such that
|s| ≤ |w|. Longer strings are not needed, because no production of G shortens the string it
is applied to. The nodes are finitely many because VT and VN are
finite sets. Now for every pair of nodes s1 and s2 of this graph, we add an arc from
s1 to s2 iff we can derive s2 from s1 by applying a single production of G exactly once.
The initial node is labeled S and the final node is labeled w.
The final step of this algorithm consists in applying a reachability algorithm to determine
whether or not there is a path from S to w. If it exists, w belongs to L(G), otherwise it doesn’t
belong to L(G).
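
The algorithm above can be sketched as a breadth-first search which builds only the nodes reachable from S instead of the whole graph. The class and method names are hypothetical; the productions are those of G, with uppercase letters for the nonterminals.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

class CsgMembership {
    // Productions of G as (left-hand side, right-hand side) pairs.
    static final String[][] P = {
        { "S", "aSBC" }, { "S", "aBC" }, { "CB", "BC" }, { "aB", "ab" },
        { "bB", "bb" }, { "bC", "bc" }, { "cC", "cc" }
    };

    // Decides w ∈ L(G) by reachability from S among the strings of length <= |w|.
    static boolean derives(String w) {
        Set<String> seen = new HashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        seen.add("S");
        queue.add("S");
        while (!queue.isEmpty()) {
            String s = queue.remove();
            if (s.equals(w)) return true;
            for (String[] p : P) {
                // Apply the production at every position where its left-hand side occurs.
                for (int i = s.indexOf(p[0]); i >= 0; i = s.indexOf(p[0], i + 1)) {
                    String t = s.substring(0, i) + p[1] + s.substring(i + p[0].length());
                    // No production shortens a string, so longer strings are discarded.
                    if (t.length() <= w.length() && seen.add(t)) queue.add(t);
                }
            }
        }
        return false;   // finitely many nodes, so the search always halts
    }

    public static void main(String[] args) {
        System.out.println(derives("aabbcc") + " " + derives("aabbc"));
    }
}
```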

So there exists a Turing machine, call it M4, which implements the above algorithm,
answers the membership problem of the language { a^i b^i c^i | i > 0 } ∪ { ε }, always halts and:

1) says “yes” iff (w ∈ L4 ∨ w = ε)
2) says “no” iff (w ∉ L4 ∧ w ≠ ε)

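Now, from the previous points (a) and (b), we can build another Turing machine, call it M5,
which includes M3 and M4 as its subroutines and is built this way: it first runs M4, which
always halts, and accepts if M4 accepts; otherwise it runs M3 and accepts if M3 accepts.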



Given an instance w in input, we have that:

1) if M3 accepts and M4 accepts → M5 accepts
2) if M3 accepts and M4 rejects → M5 accepts
3) if M3 loops and M4 accepts → M5 accepts
4) if M3 loops and M4 rejects → M5 loops

Indeed the resulting Turing machine M5 recognizes an R.E. language, because it accepts if and
only if the given instance belongs to the language, and loops otherwise.

The infinite set of the instances of the problem is {a, b}* from L1, L2, in union with {a, b, c}*
from L3. So we have {a, b}* ∪ {a, b, c}* = { because {a, b}* ⊆ {a, b, c}* } = {a, b, c}*

Answer to (ii):

Let’s consider the Halting problem and the language associated with it

Halt = { <M> | M halts on a given w }

This problem is known to be undecidable and semidecidable.

Let’s consider another problem:

Haltall = { <M> | ∀w ∈ Σ*. M halts on w }

Haltall is known to be not semidecidable.

So let’s pick

• P = Haltall
• Q = Halt

P is a proper subset of Q: if a Turing machine M halts on every input, it also halts
on the given w, so the coding <M> of M is an element of both P and Q.

On the contrary, there exists a Turing machine M which halts on the given w but
does not halt on every input. Its coding <M> is an element of Q but not of P.



• CORRECTNESS

{ N ≥ 0 }

n = N; res = 1;
while ( n > 1 ) do
    res = res * ( n^2 - n ); n = n - 2;
od

{ res = N! }

Let’s try to run the program with two given numbers 5 and 6:

for N = 5:

before while) n = 5 res = 1


after 1st loop) n = 3 res = 1*20 = 1*5*4
after 2nd loop) n = 1 res = 1*20*6 = 1*5*4*3*2

for N = 6:

before while) n=6 res = 1


after 1st loop) n=4 res = 1*30 = 1*6*5
after 2nd loop) n=2 res = 1*30*12 = 1*6*5*4*3
after 3rd loop) n=0 res = 1*30*12*2 = 1*6*5*4*3*2

From these runs, a suitable invariant of the loop is:

I = { 0 ≤ n ≤ N ∧ res*n! = N! }
Let’s prove the correctness of the given program.
To do so we need to show that these implications hold:

i) Precondition ⇒ I [ N/n, 1/res ] (Upon initialization)

ii) I ∧ condition of the while ⇒ I [ res * (n^2 - n)/res, n - 2/n ] (During loop)

iii) I ∧ ¬ condition of the while ⇒ Postcondition (Upon termination)

PROOF i) N ≥ 0 ⇒ 0 ≤ N ≤ N ∧ 1*N! = N!   This holds because:

0 ≤ N ≤ N is true (given N ≥ 0)
1*N! = N! is true



PROOF ii) (0 ≤ n ≤ N ∧ res*n! = N! ∧ n > 1) ⇒ (0 ≤ n - 2 ≤ N ∧ res*(n^2 - n)*(n-2)! = N!)
0 ≤ n ≤ N ∧ n > 1 implies 2 ≤ n ≤ N

ii.1) 2 ≤ n ≤ N ⇒ 0 ≤ n - 2 ≤ N
this holds because n ≥ 2 ⇒ n - 2 ≥ 0 (they are equivalent)
and n ≤ N ⇒ n - 2 ≤ N
(n - 2 is smaller than n, so it is still not greater than N)

ii.2) res*n! = N! ⇒ res*(n^2 - n)*(n-2)! = N!

res*(n^2 - n)*(n-2)! = res*n(n-1)(n-2)! =
= res*n! = N! (because n(n-1)(n-2)! = n! for n ≥ 2)

So the implication reduces to res*n! = N! ⇒ res*n! = N!, which is trivially true.

PROOF iii) (0 ≤ n ≤ N ∧ res*n! = N! ∧ n ≤ 1) ⇒ (res = N!)

Now we have that:

0 ≤ n ≤ N ∧ n ≤ 1 implies that n = 0 ∨ n = 1

So we have two cases:

iii.1) n = 0:
res*0! = N! ⇒ res = N! holds because 0! = 1

iii.2) n = 1:
res*1! = N! ⇒ res = N! holds because 1! = 1

For the termination of the program: [figure: flowchart of the program]

The termination is assured because n decreases by 2 at every iteration of the loop,
so it will surely reach a value n ≤ 1, exiting the loop.
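
The program and its invariant can also be exercised directly. The following sketch, with hypothetical class and method names, runs the loop with `long` arithmetic (exact up to N = 20) and checks I = { 0 ≤ n ≤ N ∧ res*n! = N! } before every test of the loop condition.

```java
class FactorialLoop {
    static long factorial(long k) {
        long f = 1;
        for (long i = 2; i <= k; i++) f *= i;
        return f;
    }

    // The invariant I = { 0 <= n <= N  and  res * n! = N! }.
    static void checkInvariant(long n, long res, long N) {
        if (!(0 <= n && n <= N && res * factorial(n) == factorial(N))) {
            throw new IllegalStateException("invariant violated at n = " + n);
        }
    }

    // Runs the program for a given N (0 <= N <= 20) and returns res.
    static long run(long N) {
        long n = N, res = 1;
        checkInvariant(n, res, N);          // upon initialization
        while (n > 1) {
            res = res * (n * n - n);        // n^2 - n = n(n-1)
            n = n - 2;
            checkInvariant(n, res, N);      // after every iteration
        }
        return res;                         // here n is 0 or 1, so res = N!
    }

    public static void main(String[] args) {
        for (long N = 0; N <= 20; N++) {
            System.out.println(N + "! = " + run(N));
        }
    }
}
```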
