
COL352 Homework 1

Release date: February 16, 2021 Deadline: February 22, 2021: 23:00

Read the instructions carefully.

This homework is primarily about proving closure properties of regular languages. To prove that regular
languages are closed under some binary operation op, a straightforward way is to show how to construct,
for any two regular languages L1 and L2 , an NFA N recognizing op(L1 , L2 ) from DFAs D1 , D2 recognizing
L1 , L2 respectively. Remember to prove that N accepts a string if and only if the string belongs to op(L1 , L2 ).
You have to decide whether to define N mathematically (e.g., as we did in the proof of closure
under intersection) or to describe the construction informally in a human language (e.g., for closure under
concatenation, “Connect every accepting state of D1 to the initial state of D2 by an ε-transition”). There
is a tradeoff here. If the mathematical definition is short, clean, and intuitive, write that and avoid giving
a vague informal description. If the mathematical definition is unnecessarily complicated and it hides the
main idea, avoid it and describe your construction informally but clearly.
Let Σ and Γ be two finite alphabets. A function f : Σ∗ −→ Γ∗ is called a homomorphism if for all
x, y ∈ Σ∗ , f (x · y) = f (x) · f (y). Observe that if f is a string homomorphism, then f (ε) = ε, and the values
of f (a) for all a ∈ Σ completely determine f .
1. Let x, y, z ∈ Σ∗ . We say that z is a shuffle of x and y if the characters in x and y can be interleaved,
while maintaining their relative order within x and y, to get z. Formally, if |x| = m and |y| = n, then
|z| must be m + n, and it should be possible to partition the set {1, 2, . . . , m + n} into two increasing
sequences, i1 < i2 < · · · < im and j1 < j2 < · · · < jn , such that z[ik ] = x[k] and z[jk ] = y[k] for all k.
Given two languages L1 , L2 ⊆ Σ∗ , define
shuffle(L1 , L2 ) = {z ∈ Σ∗ | z is a shuffle of some x ∈ L1 and some y ∈ L2 }.
Prove that the class of regular languages is closed under the shuffle operation.
Solution: We prove this by constructing an NFA that recognises the shuffle of two regular languages.
Let D1 = (Q1, Σ, δ1, q1, F1) and D2 = (Q2, Σ, δ2, q2, F2) be DFAs recognising L1 and L2 respectively.
We use a product construction to build an NFA N that recognises shuffle(L1, L2), which proves that
the class of regular languages is closed under the shuffle operation.

Define N = (Q1 × Q2, Σ, ∆, (q1, q2), F1 × F2), where ∆ consists of the transitions ((s1, s2), a,
(δ1(s1, a), s2)) and ((s1, s2), a, (s1, δ2(s2, a))) for all states (s1, s2) ∈ Q1 × Q2 and all symbols a ∈ Σ.
We claim that N accepts a string if and only if the string belongs to shuffle(L1, L2).

Lemma: If a string str belongs to shuffle(L1, L2), then it is accepted by N.

Proof: Let str ∈ shuffle(L1, L2) have length n. By definition, its positions can be divided into two
disjoint increasing subsequences that spell out strings str1 ∈ L1 and str2 ∈ L2. Starting from
(s0, t0) = (q1, q2), consider the run in which (si, ti) = (δ1(si−1, a), ti−1) if the i-th character a of str
belongs to str1, and (si, ti) = (si−1, δ2(ti−1, a)) if it belongs to str2. Each step is a transition in ∆.
The first component is changed only by the characters of str1, in order, so sn = δ1∗(q1, str1) ∈ F1;
similarly, tn = δ2∗(q2, str2) ∈ F2. Hence the final state lies in F1 × F2 and N accepts str.

Lemma: If a string s is accepted by N, then it belongs to shuffle(L1, L2).

Proof: Consider an accepting run (s0, t0) = (q1, q2), (s1, t1), ..., (sn, tn) of N on s, where n = |s|.
By the definition of ∆, each step from (si−1, ti−1) to (si, ti) on the i-th character a of s either
advances the first component (si = δ1(si−1, a) and ti = ti−1) or advances the second component
(si = si−1 and ti = δ2(ti−1, a)); if both descriptions apply, pick the first. Build two strings x and y,
initially empty: going through the run from i = 1 to n, append a to x if step i advances the first
component, and to y otherwise. The first component changes only on the characters appended to x,
so δ1∗(q1, x) = sn ∈ F1 and x ∈ L1; likewise δ2∗(q2, y) = tn ∈ F2 and y ∈ L2. The positions of s
whose characters went to x and those whose characters went to y form two increasing sequences
that partition {1, ..., n}, so s is a shuffle of x ∈ L1 and y ∈ L2.

Hence, by the two lemmas above, the NFA N constructed from DFAs for the two regular languages
accepts a string if and only if it belongs to the shuffle of the two languages. It follows that the
class of regular languages is closed under the shuffle operation.
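
The product construction above can be sketched in Python. This is a minimal illustration, not part of the homework: the function name shuffle_accepts and the encoding of a DFA as a triple (delta, start, accepting) are assumptions, and for brevity a missing entry in delta stands for a move to a rejecting sink state.

```python
def shuffle_accepts(d1, d2, word):
    """Simulate the product NFA N on `word` by tracking all reachable pairs.

    Each DFA is a hypothetical triple (delta, start, accepting), where delta
    maps (state, symbol) -> state. Missing entries mean "rejecting sink".
    """
    delta1, start1, acc1 = d1
    delta2, start2, acc2 = d2
    current = {(start1, start2)}
    for a in word:
        nxt = set()
        for s1, s2 in current:
            # Either D1 consumes the symbol and D2 stays put ...
            if (s1, a) in delta1:
                nxt.add((delta1[(s1, a)], s2))
            # ... or D2 consumes it and D1 stays put.
            if (s2, a) in delta2:
                nxt.add((s1, delta2[(s2, a)]))
        current = nxt
    # Accept iff some reachable pair lies in F1 x F2.
    return any(s1 in acc1 and s2 in acc2 for s1, s2 in current)


# Hypothetical example: L1 = {"ab"}, L2 = {"c"}.
d1 = ({(0, "a"): 1, (1, "b"): 2}, 0, {2})
d2 = ({(0, "c"): 1}, 0, {1})
print(shuffle_accepts(d1, d2, "acb"))  # True: "c" is interleaved into "ab"
print(shuffle_accepts(d1, d2, "bac"))  # False: "ab" would be out of order
```

Tracking a set of pairs plays the role of the nondeterministic choice between the two kinds of transition in ∆.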

2. Let L1 be a regular language and L2 be any language (not necessarily regular) over the same alphabet Σ.
Prove that the language L = {x ∈ Σ∗ | x·y ∈ L1 for some y ∈ L2 } is regular. Do this by mathematically
defining a DFA for L starting from a DFA for L1 and the language L2 .

Solution: We mathematically define a DFA for L, starting from a DFA for L1 and the language L2.
Let DL1 = (Q, Σ, δ, q, A) be a DFA for L1. We define the DFA DL = (QL, Σ, δL, qL, AL) by
QL = Q
δL = δ
qL = q
AL = {p ∈ Q | δ∗(p, y) ∈ A for some y ∈ L2}
If we prove that DL is a DFA for the language L, then we are done.
Claim: If x ∈ L, then DFA DL accepts x.
Proof: We prove the claim by contradiction. Suppose there exists an x ∈ L that is not accepted by
DL. Let p = δ∗(q, x) be the state reached on input x; by our assumption, p is not an accepting state
of DL, that is, p ∉ AL. But since x ∈ L, there is some y ∈ L2 with x · y ∈ L1, and therefore
δ∗(p, y) = δ∗(q, x · y) ∈ A. So, by our construction, p ∈ AL, which contradicts our assumption that
DL does not accept x. Hence, ∀x ∈ L, DL accepts x.
Claim: If x is accepted by DL, then x ∈ L.
Proof: We again argue by contradiction. Suppose ∃x which is accepted by DL, but x ∉ L. Let
p = δ∗(q, x) be the accepting state that DL reaches on x. By our construction, p is an accepting
state iff ∃y ∈ L2 : δ∗(p, y) ∈ A. For that y ∈ L2 we get δ∗(q, x · y) = δ∗(p, y) ∈ A, i.e. x · y ∈ L1,
which means that x ∈ L. This contradicts our assumption that x ∉ L. Hence, if x is accepted by DL, then x ∈ L.
By the two claims above, DL is a DFA for L, so L is regular.
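
Since L2 may be an arbitrary language, the accepting set AL need not be computable in general; the proof only needs it to exist. As a rough sketch under the extra assumption that L2 is given as a finite collection of strings, the construction can be written in Python as follows. The names run, quotient_dfa and l2_sample are hypothetical.

```python
def run(delta, state, word):
    """Extended transition function delta*(state, word) of a total DFA."""
    for a in word:
        state = delta[(state, a)]
    return state


def quotient_dfa(states, delta, start, accepting, l2_sample):
    """Return D_L; only the accepting set differs from the DFA for L1.

    ASSUMPTION: l2_sample is a finite iterable standing in for L2.
    """
    new_accepting = {
        p for p in states
        if any(run(delta, p, y) in accepting for y in l2_sample)
    }
    return states, delta, start, new_accepting


# Hypothetical example: L1 = a*b over {a, b}; state 2 is a rejecting sink.
states = {0, 1, 2}
delta = {(0, "a"): 0, (0, "b"): 1,
         (1, "a"): 2, (1, "b"): 2,
         (2, "a"): 2, (2, "b"): 2}
_, _, _, acc_l = quotient_dfa(states, delta, 0, {1}, ["b", "ab"])
print(acc_l)                             # {0}, so D_L recognises a*
print(run(delta, 0, "aaa") in acc_l)     # True
```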

3. Prove that the class of regular languages is closed under inverse homomorphisms. That is, prove that if
L ⊆ Γ∗ is a regular language and f : Σ∗ −→ Γ∗ is a homomorphism, then f −1 (L) = {x ∈ Σ∗ | f (x) ∈ L}
is regular. Do this by mathematically defining a DFA for f −1 (L) starting from a DFA for L and the
function f .

Solution: From a DFA D for the language L we construct a DFA D′ that accepts f −1(L).
Let D = (Q, Γ, δ, q, A), and construct D′ = (Q, Σ, δ′, q, A) such that δ′(p, a) = δ∗(p, f(a)) for all
p ∈ Q and a ∈ Σ. If f(a) has more than one character, we follow those characters from p until we
reach the destination state; if f(a) = ε, then δ′(p, a) = p.
We show that D′ accepts a string s iff s ∈ f −1(L).

Write s = s1 s2 ... sn with each si ∈ Σ. By the definition of the transitions of D′ and induction on n,
running s on D′ reaches the same state as running f(s1)f(s2)...f(sn) on D, and by the homomorphism
property f(xy) = f(x)f(y) this string is exactly f(s).

The forward implication follows: if D′ accepts s, then D accepts f(s1)f(s2)...f(sn) = f(s), so
f(s) ∈ L and hence s ∈ f −1(L).

Similarly, for the backward implication, assume that s ∈ f −1(L). Then f(s) is accepted by D, that
is, f(s1)f(s2)...f(sn) is accepted by D. By the definition of the transitions of D′, s1 s2 ...sn is
accepted by D′, and thus D′ accepts all strings in f −1(L).
Hence our mathematically defined DFA accepts exactly the strings of f −1(L), and so the class of
regular languages is closed under inverse homomorphisms.
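
A small Python sketch of this construction may help. The dict-based encoding and the names inverse_hom_dfa and accepts are illustrative assumptions; f is given by its values on single symbols, which determine it completely.

```python
def inverse_hom_dfa(states, sigma, delta, start, accepting, f):
    """Build D' over alphabet sigma with delta'(p, a) = delta*(p, f(a)).

    `f` is a dict giving the image f(a) (a string over Gamma) of every a in sigma.
    """
    def run(state, word):
        for c in word:
            state = delta[(state, c)]
        return state

    new_delta = {(p, a): run(p, f[a]) for p in states for a in sigma}
    return states, sigma, new_delta, start, accepting


def accepts(dfa, word):
    """Run a DFA given as (states, alphabet, delta, start, accepting)."""
    _, _, delta, start, accepting = dfa
    state = start
    for a in word:
        state = delta[(state, a)]
    return state in accepting


# Hypothetical example: L = binary strings with an even number of 1s,
# f(a) = "01", f(b) = "11". Then f^-1(L) = strings with an even number of a's.
delta = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}
d_prime = inverse_hom_dfa({"e", "o"}, "ab", delta, "e", {"e"}, {"a": "01", "b": "11"})
print(accepts(d_prime, "aba"))  # True
print(accepts(d_prime, "ab"))   # False
```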

4. Prove that the class of regular languages is closed under homomorphisms. That is, prove that if L ⊆ Σ∗
is a regular language, then so is f (L) = {f (x) | x ∈ L}. Here, it is advisable to informally describe how
you will turn a DFA for L into an NFA for f (L).

Solution: As noted, the values of f(a) for all a ∈ Σ completely determine the homomorphism.
Since L is regular, there is a DFA D that recognises L. We modify D to form an automaton N
(which may be an NFA) that recognises f(L). A homomorphism maps each symbol to a string and
preserves concatenation: the image of a concatenation is the concatenation of the images. Formally,
if x = x1 x2 ...xn with each xi ∈ Σ, then f(x) = f(x1 x2 ...xn) = f(x1) · f(x2) · · · f(xn). D makes
one transition on each xi. The NFA N should instead make the corresponding move on reading
f(xi). But f(xi) might have more than one symbol, so we introduce new intermediate (dummy)
states that bridge the gap between the two endpoints of the original transition, with a fresh chain
of intermediate states for each transition of D. If f(xi) = ε, the transition becomes an ε-transition.
The accepting states as well as the starting state remain the same. Given below is an illustration of
how the intermediate states are introduced.

Thus we form an NFA N that accepts exactly f(L), which proves that the class of regular
languages is closed under homomorphism.
We note that N indeed accepts exactly f(L). For any x ∈ L, the string f(x) is accepted by N,
because the accepting run of D on x expands by construction into a run of N on f(x) that ends in
an accepting state. Conversely, for any string w accepted by N, we trace the successful run of N
back through the intermediate states to recover a string x ∈ L with f(x) = w. This is always
possible because the intermediate states can be named so that they record the corresponding
transition of D (as shown in the figure, intermediate states are named sti to indicate that they
correspond to the transition from state s to state t in D), and because the accepting states are
original states of D, so an accepting run never stops in the middle of a chain. Thus the NFA N
recognises the language f(L).
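
The informal construction can be sketched in Python as follows. This is only an illustration under stated assumptions: the NFA is encoded as a dict from (state, label) to a set of states, the empty label "" stands for ε, intermediate states are named by hypothetical tuples (s, a, k), and the original states are assumed not to be tuples of that shape.

```python
def hom_nfa(states, sigma, delta, start, accepting, f):
    """Build an NFA for f(L): thread f(a) through fresh intermediate states."""
    trans = {}
    for s in states:
        for a in sigma:
            t, image = delta[(s, a)], f[a]
            if image == "":
                # f(a) = epsilon: the transition becomes an epsilon-move.
                trans.setdefault((s, ""), set()).add(t)
                continue
            prev = s
            for k, c in enumerate(image):
                # The last symbol of f(a) lands on the original target t.
                nxt = t if k == len(image) - 1 else (s, a, k)
                trans.setdefault((prev, c), set()).add(nxt)
                prev = nxt
    return trans, start, accepting


def eps_closure(trans, current):
    """All states reachable from `current` using epsilon-moves only."""
    stack, seen = list(current), set(current)
    while stack:
        q = stack.pop()
        for r in trans.get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def nfa_accepts(nfa, word):
    trans, start, accepting = nfa
    current = eps_closure(trans, {start})
    for c in word:
        step = set()
        for q in current:
            step |= trans.get((q, c), set())
        current = eps_closure(trans, step)
    return bool(current & accepting)


# Hypothetical example: L = (ab)*, f(a) = "xy", f(b) = epsilon, so f(L) = (xy)*.
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 2,
         (1, "b"): 0, (2, "a"): 2, (2, "b"): 2}
nfa = hom_nfa({0, 1, 2}, "ab", delta, 0, {0}, {"a": "xy", "b": ""})
print(nfa_accepts(nfa, "xyxy"))  # True
print(nfa_accepts(nfa, "x"))     # False: the run stops on an intermediate state
```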

5. Prove that if L ⊆ Σ∗ is a regular language then the language L′ = {x ∈ Σ∗ | x · rev(x) ∈ L} is also
regular, where rev(x) is the reverse of string x. Here, instead of constructing an NFA for L′ directly, it
could be more convenient to use the already proven closure properties. For example, it might be better
to write L′ as a union of a finite collection of languages, and then construct an NFA for each language
in that collection.
Solution: Let D = (Q, Σ, δ, q0, F) be a DFA for the regular language L, with states q0, q1, ..., qn.
We first partition the language L′ according to the state that D reaches on x ∈ L′: each part
contains the strings whose run ends in some fixed state qi. For such a string, x takes D from q0 to
qi, while rev(x) takes D from qi to some accepting state. This gives us an idea of how to create an
NFA N for each part. We use the product construction to simulate two runs at once: one going
forward from the original starting state, and the other going backwards from the accepting states.
Formally, N = (Q × (Q ∪ {s, i}), Σ, ∆, (q0, s), {(qi, qi)}). Here s is a new state that is connected
by ε-transitions to all the states in the accepting set F, and i is a new invalid (dead) state, not to
be confused with the index of qi. Now we define the relation ∆. It contains the combined forward
and backward transitions, which are of the form ((t1, t2), a, (δ(t1, a), p)), where t1, t2 ∈ Q and p is
any state in the set X = {p ∈ Q | δ(p, a) = t2}; if X is empty, then p = i, the invalid state.
∆ also contains the invalid transitions ((t1, i), a, (δ(t1, a), i)), and the ε-transitions from state s,
namely the transitions of the form ((q0, s), ε, (q0, f)) where f ∈ F. Thus ∆ is the union of the
three sets of transitions described above. These transitions simulate the forward run of D on x
together with a backward run on x starting from an accepting state (that is, a run on rev(x)
traced in reverse), and N accepts the string only if both runs end up at qi simultaneously. This
NFA recognises only one part of the language L′. The NFAs for the other parts are obtained by
varying the accepting state over all the states of Q. Since the number of states is finite, L′ is a
finite union of regular languages and is therefore regular.
Note: We could also have taken the set of accepting states to be F′ = {(q, q) | q ∈ Q}, which
would have directly given us the required NFA for L′.
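
As a rough check of the idea, the following Python sketch simulates the variant from the Note, with accepting set {(q, q) | q ∈ Q}. It is an illustration under assumptions, not the construction verbatim: the ε-moves out of (q0, s) are folded into the initial set of pairs, and pairs that would enter the invalid state are simply dropped. The function name is hypothetical.

```python
def half_palindrome_accepts(states, delta, start, accepting, x):
    """Decide whether x . rev(x) is in L by simulating the product NFA.

    The first component runs D forward on x; the second component guesses
    a run of D backward from an accepting state.
    """
    # Epsilon-moves from (q0, s): start the backward run at any accepting state.
    current = {(start, f) for f in accepting}
    for a in x:
        nxt = set()
        for t1, t2 in current:
            for p in states:
                # Backward step: p is a predecessor of t2 on symbol a.
                if delta[(p, a)] == t2:
                    nxt.add((delta[(t1, a)], p))
        current = nxt
    # Accept iff the forward and backward runs meet in the same state.
    return any(t1 == t2 for t1, t2 in current)


# Hypothetical example: L = strings whose number of a's is 2 modulo 4.
# Then x . rev(x) is in L exactly when x has an odd number of a's.
states = {0, 1, 2, 3}
delta = {(q, c): (q + 1) % 4 if c == "a" else q for q in states for c in "ab"}
print(half_palindrome_accepts(states, delta, 0, {2}, "ab"))  # True
print(half_palindrome_accepts(states, delta, 0, {2}, "aa"))  # False
```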

6. Design an algorithm that takes as input the descriptions of two DFAs, D1 and D2 , and determines
whether they recognize the same language.

Solution: In this question we need to provide an algorithm to determine whether two DFAs D1
and D2 recognise the same language or not.

Algorithm: Let L(D1) and L(D2) be the languages of the DFAs D1 and D2, and write Xᶜ = Σ∗ \ X
for the complement of a language X. We check whether L(D1)ᶜ ∩ L(D2) and L(D1) ∩ L(D2)ᶜ are φ
or not. If both of them are φ, the two DFAs recognise the same language; otherwise they recognise
different languages. WLOG, let us look at the algorithm for checking whether L(D1)ᶜ ∩ L(D2) is φ or not.
First, we construct a DFA for L(D1)ᶜ from D1. This is easily done by converting accepting states
to non-accepting states and non-accepting states to accepting states. Let the DFA thus formed be
D1ᶜ. Now we just need to check the intersection of the languages of D1ᶜ and D2. Let
D = (Q, Σ, δ, q, A) be the DFA that accepts the intersection of the languages of D1ᶜ and D2.
Let D1ᶜ = (Q1, Σ, δ1, q1, A1) and D2 = (Q2, Σ, δ2, q2, A2).
We construct D as
Q = Q1 × Q2
q = (q1, q2)
A = A1 × A2
δ((p1, p2), a) = (δ1(p1, a), δ2(p2, a)).
Now, to check whether the language of D is φ or not, we just need to check whether there is a path
in D from the starting state to any accepting state. If there is such a path, the intersection is not φ;
otherwise the intersection is φ. We check the existence of a path by running a DFS/BFS from the
start state: if at the end of the DFS/BFS any of the accepting states has been visited, then the
intersection is not φ, else the intersection is φ.
Similarly, we check whether L(D1) ∩ L(D2)ᶜ is φ or not. If both intersections are φ, then
return true, else return false.

Proof of Correctness: Two sets S1 and S2 are equal iff S1ᶜ ∩ S2 and S1 ∩ S2ᶜ are both φ, where Sᶜ
denotes the complement of S. We have used this fact to check whether the languages are the same
or not. Correctness of the reachability part follows from the correctness of the DFS/BFS algorithm.
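
The algorithm can be sketched in Python as below. This is a minimal sketch under assumptions: each DFA is a hypothetical triple (delta, start, accepting) with a total transition dict, the state sets are passed separately so that the complement can be formed, and the product DFA is explored lazily by BFS instead of being built in full.

```python
from collections import deque


def intersection_empty(d1, d2, sigma):
    """BFS over the product DFA; True iff no state of A1 x A2 is reachable."""
    (delta1, start1, acc1), (delta2, start2, acc2) = d1, d2
    seen = {(start1, start2)}
    queue = deque(seen)
    while queue:
        p1, p2 = queue.popleft()
        if p1 in acc1 and p2 in acc2:
            return False
        for a in sigma:
            nxt = (delta1[(p1, a)], delta2[(p2, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


def complement(dfa, states):
    """Swap accepting and non-accepting states of a total DFA."""
    delta, start, acc = dfa
    return delta, start, set(states) - set(acc)


def equivalent(d1, states1, d2, states2, sigma):
    """True iff both one-sided differences of the two languages are empty."""
    return (intersection_empty(complement(d1, states1), d2, sigma)
            and intersection_empty(d1, complement(d2, states2), sigma))


# Hypothetical example over the alphabet {a}: even length, two ways, vs. length 0 mod 3.
d_even = ({(0, "a"): 1, (1, "a"): 0}, 0, {0})
d_mod4 = ({(q, "a"): (q + 1) % 4 for q in range(4)}, 0, {0, 2})
d_mod3 = ({(q, "a"): (q + 1) % 3 for q in range(3)}, 0, {0})
print(equivalent(d_even, range(2), d_mod4, range(4), "a"))  # True
print(equivalent(d_even, range(2), d_mod3, range(3), "a"))  # False
```

The BFS visits at most |Q1| · |Q2| product states, each with |Σ| outgoing transitions.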
