
Solution Set 1

CS 475 Fall 2006


Problem 1 Let $L$ be an arbitrary language (i.e., not necessarily regular). Let $C_0 = L$ and define languages $S_i$ and $C_i$ for all $i \geq 1$ by $S_i = C_{i-1}^+$ and $C_i = \overline{S_i}$.

1. Does $S_1$ always equal, never equal, or sometimes equal $C_2$? Justify your answer.

Solution: First let $L = C_0 = \Sigma^+$. Then $S_1 = L^+ = \Sigma^+ = L$. Therefore $C_1 = \overline{S_1} = \{\epsilon\}$. Now $S_2 = C_1^+ = \{\epsilon\}^+ = \{\epsilon\}$. Consequently $C_2 = \overline{S_2} = \Sigma^+$. Thus $S_1 = C_2$.

Now fix the alphabet $\Sigma = \{a, b\}$ and let $L = C_0 = \{ab\}$. Then $S_1 = L^+$ is the set of all strings consisting of one or more repetitions of $ab$, and $C_1 = \overline{S_1}$ is the set of all strings not of this form. In particular, $C_1$ includes the single-symbol strings $a$ and $b$, as well as $\epsilon$. Thus $S_2 = C_1^+$ contains every string made of $a$ and $b$; in other words, $S_2 = \Sigma^*$. Hence $C_2 = \overline{S_2} = \emptyset$, and clearly $S_1 \neq C_2$.

Thus $S_1$ and $C_2$ may or may not be equal.

2. Prove $S_2 = C_3$, no matter what $L$ is. Hint: Prove that $C_2$ is closed under concatenation, i.e., if $x, y \in C_2$ then $xy \in C_2$.
Solution: We will prove that $C_2$ is closed under concatenation. Once this is proved, $C_2 = C_2^+$. Thus $S_2 = \overline{C_2} = \overline{C_2^+} = \overline{S_3} = C_3$ and we will be done.
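Before the proof, both parts of Problem 1 can be sanity-checked on strings of bounded length. The following is a minimal sketch, not part of the original solution: the helper names `all_strings`, `plus` and `chain` are hypothetical, and the length bound `n` is an arbitrary choice. Truncating to length at most `n` loses nothing, because every factor of a string of length at most `n` also has length at most `n`, and complement is taken within the same bounded universe.

```python
from itertools import product

def all_strings(alphabet, n):
    """All strings over `alphabet` of length at most n, including the empty string."""
    return {"".join(p) for k in range(n + 1) for p in product(alphabet, repeat=k)}

def plus(lang, n):
    """Strings of length <= n in lang^+ (concatenations of one or more words of lang)."""
    lang = {w for w in lang if len(w) <= n}
    result = set(lang)
    frontier = set(lang)
    while frontier:
        # Extend the newest strings by one more word, keeping only short results.
        new = {u + v for u in frontier for v in lang if len(u + v) <= n} - result
        result |= new
        frontier = new
    return result

def chain(L, alphabet, n, depth):
    """Return lists S, C where S[i], C[i] are S_i, C_i restricted to length <= n.

    S[0] is unused (None); C[0] is L itself, restricted to length <= n.
    """
    universe = all_strings(alphabet, n)
    C = [set(L) & universe]
    S = [None]
    for i in range(1, depth + 1):
        S.append(plus(C[i - 1], n))   # S_i = (C_{i-1})^+
        C.append(universe - S[i])     # C_i = complement of S_i
    return S, C

if __name__ == "__main__":
    n = 5
    # Part 1: the two example languages from the solution.
    for name, L in [("Sigma^+", all_strings("ab", n) - {""}), ("{ab}", {"ab"})]:
        S, C = chain(L, "ab", n, 3)
        print(name, "S1 == C2:", S[1] == C[2], "| S2 == C3:", S[2] == C[3])
```

A check like this is no substitute for the proof, but it can catch a misread definition quickly.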

Now, to prove that $C_2$ is closed under concatenation, suppose $x$ and $y$ are two strings in $C_2$. We wish to prove that $xy \in C_2$ as well. From the definitions,

$$C_2 = \overline{S_2} = \overline{C_1^+} = \overline{\left(\overline{L^+}\right)^+}.$$

Thus $C_2$ is the set of all strings that cannot be written as the concatenation of a number of words, none of which is a concatenation of words in $L$ (that is, none of which belongs to $L^+$). Equivalently, however we break a word $x \in C_2$ into $x_1 x_2 \ldots x_k$, at least one of $x_1, x_2, \ldots, x_k$ belongs to $L^+$.

Now suppose we can break $w = xy$ into $w_1 \ldots w_l$ with none of $w_1, \ldots, w_l$ belonging to $L^+$. Since $x \in C_2$, for no $1 \leq i \leq l$ can $x = w_1 \ldots w_i$, since otherwise $w_j \in L^+$ for some $1 \leq j \leq i$. Thus for some $1 \leq i \leq l$, $w_i$ can be written as $w_i = uv$ such that $x = w_1 \ldots w_{i-1} u$ and $y = v w_{i+1} \ldots w_l$. Since $x \in C_2$ and $w_1, \ldots, w_{i-1} \notin L^+$, by our equivalent definition of $C_2$, $u \in L^+$. Similarly, since $w_{i+1}, \ldots, w_l \notin L^+$, it can only be the case that $v \in L^+$. But if $u \in L^+$ and $v \in L^+$ we have $w_i = uv \in L^+$, contradicting our assumption on the $w_i$'s. Therefore, $w = xy \in C_2$.

Problem 2 Let $L = \{w \mid w$ contains an equal number of occurrences of the substrings 01 and 10$\}$. This means that $101 \in L$ because 101 contains a single occurrence of 01 and a single occurrence of 10. On the other hand, $1010 \notin L$ as it has two 10s and one 01. Construct a finite automaton recognizing $L$.

Solution: At first glance $L$ does not seem to be regular, since to recognize it we appear to have to keep track of two unbounded quantities (the number of occurrences of 01 and the number of occurrences of 10), which can take an arbitrarily large amount of memory, while a DFA always has a finite (constant) amount of memory. The crucial observation is that the occurrences of 01 and 10 alternate in each string of 0s and 1s. To be more precise, define a run of 0s in a string $w \in \{0, 1\}^*$ to be a maximal substring of $w$ consisting of consecutive 0s, and similarly define a run of 1s. A 01 occurs only at the end of a run of 0s and a 10 occurs only at the end of a run of 1s. Since runs of 0s and 1s alternate in every string of 0s and 1s, a nonempty string $w \in \{0, 1\}^*$ has an equal number of 01s and 10s if and only if $w$ starts and ends with the same symbol. (The empty string contains neither substring, so it is also in $L$.) This is easily checked by the following NFA.
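As a sanity check on this characterization, the sketch below compares the definition of $L$ (counting occurrences of 01 and 10) against the "same first and last symbol" test and against a small automaton. The automaton here is a five-state DFA written for illustration, with hypothetical state names that record the first and last symbol read; it is an assumption of this sketch and not necessarily the NFA from the original figure.

```python
from itertools import product

def count_occurrences(w, pattern):
    """Number of (possibly overlapping) occurrences of `pattern` in `w`."""
    k = len(pattern)
    return sum(1 for i in range(len(w) - k + 1) if w[i:i + k] == pattern)

def in_L_by_definition(w):
    """Membership in L straight from the problem statement."""
    return count_occurrences(w, "01") == count_occurrences(w, "10")

def in_L_by_endpoints(w):
    """Membership via the observation: empty, or first symbol equals last symbol."""
    return w == "" or w[0] == w[-1]

# Illustrative DFA: state "xy" means "first symbol was x, last symbol read was y".
DELTA = {
    ("start", "0"): "00", ("start", "1"): "11",
    ("00", "0"): "00", ("00", "1"): "01",
    ("01", "0"): "00", ("01", "1"): "01",
    ("11", "1"): "11", ("11", "0"): "10",
    ("10", "1"): "11", ("10", "0"): "10",
}
ACCEPTING = {"start", "00", "11"}

def dfa_accepts(w):
    """Run the DFA above on w and report whether it ends in an accepting state."""
    state = "start"
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING

if __name__ == "__main__":
    # Exhaustively compare the three tests on all binary strings up to length 10.
    for n in range(11):
        for p in product("01", repeat=n):
            w = "".join(p)
            assert in_L_by_definition(w) == in_L_by_endpoints(w) == dfa_accepts(w)
    print("all three membership tests agree up to length 10")
```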

Problem 3 Let $L_1, L_2, L_3$ be some languages over an alphabet $\Sigma$. Define

$$\mathrm{Vote}(L_1, L_2, L_3) = \{w \mid w \text{ is in at least two of } L_1, L_2, L_3\}.$$

Prove that $\mathrm{Vote}(L_1, L_2, L_3)$ is regular if $L_1, L_2, L_3$ are regular. In particular, if machines $M_1, M_2, M_3$ recognize $L_1, L_2, L_3$ respectively, then construct a machine $M$ to recognize $\mathrm{Vote}(L_1, L_2, L_3)$. Prove your construction to be correct.

Solution: The construction proceeds similarly to that for the intersection of two DFAs. In this case, we construct a DFA $M$ that simulates all three DFAs $M_1$, $M_2$ and $M_3$ simultaneously. The states of $M$ are 3-tuples, each component representing the state of the corresponding original DFA, i.e. the set of states $Q$ of $M$ is the Cartesian product $Q_1 \times Q_2 \times Q_3$. The start state is the triple of the original start states, and the transition function acts componentwise: $\delta((q_1, q_2, q_3), a) = (\delta_1(q_1, a), \delta_2(q_2, a), \delta_3(q_3, a))$. The only difference from the intersection construction is the definition of the accepting states of $M$. These are the states where at least two of the components represent accepting states of the original automata. In other words, the set of accepting states $F$ of $M$ is

$$F = \{(q_1, q_2, q_3) : (q_1 \in F_1 \wedge q_2 \in F_2) \vee (q_1 \in F_1 \wedge q_3 \in F_3) \vee (q_2 \in F_2 \wedge q_3 \in F_3)\}.$$

For correctness, a straightforward induction on $|w|$ shows that after reading $w$, $M$ is in the state $(q_1, q_2, q_3)$ where $q_j$ is the state $M_j$ reaches on $w$. The proof then follows from the observation that $M$ accepts a word $w$ iff at least two of the original DFAs would have accepted $w$.

Problem 4 (For 4 hour graduate students only) A set $X$ of natural numbers is said to be linear if for some $a, b \in \mathbb{N}$, $X = \{a + bi \mid i \geq 0\}$. Thus, linear sets are just those consisting of all the elements of some arithmetic progression. A set $S$ is semi-linear if it is the union of a finite number of linear sets.

Consider a language $L \subseteq \{0\}^*$, i.e., containing strings over a one-element alphabet. Show that $L$ is regular if and only if $L = \{0^x \mid x \in S\}$ for some semi-linear set $S$.

Aside: Any string over $\{0, 1, \ldots, k-1\}$ can be thought of as representing a natural number in base $k$. The exercise asks you to prove that a language over a unary alphabet is regular exactly when the set of strings represents a semi-linear set. One direction of this observation holds for other bases as well: if a set of natural numbers is semi-linear, then the language of its base-$k$ representations over $\{0, 1, \ldots, k-1\}$ is regular. The converse can fail for $k \geq 2$; for instance, the powers of 2 are not semi-linear, yet their binary representations $10^*$ form a regular language. (Enthusiastic students are encouraged to think about proving this extension; please do not turn in solutions though.) This is further evidence of the naturalness of the class of regular languages.

Solution: First we shall show that such languages are regular, then that they are the only regular languages over $\{0\}$. The case when $S$ is a finite set (i.e. when $S$ is a finite union of linear sets $\{a + bi \mid i \geq 0\}$ with $b = 0$) is not interesting, since every finite language is regular. When $S$ is infinite and is a finite union of linear sets $S_1, S_2, \ldots, S_n$, we show how to construct an NFA that recognizes $L$. We first construct DFAs that recognize each of the linear sets $S_1, S_2, \ldots, S_n$ and then construct an NFA that recognizes their union.

(a) $S_j = \{a_j + i b_j : i \geq 0\}$: This set (language) is recognized by a DFA that has an initial segment (a tail) starting from the initial state and consisting of a sequence of $a_j$ edges, followed by a loop consisting of $b_j$ edges. The single accepting state is the state that is common to the initial tail and the loop. The words accepted by the DFA are those that have $a_j$ characters at the beginning followed by zero or more turns around the loop, each turn appending $b_j$ characters. Therefore, the lengths of the words accepted are exactly the elements of $\{a_j + i b_j : i \geq 0\}$.

(b) Take the $n$ DFAs constructed above and add an extra state that will be the new initial state.
Add $\epsilon$-transitions from this new initial state to the initial states of the individual DFAs (which are no longer initial states). The set of accepting states of the new machine is the union of the sets of accepting states of the individual DFAs. Note that the new machine is non-deterministic (because of the $\epsilon$-transitions) and finite (since we take the union of a finite number of finite automata). The proof proceeds similarly to the proof of the closure of regular languages under (finite) union. The newly constructed NFA accepts a word iff at least one of the original DFAs would have accepted that word.

Now, we shall prove that $L$ is regular only if $L = \{0^x : x \in S\}$ for some semi-linear set $S$, i.e. that every regular language over a unary alphabet corresponds to a semi-linear set. Since the alphabet of the DFA for $L$ consists of a single character, every state of the DFA has a unique next state. Therefore, the graph of the DFA consists of a set of paths from the start state to each of the accepting states. There may be one or more loops attached to each path. We can separate the paths that share some segment by duplicating the shared portion. We can separate paths that have more than one loop into multiple paths, each of which has a single loop attached.

Also, for any path that has a loop, we can merge the segment of the path trailing after the loop and leading to an accepting state into the leading segment that leads from the start state to the beginning of the loop. After the above modifications, the DFA looks like a union of paths, each with a leading sequence of edges and ending in a loop. There may be one or more accepting states anywhere on each of the paths.

Note that the above construction does not change the language accepted by the DFA. Since the alphabet is unary, only the lengths of the strings in the language are important; the particular pattern of characters is irrelevant. The modification above does not affect the lengths of the paths from the start state to an accepting state, although it might change the sequence of edges taken.

Consider one of the paths above, with a leading sequence of edges followed by a loop of length $b_j$. If the $i$th state in the leading sequence is an accepting state, then the DFA accepts the string $0^i$. If one of the states in the loop is an accepting state and it is first reached from the start state after $a_j$ steps, then every string whose length is $a_j$ plus a multiple of $b_j$ will be accepted, because it leads from the start state to that accepting state after going round the loop of length $b_j$ zero or more times. In the first case, the language accepted is $\{0^i\}$, which corresponds to the (finite) linear set $\{i\}$, i.e. $\{a + bk \mid k \geq 0\}$ with $a = i$ and $b = 0$. In the second case, the language accepted corresponds to the (infinite) set $\{a_j + i b_j : i \geq 0\}$. The set of strings accepted by the DFA is the (finite) union of all strings accepted in either of these two ways.
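Both directions of Problem 4 can be illustrated with a short sketch. It rests on assumptions that are not in the original solution: a unary DFA is represented as a successor list `succ` (state `q` goes to `succ[q]` on the symbol 0) with start state 0, a string $0^x$ is represented by its length `x`, and all function names are hypothetical. The union NFA of step (b) is simulated by running every component DFA, which is what the $\epsilon$-branch from the new start state amounts to. The reverse direction uses the fact that the states reachable from the start state of a unary DFA form a tail followed by a single loop.

```python
def linear_dfa(a, b):
    """Unary DFA (succ, accepting) for {0^(a + b*i) : i >= 0}."""
    if b == 0:
        # Tail of a edges into the accepting state a, then a dead sink a + 1.
        return list(range(1, a + 2)) + [a + 1], {a}
    # Tail of a edges, then a loop of b edges that returns to state a.
    return list(range(1, a + b)) + [a], {a}

def accepts(succ, accepting, x):
    """Does the unary DFA accept the string 0^x?"""
    state = 0
    for _ in range(x):
        state = succ[state]
    return state in accepting

def union_accepts(dfas, x):
    """Simulate the epsilon-branching NFA: accept iff some component DFA accepts."""
    return any(accepts(succ, accepting, x) for succ, accepting in dfas)

def semilinear_from_dfa(succ, accepting):
    """Read off linear sets (a, b) describing the lengths a unary DFA accepts."""
    first_visit = {}
    state, steps = 0, 0
    while state not in first_visit:
        first_visit[state] = steps
        state, steps = succ[state], steps + 1
    tail = first_visit[state]          # steps needed to enter the loop
    period = steps - tail              # length of the loop
    return [(pos, 0 if pos < tail else period)
            for q, pos in first_visit.items() if q in accepting]

def in_semilinear(x, pairs):
    """Is x in the union of the linear sets {a + b*i : i >= 0} for (a, b) in pairs?"""
    return any(x == a if b == 0 else (x >= a and (x - a) % b == 0)
               for a, b in pairs)

if __name__ == "__main__":
    pairs = [(2, 3), (1, 4), (5, 0)]
    dfas = [linear_dfa(a, b) for a, b in pairs]
    print([x for x in range(20) if union_accepts(dfas, x)])
    print(semilinear_from_dfa([1, 2, 3, 4, 2], {1, 3}))
```

Accepting states on the tail give singleton sets (`b = 0`) and accepting states on the loop give infinite progressions, matching the two cases distinguished in the solution.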
