11.3 Constructing Efficient Finite Automata

... Be sure to draw a picture of this minimum-state DFA.

EXAMPLE 4. We'll compute the minimum-state DFA for the DFA given by the following transition table:

[Transition table not recoverable from the source; it marks one start state and two final states.]

The set of states is S = {0, 1, 2, 3, 4, 5}. For Step 1 we get the following sequence of relations:

...

... compute T_min({1, 2, 3}, b) and T_min({4, 5}, a) as follows:

    T_min({1, 2, 3}, b) = T_min([1], b) = [T(1, b)] = [1] = {1, 2, 3},
    T_min({4, 5}, a) = T_min([4], a) = [T(4, a)] = [4] = {4, 5}.

Similar computations will yield the table for T_min, which we'll leave as an exercise.

Exercises

1. Use (11.7) to construct an NFA for each of the following regular expressions.
   a. a*b*.
   b. (a + b)*.
   c. a* + b*.

2. Construct a DFA table for the following NFA in two different ways using (11.8):

   [NFA diagram not recoverable from the source.]

   a. Take unions of lambda closures of the NFA entries.
   b. Take lambda closures of unions of the NFA entries.

...
   a. a*b*.
   b. (a + b)*.
   c. a* + b*.

6. Let the set of states for a DFA be S = {0, 1, 2, 3, 4, 5}, where the start state is 0 and the final states are 2 and 5. Let the equivalence relation on S for a minimum-state DFA be generated by the following set of equivalent pairs of states: {{0, 1}, {0, 4}, {1, 4}, {2, 5}}. Write down the states of the minimum-state DFA.

7. Finish Example 4 by calculating the transition table for the new minimum-state DFA.

8. Given the following DFA table, use algorithm (11.10) to compute the minimum-state DFA. Answer parts (a), (b), and (c) as you proceed through the algorithm.

   [DFA table not recoverable from the source.]

   a. Write down the set of equivalent pairs.
   b. Write down the states of the minimum-state DFA.
   c. Write down the transition table for the minimum-state DFA.

9. For each of the following DFAs, use (11.10) to find the minimum-state DFA.

   [DFA tables not recoverable from the source.]

...

11.
Suppose we're given the following NFA table:

   [NFA table with columns a, b, and Λ and rows for state 0 (start), state 1, and state 2 (final); the entries are not recoverable from the source.]

   Find a simple regular expression for the regular language recognized by this NFA. Hint: Transform the NFA into a DFA, and then find the minimum-state DFA.

12. What can you say about the regular language accepted by a DFA in which all states are final?

11.4 Regular Language Topics

We've already seen characterizations of regular languages by regular expressions, languages accepted by DFAs, and languages accepted by NFAs. In this section we'll introduce still another characterization of regular languages in terms of certain restricted grammars. We'll discuss some properties of regular languages that can be used to find languages that are not regular.

Regular Grammars

A regular language can be described by a special kind of grammar in which the productions take a certain form. A grammar is called a regular grammar if each production takes one of the following two forms, where the capital letters are nonterminals and w is a string of terminals or Λ:

    S → w,
    S → wT.

...

    Regular Expression    Regular Grammar
    a*                    S → Λ | aS
    (a + b)*              S → Λ | aS | bS
    a* + b*               S → Λ | A | B,  A → a | aA,  B → b | bB
    a*b                   S → b | aS
    ba*                   S → bA,  A → Λ | aA
    (ab)*                 S → Λ | abS

The last three examples in the preceding list involve products of languages. Most problems occur in trying to construct a regular grammar for a language that is the product of languages. Let's look at an example to see whether we can get some insight into constructing such grammars.

Suppose we want to construct a regular grammar for the language of the regular expression a*bc*. First we observe that the strings of a*bc* start with either the letter a or the letter b. We can represent this property by writing down the following two productions, where S is the start symbol:

    S → aS | bC.

These productions allow us to derive strings of the form bC, abC, aabC, and so on. Now all we need is a definition for C to derive the ...

...

    S → Λ | aS | B
    B → b | bB.
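One way to get a feel for a regular grammar such as S → Λ | aS | B, B → b | bB is to test mechanically which strings it derives. The following Python sketch is an illustration only, not part of the text's algorithms: the dictionary encoding of productions, the names GRAMMAR, derives, and all_strings, and the comparison against the regular expression a*b* are all assumptions made for the example.

```python
import re
from itertools import product

# Each production is (terminals, next_nonterminal). next_nonterminal is None
# for the form S -> w and a nonterminal name for the form S -> wT.
# The empty string "" stands for Λ. (Hypothetical encoding for this sketch.)
GRAMMAR = {
    "S": [("", None), ("a", "S"), ("", "B")],   # S -> Λ | aS | B
    "B": [("b", None), ("b", "B")],             # B -> b | bB
}

def derives(grammar, start, s):
    """Return True if the regular grammar derives the string s."""
    seen = set()
    stack = [(start, 0)]          # (current nonterminal, letters consumed so far)
    while stack:
        nonterminal, i = stack.pop()
        if (nonterminal, i) in seen:
            continue              # also guards against loops of unit productions
        seen.add((nonterminal, i))
        for w, nxt in grammar[nonterminal]:
            if not s.startswith(w, i):
                continue
            j = i + len(w)
            if nxt is None:
                if j == len(s):   # derivation ends exactly at the end of s
                    return True
            else:
                stack.append((nxt, j))
    return False

def all_strings(alphabet, max_len):
    """Yield every string over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# Compare the grammar with a*b* on all strings over {a, b} of length <= 6.
mismatches = [s for s in all_strings("ab", 6)
              if derives(GRAMMAR, "S", s) != bool(re.fullmatch(r"a*b*", s))]
print(mismatches)   # prints []
```

The search works because a regular grammar keeps at most one nonterminal, at the right end of the sentential form, so a derivation in progress is fully described by the pair (nonterminal, position in the string).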
Let's look at four sublanguages of {a^m b^n | m, n ∈ N} that are defined by whether each string contains occurrences of a or b. The following list shows each language together with a regular expression and a regular grammar.

    Language                        Expression       Regular Grammar
    {a^m b^n | m ≥ 0 and n > 0}     a*bb*            S → aS | B,  B → b | bB.
    {a^m b^n | m > 0 and n ≥ 0}     aa*b*            S → aA,  A → aA | B,  B → Λ | bB.
    {a^m b^n | m > 0 and n > 0}     aa*bb*           S → aA,  A → aA | B,  B → b | bB.
    {a^m b^n | m > 0 or n > 0}      aa*b* + a*bb*    S → aA | bB,  A → Λ | aA | B,  B → Λ | bB.

Any regular language has a regular grammar; conversely, any regular grammar generates a regular language. To see this, we'll give two algorithms: one to transform an NFA to a regular grammar and the other to transform a regular grammar to an NFA, where the language accepted by the NFA is identical to the language generated by the regular grammar.

NFA to Regular Grammar (11.11)

Perform the following steps to construct a regular grammar that generates the language of a given NFA:

1. Rename the states to a set of capital letters.
2. The start symbol is the NFA's start state.
3. For each transition from I to J labeled ...

...

... acceptance path in the NFA for some string corresponds to a derivation by the grammar for the same string. Let's do an example.

EXAMPLE 2. Let's see how (11.11) transforms the following NFA into a regular grammar:

[NFA diagram not recoverable from the source.]

The algorithm takes this NFA and constructs the following regular grammar with start symbol S:

    S → aI | J
    I → bK
    J → aJ | aK
    K → Λ.

For example, to accept the string aa, the NFA follows the path S, J, J, K with edges labeled Λ, a, a, respectively. The grammar derives this string with the following sequence of productions:

    S → J,  J → aJ,  J → aK,  K → Λ.

Now let's look at the c...

Let's look at an algorithm to transform a regular grammar into an NFA.

Regular Grammar to NFA (11.12)

Perform the following steps to construct an NFA that accepts the language of a given regular grammar:

1. If necessary, transform the grammar so that all productions have the form A → x or A → xB, where x is either a single letter or Λ.
2.
The start state of the NFA is the grammar's start symbol.
3. For each production I → aJ, construct a state transition from I to J labeled with the letter a.
4. For each production I → J, construct a state transition from I to J labeled with Λ.
5. If there are productions of the form I → a for some letter a, then create a single new state symbol F. For each production I → a, construct a state transition from I to F labeled with a.
6. The final states of the NFA are F together with all I for which there is a production I → Λ.

End of Algorithm

It's easy to see that the language of the NFA is the same as the language of the given regular grammar because the productions used in the derivation of any string correspond exactly with the state transitions on the path of acceptance for the string. Here's an example.

...

Properties of Regular Languages

We need to face the fact that not all languages are regular. To see this, let's look at a classic example. Suppose we want to find a DFA or NFA to recognize the following language:

    {a^n b^n | n ≥ 0}.

After a few attempts at trying to find a DFA or an NFA or a regular expression or a regular grammar, we might get the idea that it can't be done. But how can we be sure that a language is not regular? We can try to prove it. A proof usually proceeds by assuming that the language is regular and then trying to find a contradiction of some kind. For example, we might be able to find some property of regular languages that the given language doesn't satisfy. So let's look at a few properties of regular languages.

A useful property of regular languages comes from the observation that any DFA for an infinite regular language must contain a cycle to recognize infinitely many strings. For example, suppose we have a DFA with n states that recognizes an infinite language. Since the language is infinite, we can pick a string with more than n letters. To accept the string, the DFA must enter some state twice (by the pigeonhole principle).
So any DFA for an infinite regular language must contain a subgraph that is a cycle. We can symbolize the situation as follows, where each arrow represents a path that may contain other states of the DFA and x, y, and z are the strings of letters along each path:

[Diagram not recoverable from the source: a path labeled x into a cycle labeled y, followed by a path labeled z.]

...

This grammar accepts all strings of the form xy^k z for all k ≥ 0. For example, the string xy^3 z can be derived as follows:

    xN ⇒ xyN ⇒ xyyN ⇒ xyyyN ⇒ xyyyz.

The property that we've been discussing is called the pumping property because the string y can be pumped up to y^k by traveling through the same loop k times. Our discussion serves as an informal proof of the following pumping lemma:

Pumping Lemma (Regular Languages) (11.13)

Let L be an infinite regular language over the alphabet A. Then there exist strings x, y, z ∈ A*, where y ≠ Λ, such that xy^k z ∈ L for all k ≥ 0.

If an infinite language does not satisfy the conclusion of (11.13), then it can't be regular. So we can sometimes use this fact to prove that a language is not regular. Here's an example.

EXAMPLE 4. Let's show that the language L = {a^n b^n | n ≥ 0} is not regular. We'll assume, by way of contradiction, that L is regular. Then we can use the pumping lemma (11.13) to assert the existence of strings x, y, z (y ≠ Λ) such that xy^k z ∈ L for all k ≥ 0. Let k = 1. Then there is some n such that xyz = a^n b^n. Since y ≠ Λ, there are three possibilities for y:

1. y consists of all a's.
2. y consists of all b's.
3. y consists of one or more a's followed by one or more b's.

...

... the language is regular. The reason is that palindromes satisfy the conclusion of (11.13). For example, over the alphabet {a, b}, suppose we let x = a, y = aba, and z = a. Then certainly xy^n z is a palindrome for all n ≥ 0.

There is another version of the pumping lemma for infinite regular languages.
It can sometimes be used when (11.13) does not do the job, because it specifies some additional properties of infinite regular languages. We'll state this second pumping lemma as follows:

Second Pumping Lemma (Regular Languages) (11.14)

Let L be an infinite regular language over the alphabet A, and suppose that L can be accepted by a DFA with m states. Then for every string s ∈ L such that |s| > m there exist strings x, y, z ∈ A*, where y ≠ Λ, such that s = xyz, |xy| ≤ m, and xy^k z ∈ L for all k ≥ 0.

The proof of (11.14) is a refinement of the discussion preceding (11.13), and we'll leave it as an exercise. Let's see how (11.14) can be used to prove that the language consisting of all palindromes over an alphabet with two or more letters is not regular.

EXAMPLE 5. Suppose P is the language of palindromes over the alphabet of two letters a and b. To prove that P is not regular, we'll assume that P is regular and try for a contradiction. If P is regular, then there is a DFA with m states to recognize P. Let's choose a palindrome of the following form:

    ...

Now we can apply (11.14) to assert the existence of strings x, y, z such that ...

...

... combined by union, language product, and closure to form ... languages. We'll list these three properties together with two others.

Properties of Regular Languages (11.15)

1. The union of two regular languages is regular.
2. The language product of two regular languages is regular.
3. The closure of a regular language is regular.
4. The complement of a regular language is regular.
5. The intersection of two regular languages is regular.

We'll prove statement 4 and leave statement 5 as an exercise. If D is a DFA for the regular language L, then construct a new DFA, say D', from D by making all the final states nonfinal and by making all the nonfinal states final. If we let A be the alphabet for L, it follows that D' recognizes the complement A* - L.
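The construction of D' from D is short enough to check by brute force. The following Python sketch is an illustration only: the DFA D for a*bb* over A = {a, b} is a hypothetical example, and the names STATES, DELTA, FINALS, COMPLEMENT_FINALS, and accepts are invented for the sketch. It builds the final states of D' by swapping final and nonfinal states, then confirms on all strings of length at most 4 that D' accepts exactly the strings that D rejects.

```python
from itertools import product

# Hypothetical DFA D over A = {a, b} for the language of a*bb*.
STATES = {0, 1, 2}
DELTA = {
    0: {"a": 0, "b": 1},
    1: {"a": 2, "b": 1},
    2: {"a": 2, "b": 2},          # dead state
}
START, FINALS = 0, {1}

def accepts(finals, s):
    """Run the shared transition table on s with the given set of final states."""
    state = START
    for letter in s:
        state = DELTA[state][letter]
    return state in finals

# D' keeps the states and transitions of D and swaps final with nonfinal states.
COMPLEMENT_FINALS = STATES - FINALS

strings = ["".join(p) for n in range(5) for p in product("ab", repeat=n)]
print(all(accepts(COMPLEMENT_FINALS, s) != accepts(FINALS, s) for s in strings))
# prints True
```

The swap relies on D being a DFA whose table has an entry for every state and every letter, so that each string leads to exactly one state; the same swap applied to an NFA does not in general give the complement.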
Thus the complement of L is regular. QED.

EXAMPLE 6. Suppose L is the language over alphabet {a, b} consisting of all strings with an equal number of a's and b's. For example, the strings abba, ab, and babbaa are all in L. Is L a regular language? To see that the answer is no, consider the following argument:

Let M be the language of the regular expression a*b*. It's certainly true that L ∩ M = {a^n b^n | n ≥ 0}. Now we're in a position to use (11.15). Suppose to the contrary that L is regular. We know that M is regular because it is the language of the regular expression a*b*. Therefore L ∩ M must be regular by (11.15). In other words, {a^n b^n | n ≥ 0} must be regular. But we know that {a^n b^n | n ≥ 0} is NOT regular. Therefore our assumption that L is regular leads to a contradiction. Thus L is not regular.

We'll finish by listing two interesting properties of regular languages that can also be used to determine nonregularity.

...

... of the form S → f(w) and S → f(w)T. The new grammar is regular, and any string in f(L) is derived by this new grammar. QED.

EXAMPLE 7. Let's use (11.16) to show that the language L = {a^n b c^n | n ∈ N} is not regular. We can define a morphism f : {a, b, c}* → {a, b, c}* by f(a) = a, f(b) = Λ, and f(c) = b. Then f(L) = {a^n b^n | n ≥ 0}. If L is regular, then we must also conclude by (11.16) that f(L) is regular. But we know that f(L) is not regular. Therefore L is not regular.

There are some very simple nonregular languages. For example, we know that the language {a^n b^n | n ≥ 0} is not regular. Therefore finite automata are not powerful enough to recognize it. Therefore finite automata can't recognize some simple programming constructs, such as nested begin-end pairs, nested open and closed parentheses, brackets, and braces. We'll see in the next chapter that there are more powerful machines to recognize such constructs.

Exercises

1. Find a regular grammar for each of the following regular expressions.
   a. a + b.
   b. a + bc.
   c. a + b*.
   d. ab* + c.
   e. ab* + bc*.
   f. a*bc* + ac.
   g.
(aa + bb)*.
   h. (aa + bb)(aa + bb)*.
   i. (ab)*c(a + b)*.

2. Find a regular grammar to describe each of the following languages.
   a. {a, b, c}.
   b. {aa, ab, ac}.
   c. {a, b, ab, ba, abb, baa, ..., ab^n, ba^n, ...}.
   d. {a, aaa, aaaaa, ..., a^(2n+1), ...}.
   e. {Λ, a, abb, abbbb, ..., ab^(2n), ...}.
   f. {Λ, a, b, c, aa, bb, cc, ..., a^n, b^n, c^n, ...}.
   g. {Λ, a, b, ca, bc, cca, bcc, ..., c^n a, bc^n, ...}.
   h. ...
   i. {a^m b c^n | m, n ∈ N}.

3. Find a regular ...

...

5. It's easy to see that the regular expression (a + b)*abb represents the language recognized by the following NFA:

   [NFA diagram not recoverable from the source.]

   Use (11.11) to find a grammar for the language represented by the NFA.

6. Use (11.12) to construct an NFA to recognize the language of the following regular grammar:

       S → aI | bJ
       I → bI | Λ
       J → aJ | Λ.

7. The language {a^n b^n | n ≥ 0} is not regular. If it is regular, the pumping lemma asserts the existence of strings x, y, z (y ≠ Λ) such that xy^k z is in the language for all k ≥ 0. Let k = 1. Then there is some n such that xyz = a^n b^n. Show that it is impossible for y to have the form: one or more a's followed by one or more b's.

8. Show that each of the following languages ...
   a. {a^n b a^n | n ∈ N}.
   b. {a^p | p is a prime ...

... Prove that the intersection of two ...

...
   c. Define a morphism g : {a, b, c}* → {a, b, c}* such that g({a^n b c^n | n ∈ N}) = {a^n b^n | n ∈ N}.
   d. Argue that {a^n b a^n | n ∈ N} is not regular by using parts (a), (b), and (c) together with (11.15) and (11.16).

Chapter Summary

The chapter introduced several formulations for regular languages. Regular expressions are algebraic representations of regular languages. Finite automata (DFAs and NFAs) are machines that recognize regular languages. Regular grammars derive regular languages. The most important point is that there are algorithms to transform any one of these formulations into any other one. In other words, each formulation is equivalent to any of the other formulations. There is also an algorithm to transform any DFA into a minimum-state DFA.
Therefore we can start with a regular language as either a regular expression or a regular grammar and automatically transform it into a minimum-state DFA to recognize the regular language.

We also observed some other things. Finite automata can be used as output devices (Mealy and Moore machines). There are some very simple interpreters for DFAs and NFAs. There are some basic properties of regular languages given by the pumping lemmas, set operations, and morphisms. Many simple languages are not regular. For example, the nonregularity of the language {a^n b^n | n ≥ 0} tells us that finite automata can't recognize simple constructs such as nested begin-end ...
