Hierarchy of the Models

7. Hierarchy of the Models: Characterization
We have studied four classes of formal languages and their grammars, four classes of automata and their variations, and other models for representing or generating languages, such as regular expressions, syntax flow graphs, and L-systems. In this chapter we study the relations between these models, known as the Chomsky hierarchy. The Chomsky hierarchy reveals two important kinds of relationships, characterizations and containments, among the four classes of languages and automata.

The characterizations relate the models for language generation (i.e., grammars) to those for language recognition (i.e., automata) or expression. For example, a language L can be generated by a regular grammar if and only if it is recognizable by an FA, and a language L is recognizable by an FA if and only if it is expressible by a regular expression.

The containments show the set relations among the classes of languages generated (recognized) by the four types of grammars (respectively, automata). For example, the class of languages generated by context-free grammars (recognized by PDA) properly contains the class of languages generated by regular grammars (recognized by FA). The same set relation holds among the classes of languages generated by type 0, 1, and 2 grammars (recognized by TM, LBA, and PDA, respectively). In terms of computational capability, this containment relation between the four classes of languages recognized by TM, LBA, PDA, and FA implies that anything computable by an LBA is also computable by a TM, but not necessarily the other way around, that anything computable by a PDA is also computable by an LBA, and so on. Therefore the Chomsky hierarchy provides valuable information for designing an efficient computational model, as well as for analyzing the computational capability of a given model.
This chapter proves the characterization of the models at the lowest level of the hierarchy, i.e., regular grammars, FA, and regular expressions. Laying the groundwork through the following chapters, we will complete the proof of the hierarchy in Chapters 12 and 15.

Hierarchy
7.1 Chomsky Hierarchy
7.2 Proof of Characterization
    Constructing an FA from a regular grammar G that recognizes L(G)
    Constructing a regular grammar from an FA M that produces L(M)
    Constructing an FA from a regular expression R that recognizes L(R)
    Constructing a regular expression from an FA M that expresses L(M)
Rumination
Exercises

Today’s Quote

Break Time

You’re alive. Do something. The directive in life, the moral imperative was so uncomplicated. It could be expressed in single words, not complete sentences. It sounded like this: Look. Choose. Act. - Barbara Hall -

7.1 Chomsky Hierarchy
The Chomsky hierarchy shows the characterization relations between two different types of models, as illustrated below. For example, L is a language generated by a type 0 grammar if and only if it is recognized by a TM. In this chapter we will prove only the characterization relations between regular grammars, FA, and regular expressions. We defer the proofs of the other characterizations until Chapter 15. These proofs are challenging and are usually covered in a graduate course.

[Figure: characterization pairs. Type 0 ↔ TM; CSL ↔ LBA; CFL ↔ PDA; Regular ↔ FA ↔ Regular expression.]

For i ∈ {0, 1, 2, 3}, let TYPE-iL be the class of languages generated by type i grammars. The Chomsky hierarchy shows the proper containment relation among the four classes of languages, as illustrated by the Venn diagram below. The class of regular languages is properly contained in the class of context-free languages. The class of context-free languages is properly contained in the class of context-sensitive languages, which is in turn properly contained in the class of type 0 (phrase-structured) languages. In Chapter 12, we will see an elegant proof of these containment relations.

[Figure: Venn diagram of the nested classes, with Type-3L inside Type-2L inside Type-1L inside Type-0L.]

The characterization and containment relations of the four classes of languages imply the hierarchy in the computational capability of the four types of automata, as illustrated in the figure below. Every language recognized by an FA can be recognized by a PDA, and every language recognized by a PDA can be recognized by an LBA, and so on.

[Figure: nested automata classes, with FA inside PDA inside LBA inside TM.]

The figure on the following page summarizes these relations, called the Chomsky hierarchy (named after Noam Chomsky, who defined the four classes of languages), among the models that we have studied, as well as some interesting models investigated by researchers. This hierarchy is a beautiful piece of knowledge that computer scientists have gained through the advancement of the field.

Chomsky Hierarchy (↓ : containment, ↔ : characterization)

Languages (grammars)                   Automata                    Other Models
Recursively Enumerable Sets (type 0)   Turing Machines             Post System, Markov Algorithms, µ-recursive Functions, ...
Context-sensitive Languages (type 1)   Linear-bounded Automata
Context-free Languages (type 2)        Pushdown Automata
Regular Languages (type 3)             Finite State Automata       Regular Expression

[Figure: in each row the listed models are linked by ↔, and each row of languages is linked by ↓ to the row below it.]


7.2 Proof of the Characterization
The theorem below states the characterization relations between the models at the lowest level of the Chomsky hierarchy, which is illustrated by the following figure. We will prove this theorem.

[Figure: Regular languages ↔ FA ↔ Regular expression.]

Theorem. Let L be a language.
(1.1) If L can be generated by a regular grammar, then there is an FA that recognizes L. (This implication will be denoted by RG ⇒ FA.)
(1.2) If L is recognizable by an FA, then L can be generated by a regular grammar (FA ⇒ RG).
(2.1) If L is expressible by a regular expression, then there is an FA that recognizes L (Re ⇒ FA).
(2.2) If L is recognizable by an FA, then L is expressible by a regular expression (FA ⇒ Re).

Proof (1.1): RG ⇒ FA

Let G be a regular grammar. For the proof, we show how to construct an FA that recognizes L(G). Suppose that A and B are two arbitrary nonterminal symbols, and a and b are terminal symbols. The following table shows how to transform the typical production rules of a regular grammar into the state transitions of an FA which recognizes the language generated by the grammar. (In the figures, a heavy circle denotes an accepting state.)

Rule                      State transitions
A → abB                   A path from state A to state B that reads a and then b, passing through a new intermediate state.
A → B                     An ε-transition from state A to state B.
A → ε                     Let the state with label A be an accepting state.
S is the start symbol     Let the state with label S be the start state.
A → a                     Put a transition labeled a from state A to a new accepting state.

Example. Transforming a regular grammar to an FA
The following example should be enough to illustrate how to transform an arbitrary regular grammar into an FA which recognizes the language generated by the grammar.

    S → abcS | cC
    A → aS | B
    B → aA | a | ε
    C → aB | ac

[Figure: state transition graph of the FA constructed from the grammar above.]

For the proof, let Σ be the set of terminal symbols of G. We must show that for every string x ∈ Σ*, x ∈ L(G) if and only if x is accepted by the FA that we have constructed by the above idea. Suppose that string x is derivable by the grammar as follows.

    S => w1V1 => w1w2V2 => . . . => w1w2 . . . wn-1Vn-1 => w1w2 . . . wn-1wn = x,

where each Vi is a nonterminal symbol and each wi is a terminal string (or ε), such that S → w1V1, Vi → wi+1Vi+1, and Vn-1 → wn are rules in G. We can prove that the string x is accepted by the FA, and that the converse is also true. (We leave the detailed proof for the reader.)
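
The construction can be sketched in Python as follows. This is only an illustrative sketch under assumptions that are not part of the text: nonterminals are single uppercase letters, terminals are single lowercase letters, the empty string stands for ε, and the names `grammar_to_nfa`, `eps_closure`, and `accepts` are hypothetical.

```python
from itertools import count


def grammar_to_nfa(rules, start="S"):
    """Build an FA from a regular grammar.

    rules maps each nonterminal to a list of right-hand sides (strings).
    Returns (edges, start, accepting); an edge is (src, symbol, dst) and
    the symbol "" marks an ε-transition.
    """
    fresh = count()
    edges, accepting = [], set()
    for head, bodies in rules.items():
        for body in bodies:
            if body and body[-1].isupper():        # A -> wB or A -> B
                terminals, target = body[:-1], body[-1]
            else:                                  # A -> w or A -> ε
                terminals, target = body, None
            if target is None and not terminals:   # A -> ε: A is accepting
                accepting.add(head)
                continue
            if target is None:                     # A -> w: end in a new accepting state
                target = f"q{next(fresh)}"
                accepting.add(target)
            if not terminals:                      # A -> B: ε-transition
                edges.append((head, "", target))
                continue
            src = head
            for i, sym in enumerate(terminals):    # one edge per terminal symbol
                last = i == len(terminals) - 1
                dst = target if last else f"q{next(fresh)}"
                edges.append((src, sym, dst))
                src = dst
    return edges, start, accepting


def eps_closure(states, edges):
    """All states reachable from `states` by ε-transitions only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for src, sym, dst in edges:
            if src == s and sym == "" and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(nfa, text):
    """Simulate the FA on text by tracking the set of reachable states."""
    edges, start, accepting = nfa
    current = eps_closure({start}, edges)
    for ch in text:
        moved = {dst for src, sym, dst in edges if src in current and sym == ch}
        current = eps_closure(moved, edges)
    return bool(current & accepting)


# The grammar from the example above.
rules = {
    "S": ["abcS", "cC"],
    "A": ["aS", "B"],
    "B": ["aA", "a", ""],
    "C": ["aB", "ac"],
}
nfa = grammar_to_nfa(rules)
```

For instance, the derivation S => cC => caB => ca corresponds to the path S, C, B in the constructed FA, so `accepts(nfa, "ca")` holds, while `accepts(nfa, "c")` does not.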
Proof (1.2): FA ⇒ RG
Given an arbitrary FA, we transform the FA into a regular grammar as follows. First, label the start state with the designated symbol S and every other state with a distinct nonterminal symbol. Then transform each transition into a rule as shown below.

State transition                                                  Production rule
State A has a self-loop labeled a and an edge labeled b, c to B   A → bB | cB | aA
S is the start state                                              Let S be the start symbol.
A is an accepting state                                           A → ε


Example: Transforming an FA into a regular grammar

[Figure: state transition graph of the FA, before and after labeling its states with S, A, B, C, D, and E.]

First label the states, and then transform each transition into a rule. The resulting grammar G is:

    S → aS | aA
    A → bB
    B → bB | bS | aD | ε
    C → aB | cE
    D → aC
    E → ε

To complete the proof, we need to show that for every string x, the FA accepts x if and only if the grammar G generates it. (The detailed proof is left for the reader.)
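
The transformation can be sketched in a few lines of Python. This is a sketch under assumptions: states are already named by nonterminal symbols, and the transition list below is read back from the rules of G in the example, since the figure itself is not reproduced here. The name `fa_to_grammar` is hypothetical.

```python
def fa_to_grammar(transitions, accepting):
    """Turn FA transitions into regular-grammar rules.

    transitions: list of (state, symbol, next_state), states named by nonterminals.
    accepting:   iterable of accepting states.
    Returns a dict mapping each nonterminal to its list of right-hand sides.
    """
    rules = {}
    for src, sym, dst in transitions:
        rules.setdefault(src, []).append(sym + dst)   # transition -> rule src -> sym dst
        rules.setdefault(dst, [])
    for state in accepting:
        rules.setdefault(state, []).append("ε")       # accepting state -> rule state -> ε
    return rules


# Transitions assumed from the rules of G in the example above.
transitions = [
    ("S", "a", "S"), ("S", "a", "A"),
    ("A", "b", "B"),
    ("B", "b", "B"), ("B", "b", "S"), ("B", "a", "D"),
    ("C", "a", "B"), ("C", "c", "E"),
    ("D", "a", "C"),
]
grammar = fa_to_grammar(transitions, accepting=["B", "E"])
for head, bodies in grammar.items():
    print(head, "→", " | ".join(bodies))
```

Each transition contributes exactly one rule and each accepting state contributes one ε-rule, which is why a derivation in the grammar mirrors a path in the FA step by step.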

Proof (2.1): Re ⇒ FA
Let R be a regular expression which expresses a regular language L. We construct an FA which recognizes L. Let Σ be the alphabet of L. Following the inductive definition of regular expressions, we will show how to construct an FA that recognizes the language expressed by R.

(1) If R is φ, ε, or a, for a symbol a ∈ Σ, which express, respectively, the language φ (the empty set), {ε}, or {a}, we construct an FA which recognizes the respective language as follows.

[Figure: the FA for φ, for ε, and for a.]

(2) Suppose that we have constructed two FA M1 and M2, which recognize the languages expressed by regular expressions r1 and r2, respectively: L(M1) = L(r1) and L(M2) = L(r2).

(3) Using M1 and M2, we construct FA M1+2, M12, and M1*, which recognize, respectively, the languages expressed by the regular expressions r1 + r2, r1r2, and (r1)*, as follows. To construct the FA M1+2, introduce a new start state and link it by ε-transitions to the start states of M1 and M2, as the following figure illustrates. Clearly, L(M1+2) = L(r1 + r2).

[Figure (a) M1+2: a new start state with ε-transitions to the old start states of M1 and M2, where L(M1) = L(r1) and L(M2) = L(r2).]

To construct the FA M12, we first convert all accepting states in M1 to non-accepting states. Then from each of these non-accepting states, we set up an ε-transition to the start state of M2. Clearly, L(M12) = L(r1r2).

[Figure (b) M12: M1 followed by M2, with ε-transitions from the former accepting states of M1 to the start state of M2, where L(M1) = L(r1) and L(M2) = L(r2).]

Assumptions

Break Time

A car was involved in an accident. As one might expect, a large crowd gathered. A newspaper reporter, anxious to get his story, pushed and struggled to get near the car. Being a clever sort, he started shouting loudly, “Let me through! Let me through please! I am the son of the victim.” The crowd made way for him. Lying in front of the car was a donkey. - Anonymous -

Now, we construct an FA M1* to recognize the language expressed by r1*. First introduce a new accepting start state. Then set up an ε-transition from this new start state to the old start state, and an ε-transition from every accepting state to the new start state (see the illustration below). Notice that the new start state must be an accepting state so that the FA accepts ε, which is in the language expressed by r1*.

[Figure (c) M1*: a new accepting start state with an ε-transition to the old start state of M1, and ε-transitions from the accepting states of M1 back to the new start state, where L(M1) = L(r1) and L(M1*) = L(r1*).]

Notice that when we construct M1*, if we reuse the old start state as illustrated in figure (a) below, without introducing a new one, the FA M1* may accept a string that is not in L(r1*). To see why, consider an FA whose start state is in a cycle, as shown in figure (b) below. The string ab is not accepted by M1 and is not in L(r1*), so it should not be accepted. However, figure (c) shows that the FA built this way accepts ab.

[Figures: (a) M1 with ε-transitions from its accepting states back to its old start state; (b) an FA M1 over {a, b} whose start state lies on a cycle; (c) the FA obtained from (b) by the construction in (a), which accepts ab.]

Example: Constructing an FA for a given regular expression.
Based on the approach given above, we show a step-wise construction of an FA which recognizes the language expressed by the regular expression ((ab + ε )ba)*.

[Figures: step-wise construction, showing in turn the FA for a, b, ε, ab, ab + ε, ba, (ab + ε)ba, and ((ab + ε)ba)*.]
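
The three constructions (union, concatenation, and star) can be sketched in Python and applied to the expression ((ab + ε)ba)*. This is a sketch under assumptions: an FA is represented as a dictionary with a start state, a set of accepting states, and a list of edges, the empty string marks an ε-transition, and all function names are hypothetical.

```python
from itertools import count

_ids = count()  # source of fresh state numbers


def symbol(a):
    """FA for the single symbol a."""
    s, t = next(_ids), next(_ids)
    return {"start": s, "accept": {t}, "edges": [(s, a, t)]}


def epsilon():
    """FA for ε: one accepting start state."""
    s = next(_ids)
    return {"start": s, "accept": {s}, "edges": []}


def union(m1, m2):
    """M1+2: a new start state with ε-transitions to both old start states."""
    s = next(_ids)
    links = [(s, "", m1["start"]), (s, "", m2["start"])]
    return {"start": s, "accept": m1["accept"] | m2["accept"],
            "edges": m1["edges"] + m2["edges"] + links}


def concat(m1, m2):
    """M12: accepting states of M1 become non-accepting and lead by ε to M2."""
    links = [(f, "", m2["start"]) for f in m1["accept"]]
    return {"start": m1["start"], "accept": set(m2["accept"]),
            "edges": m1["edges"] + m2["edges"] + links}


def star(m1):
    """M1*: a new accepting start state, ε to the old start, ε back from accepting states."""
    s = next(_ids)
    links = [(s, "", m1["start"])] + [(f, "", s) for f in m1["accept"]]
    return {"start": s, "accept": {s} | m1["accept"],
            "edges": m1["edges"] + links}


def accepts(m, text):
    """Simulate the FA, following ε-transitions after every step."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            p = stack.pop()
            for src, sym, dst in m["edges"]:
                if src == p and sym == "" and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({m["start"]})
    for ch in text:
        current = closure({dst for src, sym, dst in m["edges"]
                           if src in current and sym == ch})
    return bool(current & m["accept"])


# ((ab + ε)ba)*, built bottom-up as in the step-wise figures.
ab = concat(symbol("a"), symbol("b"))
ba = concat(symbol("b"), symbol("a"))
machine = star(concat(union(ab, epsilon()), ba))
```

Each building block is used exactly once, so no two sub-automata share a state. The resulting FA accepts, for example, ε, ba, and abba, but not ab.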

Proof (2.2): FA ⇒ Re

Let M be an FA. We will show a method for systematically transforming the state transition graph of M into a regular expression which expresses L(M). We first rewrite every edge label (i.e., the input symbols on the edge) as a regular expression (see figure (b) below). Now, we interpret the transition graph as follows: if there is an edge from a state p to a state q labeled with a regular expression r, then M, in state p and reading any string in L(r), enters state q. Extending the function δ, we denote this observation by δ(p, r) = q. Clearly, labeling the edges with regular expressions this way does not affect the language accepted by M.

[Figures: (a) a transition graph with states 1 to 5 whose edges carry symbol labels such as "a, b" and "a, b, c"; (b) the same graph with those labels rewritten as the regular expressions a+b and a+b+c.]

Now, let G be the state transition graph with edges labeled with regular expressions. We eliminate the states of G one at a time (except for the start state and the accepting states) and adjust the edges and their labels without affecting the language recognized by the automaton. The following example shows how.

[Figure: state elimination on the five-state graph. Eliminating state 2 leaves an edge from state 1 to state 5 labeled a(a+b)*(a+b+c). Eliminating states 3 and 4 leaves a second edge from state 1 to state 5 labeled baa. Merging the two parallel edges gives the single label a(a+b)*(a+b+c)+baa.]

Clearly, the label a(a+b)*(a+b+c)+baa on the edge from the start state to accepting state 5 is a regular expression which denotes the set of strings accepted by state 5 of the automaton.
The following figures show a typical case of eliminating a state from a state transition graph and adjusting the edge labels. Clearly, the same idea works even when a complex regular expression is substituted for each simple regular expression in the graph.

[Figure: eliminating a state q that has a self-loop labeled f and is connected to states r and s by edges labeled a, b, c, and d. After q is eliminated, the edges among r and s carry the labels af*b, af*c, df*b, and df*c. Each new label concatenates the label of an edge entering q (a or d), then f*, then the label of an edge leaving q (b or c).]

If the state transition graph has k (≥ 1) accepting states, the language L(M) is the union of the languages accepted by these accepting states. Suppose that for each i (1 ≤ i ≤ k) we have found a regular expression ri that denotes the language accepted by the i-th accepting state. Then the regular expression r denoting the language accepted by M is given as follows.

    r = r1 + r2 + . . . + rk
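
The elimination step can be sketched in Python. This is a sketch under assumptions: edge labels are written in the syntax of Python's `re` module (so `|` plays the role of +), the function name `eliminate` is hypothetical, and the five-state graph below is an assumed graph chosen to be consistent with the labels in the earlier example, not a copy of the original figure.

```python
import re
from itertools import product


def eliminate(edges, q):
    """Remove state q from a graph whose edges carry regular expressions.

    edges maps (p, s) to a regex in Python `re` syntax. For every edge
    p -> q and every edge q -> s, add the path  in (loop)* out  to p -> s,
    merging with an existing parallel edge by union.
    """
    loop = edges.get((q, q))
    star = f"(?:{loop})*" if loop is not None else ""
    ins = {p: r for (p, t), r in edges.items() if t == q and p != q}
    outs = {s: r for (t, s), r in edges.items() if t == q and s != q}
    result = {k: v for k, v in edges.items() if q not in k}
    for (p, r_in), (s, r_out) in product(ins.items(), outs.items()):
        path = f"(?:{r_in}){star}(?:{r_out})"
        old = result.get((p, s))
        result[(p, s)] = path if old is None else f"{old}|{path}"
    return result


# Assumed graph: start state 1, accepting state 5.
graph = {
    (1, 2): "a", (2, 2): "a|b", (2, 5): "a|b|c",
    (1, 3): "b", (3, 4): "a", (4, 5): "a",
}
for state in (2, 3, 4):
    graph = eliminate(graph, state)

label = graph[(1, 5)]          # corresponds to a(a+b)*(a+b+c)+baa
print(label)
```

Wrapping every sub-expression in a non-capturing group `(?:...)` keeps operator precedence intact when labels are concatenated, at the cost of a less readable result than the hand-simplified label.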
Example: We compute a regular expression which denotes the language accepted by the FA shown in figure (a) below. We will compute regular expressions r4 and r0 that denote the languages accepted by the accepting states 4 and 0, respectively, as illustrated by figures (b) and (c) below, and then find the answer r = r0 + r4.

[Figures: (a) the given FA with states 0 to 4, where state 0 is the start state and states 0 and 4 are accepting; (b) the reduced graph used to compute r4; (c) the reduced graph used to compute r0.]
To compute r4, we change state 0, the start state, to a non-accepting state. Then, keeping the start state and the accepting state 4, we eliminate all the other states one by one. We begin by eliminating state 2. The order of elimination does not matter: although the resulting regular expression may differ, it expresses the same language. It is more convenient to eliminate first a state that involves fewer transition edges.

[Figure: the transition graph before and after eliminating state 2.]

Parallel transition edges (i.e., edges having the same origin and destination) are merged into one edge with all the labels merged into one using the operator +. In the figures below, notice how the two transitions from state 1 to state 4 and the two loop transitions on state 1 are, respectively, merged into one.

[Figure: merging parallel edges. The two edges from state 1 to state 4 become one edge labeled b+ε, and the two loops on state 1 become one loop labeled a+ba.]

[Figure: eliminating state 3 and then merging parallel edges. The elimination introduces edges labeled bba and bb, and the merge produces the label b+bb.]


[Figure: eliminating state 1, whose self-loop is labeled a+ba. The elimination adds four edges among states 0 and 4, labeled a(a+ba)*b, a(a+ba)*(b+ε), bba(a+ba)*b, and bba(a+ba)*(b+ε), next to the existing labels b, ba, and b+bb.]


[Figure: merging parallel edges. The remaining graph has two states, 0 (the start state) and 4, with the following labels.]

    loop on state 0:            a(ba+a)*b+b
    edge from state 0 to 4:     a(a+ba)*(b+ε)
    edge from state 4 to 0:     bba(a+ba)*b+ba
    loop on state 4:            bba(ba+a)*(b+ε)+(b+bb)
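In general, with all the states eliminated except for the start state and one accepting state, we get a transition graph like the one shown in figure (i) below. We need one more step.

[Figures: (i) a two-state graph with start state 1 and accepting state 2, a loop r11 on state 1, an edge r12 from 1 to 2, an edge r21 from 2 to 1, and a loop r22 on state 2; (ii) the edge from 1 to 2 relabeled (r11)*r12, with a second loop r21(r11)*r12 on state 2; (iii) the two loops on state 2 merged into r22 + r21(r11)*r12; (iv) the resulting regular expression.]

    r2 = (r11)*r12(r22 + r21(r11)*r12)*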
Now, we go back to figure (g) (repeated below) in the series of state eliminations, and we apply the idea we have just developed. Using the short notation rij for the label on the edge linking state i to state j, the language accepted by state 4 can be expressed by the regular expression r4 shown below.

    r00 = a(ba+a)*b+b
    r04 = a(a+ba)*(b+ε)
    r40 = bba(a+ba)*b+ba
    r44 = bba(ba+a)*(b+ε)+(b+bb)

    r4 = (r00)*r04(r44 + r40(r00)*r04)*

To compute a regular expression r0 which denotes the language accepted by state 0 (the start state) of the FA, we can use the same reduced graph (repeated below). We change the start state back to an accepting state and state 4 to a non-accepting state, as shown below.

[Figure: the reduced two-state graph, before and after making state 0 accepting and state 4 non-accepting. The edge labels are unchanged.]

By eliminating state 4 and merging the resulting two self-loops on state 0 as shown, we finally get a regular expression r0. (Notice that we are using the short notation for convenience.)

[Figures: (h) the reduced graph with labels r00, r04, r40, and r44; (i) state 4 eliminated, leaving the loops r00 and r04(r44)*r40 on state 0; (k) the two loops merged into r00 + r04(r44)*r40.]

    r0 = (r00 + r04(r44)*r40)*

Finally, substituting back the entire short notation in r0 + r4, we will get a regular expression r which denotes the language recognized by the given FA.
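
The two closing formulas can be checked on a small case in Python. This is a sketch under assumptions: labels are written in Python `re` syntax, the function names are hypothetical, and the two-state DFA used for the check (state 1 loops on a, b moves between states 1 and 2, state 2 loops on a) is a toy example rather than the FA of this section. Starting in state 1, that DFA is in state 2 exactly after an odd number of b's.

```python
import re
from itertools import product


def accepted_at_other_state(r11, r12, r21, r22):
    """(r11)* r12 (r22 + r21 (r11)* r12)*  for start state 1, accepting state 2."""
    first = f"(?:{r11})*(?:{r12})"
    return f"{first}(?:(?:{r22})|(?:{r21}){first})*"


def accepted_at_start_state(r00, r04, r40, r44):
    """(r00 + r04 (r44)* r40)*  for an accepting start state 0 and one other state 4."""
    return f"(?:(?:{r00})|(?:{r04})(?:{r44})*(?:{r40}))*"


# Toy DFA: both loops are labeled a, both crossing edges are labeled b.
odd_b = accepted_at_other_state("a", "b", "b", "a")
even_b = accepted_at_start_state("a", "b", "b", "a")


def all_strings(alphabet, max_len):
    """Every string over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            yield "".join(w)


mismatches = [
    w for w in all_strings("ab", 8)
    if bool(re.fullmatch(odd_b, w)) != (w.count("b") % 2 == 1)
    or bool(re.fullmatch(even_b, w)) != (w.count("b") % 2 == 0)
]
print(mismatches)
```

The check is exhaustive only up to the chosen length, so it supports the formulas on this example without replacing the proof.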

Rumination (1): FA ⇒ Re
• The state elimination technique is easy to practice with paper and pencil on a simple graph. However, it gets messy for a large graph. There is a beautiful algorithmic approach based on dynamic programming, usually attributed to Kleene. This algorithm is presented in Appendix C.
• Depending on the order in which the states are eliminated in the procedure, we will get a different regular expression. However, the expressions are equivalent, i.e., they denote the same language recognized by the FA.
• It is an intractable problem (i.e., solvable, but with no practical algorithm available) to tell whether two arbitrary regular expressions are equivalent. However, for some particular cases, especially if the expressions are simple, we can solve the problem using the techniques available in this text. The figures below prove the two equivalences (1) and (2). Notice that the same equivalences hold even when an arbitrary regular expression is substituted for a or b in the expressions. (In the following chapter, we will learn how to convert an NFA to a DFA and a method to minimize the number of states of a DFA.)

    (1) a* = (a*)*
    (2) (a + b)* = (a*b*)*

[Figures: for each equivalence, the more complex expression, (a*)* or (a*b*)*, is converted to an FA, the ε-transitions are eliminated, the FA is converted to a DFA, and the number of states is minimized. For (1) the minimized DFA has the single state {1,2} with a loop labeled a, which gives a*. For (2) it has the single state {1,2} with a loop labeled a, b, which gives (a + b)*.]
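
For simple expressions, a bounded brute-force comparison is a quick sanity check. The sketch below assumes Python `re` syntax for the expressions (with `|` in place of +), and the function name `first_disagreement` is hypothetical. Agreement up to a length bound is evidence, not a proof of equivalence.

```python
import re
from itertools import product


def first_disagreement(r1, r2, alphabet, max_len):
    """Return the first string (shortest first) on which r1 and r2 differ, or None."""
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            s = "".join(w)
            if bool(re.fullmatch(r1, s)) != bool(re.fullmatch(r2, s)):
                return s
    return None


# (1) a* = (a*)*
check_1 = first_disagreement("a*", "(?:a*)*", "a", 10)
# (2) (a + b)* = (a*b*)*
check_2 = first_disagreement("(?:a|b)*", "(?:a*b*)*", "ab", 8)
# A pair that is not equivalent: (a + b)* versus (ab)*
check_3 = first_disagreement("(?:a|b)*", "(?:ab)*", "ab", 3)
print(check_1, check_2, check_3)
```

The third call shows what a failure looks like: the two expressions already differ on the string a.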


Exercises

7.1 Using the technique presented in Section 7.2, transform the following state transition graph of an FA into a regular grammar which generates the language recognized by the FA.

[Figure: state transition graph of the FA for Exercise 7.1.]

7.2 Using the technique presented in Section 7.2, transform the following regular grammar into the state transition graph of an FA which recognizes the language generated by the grammar.

    S → abS | cC
    A → aS | B | a
    B → aA | ε
    C → aB | abc

7.3 Compute a regular expression that denotes the language recognized by the FA shown in problem 7.1 above. You should also show the procedure that you took to get your answer.

7.4 Let L be the language denoted by the following regular expression. (a) Construct the state transition graph of an FA that recognizes L, and (b) construct a regular grammar that generates L. You should also show the procedure that you took to find each answer.

    ((ba + b)* + (cd(a + b))*)bba
