Chapter 02. Languages

Models of Language Generation:
Grammars
Chapter 2. Formal Languages and Grammars
In chapter 1 we learned how to represent a set. If the set is small, we write it out. Otherwise, we show it in terms of a set
property representing the characteristics of the set elements. This approach is useful for the sets whose members (i.e., the
strings) have simple properties to describe. For the formal languages in real applications, like programming languages, it is
impractical to represent a language in terms of the set properties because in general the properties are too complex to
identify and succinctly describe. Hence the major part of the book is dedicated to languages represented by a set of
rewriting rules, called a grammar, which derives the language.
In this chapter we will learn how to derive a language with a given set of rewriting rules, and conversely, given a
language, how to find a set of rewriting rules that derive it. We begin with a couple of rewriting rules to see how they derive
a language.
2.1 Languages
2.2 Deriving a Language with rewriting rules: Examples
2.3 Definitions: Formal languages and grammars
    Type 0 (phrase structured) grammars
    Type 1 (context-sensitive) grammars
    Type 2 (context-free) grammars
    Type 3 (regular) grammars
2.4 Grammars and their languages: Examples
Rumination
Exercises
Languages and Grammars
2.1 Languages
Webster’s dictionary defines the word language as follows.
(1) The words, their pronunciation, and the methods of combining them used
and understood by a community.

(2) A system of signs and symbols and rules for using them that is used to
carry information.
In the above definitions we see two common factors: (a) language elements such
as words, their pronunciation, signs and symbols, and (b) methods of combining (or
rules for using) them. In natural languages the elements and the rules of using them
are evolved, not defined. Each sentence is constructed with some of the language
elements combined according to the rules of the language. Since the language
elements and the rules for using them give rise to all the sentences, we can simply
define a natural language as a set of sentences.
In formal languages we usually deal with strings of symbols (not pronunciation)
and the rules for constructing the strings, which correspond to the sentences of a
natural language. Hence, in formal languages we simply define the language as
follows.
Definition: A language is a set of strings over a finite alphabet.
According to the above definition, the following sets are languages. Notice that
set A is finite and others are infinite. Set D is the C++ programming language.
A = { aaba, bba, aaabbb }
B = { xx^R | x ∈ {0, 1}+ } = { 00, 11, 0110, 1001, 1111, . . . }
C = { a^n b^n c^n | n ≥ 0 } = { ε, abc, aabbcc, aaabbbccc, . . . }
D = {x | x is a C++ program }
Break Time: Love
- Never pretend to a love which you do not actually feel, for love is not ours to command. - Alan Watts
- We can only learn to love by loving. - Iris Murdoch
- Love is the triumph of imagination over intelligence. - H. L. Mencken
- There is no remedy for love but to love more. - H. D. Thoreau
2.2 Deriving a language with rules
The following example shows a set of rewriting rules named spring-grammar.
spring-grammar:
<spring-sentence> → <spring-subject> <spring-verb>
<spring-subject> → birds | flowers | buds
<spring-verb> → sing | bloom | fly
A rewriting rule (or simply rule) has two sides, a left side and a right side, separated
by an arrow. When a rule has more than one alternative on its right side, we separate
the alternatives with a vertical bar (|). In the above example, every rule has one
element on its left side. There can be more, as we shall see later. We will refer to
each rule by the element on its left side.
In the example, the elements delimited by a pair of angled brackets < . . . > are only
used for generating a “spring sentence” consisting of “spring subjects,” birds, flowers
and buds, followed by “spring verbs,” sing, bloom and fly.
Let’s see how we can derive a “spring sentence” with the above set of rules. We
begin with a designated element, in this example <spring-sentence>, and rewrite it
with one of the right side elements of its rule. The choice is arbitrary, if there is
more than one element. In the example we have no choice except <spring-subject>
<spring-verb>. So, we get the following.
<spring-sentence> ⇒ <spring-subject> <spring-verb>
We call the result of a rewriting step a sentential form. From the current sentential
form, we pick an element that appears on the left side of a rule and rewrite it with one
of its right side elements. We repeat this rewriting step until we get a sentential form
to which no rule is applicable, as shown below.
<spring-sentence> ⇒ <spring-subject> <spring-verb>
⇒ birds <spring-verb> ⇒ birds sing
This last sentential form (“birds sing” in the example) is a member (string or
sentence) of the language generated by the set of rules.
In the above derivation if we had rewritten <spring-subject> with flowers instead of
birds, we would get a spring sentence “flowers sing.” It follows that the set of rules in
the spring-grammar can derive all the 9 sentences that can be formed with the three
subjects and three verbs. Denoting this set of sentences (i.e., the language) by
L(spring-grammar), we get
L(spring-grammar) = { birds sing, birds bloom, birds fly, flowers sing,
flowers bloom, flowers fly, buds sing, buds bloom, buds fly }
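The enumeration above can be reproduced mechanically. The following is a minimal sketch in Python, assuming the grammar is encoded as a dictionary from each bracketed element to its list of alternatives; the names SPRING_GRAMMAR and expand are illustrative and not part of the text. It only works for grammars, like this one, in which no element can be rewritten into itself.

```python
from itertools import product

# Hypothetical encoding of the spring-grammar: each bracketed element maps to
# the list of its right side alternatives (each alternative is a list of elements).
SPRING_GRAMMAR = {
    "<spring-sentence>": [["<spring-subject>", "<spring-verb>"]],
    "<spring-subject>": [["birds"], ["flowers"], ["buds"]],
    "<spring-verb>": [["sing"], ["bloom"], ["fly"]],
}

def expand(element, grammar):
    """Return every word sequence derivable from element (non-recursive grammars only)."""
    if element not in grammar:
        # A word of the language: no rule applies, so it stands for itself.
        return [[element]]
    results = []
    for alternative in grammar[element]:
        # Expand each element of the alternative, then combine all the choices.
        parts = [expand(e, grammar) for e in alternative]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results

sentences = [" ".join(words) for words in expand("<spring-sentence>", SPRING_GRAMMAR)]
print(len(sentences))   # 9
print(sentences[:3])    # ['birds sing', 'birds bloom', 'birds fly']
```

Each of the three subjects is paired with each of the three verbs, which is where the count of 9 comes from.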
If we are not concerned with the meaning of the sentences (i.e., strings), we can
simplify the grammar by substituting a single symbol for each of the elements as
shown below. (Studying formal languages, we are mainly concerned with the
structural complexity of the languages, not the meaning.)
In this simplification, we used lower case letters for the words that appear in the
language, and upper case letters for the others. In particular, we used S to denote the
starting element for the language derivation. (Notice that in the example, # denotes
the blank between <spring-subject> and <spring-verb>.)
spring-grammar:                                        G:
<spring-sentence> → <spring-subject> <spring-verb>     S → A#B
<spring-subject> → birds | flowers | buds              A → a | b | c
<spring-verb> → sing | bloom | fly                     B → d | e | f
L(G) = { a#d, a#e, a#f, b#d, b#e, b#f, c#d, c#e, c#f }
Conventionally, we use lower case letters and special characters to denote a word in
the language, and upper case letters only for the elements used for language
generation. We call those lower case letters terminal symbols, and the upper case
letters nonterminal symbols. In our example (repeated below), a through f and # are
terminal symbols, and S, A and B are nonterminal symbols. We call the set of
rewriting rules a grammar of the language.
G: S → A#B    A → a | b | c    B → d | e | f
L(G) = { a#d, a#e, a#f, b#d, b#e, b#f, c#d, c#e, c#f }
Break Time: Life
- Life is a long lesson in humility. - J. M. Barrie
- Life is far too important a thing ever to talk seriously about. - Oscar Wilde
- Life is like playing a violin in public and learning the instrument as one goes on. - Samuel Butler
- Life is a foreign language; all men mispronounce it. - Christopher Morley
The size of the language of the above grammar is finite. The following grammar
shows how it is possible to generate an infinite language.
Example 1. G1: S → dingdongA    A → dengS | deng
S ⇒ dingdongA                    (by S → dingdongA)
  ⇒ dingdongdengS                (by A → dengS)
  ⇒ dingdongdengdingdongA        (by S → dingdongA)
  ⇒ dingdongdengdingdongdeng     (by A → deng)
In the above derivation, rule A → dengS allows us to derive a string containing
“dingdongdeng” repeated an arbitrary number of times. To terminate the derivation
we must finally apply rule A → deng.
We can see that the language L(G1) is
L(G1) = { (dingdongdeng)^i | i ≥ 1 }
Here are more examples.
Example 2. G2: S → aaS | aa
S ⇒ aa                        (by S → aa)
S ⇒ aaS ⇒ aaaa                (by S → aaS, S → aa)
S ⇒ aaS ⇒ aaaaS ⇒ aaaaaa      (by S → aaS, S → aaS, S → aa)
. . .
L(G2) = { a^{2i} | i > 0 }
Example 3. G3: S → aSb | ab
S ⇒ ab                            (by S → ab)
S ⇒ aSb ⇒ aabb                    (by S → aSb, S → ab)
S ⇒ aSb ⇒ aaSbb ⇒ aaabbb          (by S → aSb, S → aSb, S → ab)
. . . ⇒ aaaabbbb
. . . ⇒ aaaaabbbbb
. . .
L(G3) = { a^i b^i | i ≥ 1 }
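The derivations in Examples 1 to 3 can be explored by a short program. The sketch below assumes that a grammar is encoded as a list of (left, right) string pairs in which upper case letters are nonterminal symbols; the function name derive and the bound max_len are illustrative choices. It searches the sentential forms breadth first and collects the terminal strings it reaches. Cutting the search off at max_len symbols is safe here only because no rule of G1, G2 or G3 shortens a sentential form.

```python
from collections import deque

def derive(rules, start="S", max_len=12):
    """Return the terminal strings of length <= max_len derivable from start.

    rules   -- list of (left, right) string pairs; upper case letters are nonterminals
    max_len -- sentential forms longer than this are discarded (assumes that
               no rule makes a sentential form shorter)
    """
    seen = {start}
    queue = deque([start])
    language = set()
    while queue:
        form = queue.popleft()
        if not any(ch.isupper() for ch in form):
            language.add(form)      # no nonterminal left: a member of the language
            continue
        for left, right in rules:
            # Rewrite every occurrence of the left side, one occurrence at a time.
            pos = form.find(left)
            while pos != -1:
                new = form[:pos] + right + form[pos + len(left):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(left, pos + 1)
    return language

G1 = [("S", "dingdongA"), ("A", "dengS"), ("A", "deng")]
G2 = [("S", "aaS"), ("S", "aa")]
G3 = [("S", "aSb"), ("S", "ab")]

print(sorted(derive(G2, max_len=8), key=len))  # ['aa', 'aaaa', 'aaaaaa', 'aaaaaaaa']
print(sorted(derive(G3, max_len=8), key=len))  # ['ab', 'aabb', 'aaabbb', 'aaaabbbb']
```

A bounded search like this can only suggest what the language is; it does not prove it.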
The following grammar looks challenging. The number of rules and their
complexity, with more than one symbol on the left side, are overwhelming. However,
once you understand how it derives a couple of strings, you will like it. Notice that the
grammar has only one terminal symbol a.
Example 4. G4: S → ACaB | a    Ca → aaC    CB → DB | E
               aD → Da    AD → AC    aE → Ea    AE → ε
Using rule S → a, the grammar generates string a, which is a member of the language.
Let’s see what other members it can generate starting with rule S → ACaB.
For convenience, the grammar is repeated below with an id number assigned to each rule.
(1) S → ACaB    (2) S → a     (3) Ca → aaC    (4) CB → DB    (5) CB → E
(6) aD → Da     (7) AD → AC   (8) aE → Ea     (9) AE → ε
After applying rule (1), the only rule applicable next is rule (3). Notice that by rule
(3), C keeps jumping over a to the right, adding one a each time, until it meets B,
where we can apply either rule (4) or rule (5). If we apply rule (4), then by rule (6)
D moves left until it meets A and is converted to C by rule (7), as follows. We will
shortly see what happens if we apply rule (5).
S ⇒(1) ACaB ⇒(3) AaaCB ⇒(4) AaaDB ⇒(6) AaDaB ⇒(6) ADaaB ⇒(7) ACaaB
By applying rule (3) repeatedly until C meets B again, we can double the number of a's.
ACa . . . aB ⇒(3) AaaCa . . . aB ⇒(3) . . . ⇒(3) Aaa . . . aaCB
If we want to derive a terminal string, i.e., a member of the language, we should
apply rule (5) (i.e., CB → E) and then, by applying rule (8) repeatedly, bring E
toward the left end until it meets A. Finally, applying rule (9), we get a string of
the terminal symbol a.
Aaa . . . aaCB ⇒(5) Aaa . . . aaE ⇒(8) Aaa . . . aEa ⇒(8) . . . ⇒(8) AEaa . . . aa ⇒(9) aa . . . aa
If instead we want to double the number of a's again in the sentential form
Aaa . . . aaCB, we apply rule (4), bring D toward the left end using rule (6) until it
meets A, and then change D to C by rule (7) as follows.
Aaa . . . aaCB ⇒(4) Aaa . . . aaDB ⇒(6) Aaa . . . aDaB ⇒(6) . . . ⇒(6) ADaa . . . aaB ⇒(7) ACaa . . . aaB
Now, by rule (3) again, we are ready to double the number of a's in the final
sentential form above.
Now we are ready to figure out the language of the grammar. By rule (2) (S → a)
the string a can be derived. By applying rules (1), (3), (5), (8), (8) and (9) in this
order, we can derive the string aa as follows.
S ⇒(1) ACaB ⇒(3) AaaCB ⇒(5) AaaE ⇒(8) AaEa ⇒(8) AEaa ⇒(9) aa
If, instead of rule (5) in the above derivation, we apply rule (4) followed by rules
(6), (6) and (7), we get the sentential form ACaaB. With this sentential form we are
ready to derive the string aaaa.
Now we claim that the language of the grammar G4 is the set of strings of a's whose
length is a power of 2, i.e.,
L(G4) = { a^m | m = 2^n, n ≥ 0 }
We can prove this claim by induction. We have just shown that the basis part is
true. We leave the formal proof for the reader.
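The strategy described above for G4 can be traced step by step. The following sketch assumes sentential forms are plain Python strings and always rewrites the leftmost occurrence of a rule's left side; the helper names apply, apply_while and derive_g4 are illustrative. It follows one particular order of rule applications, so it illustrates the claim rather than proving it.

```python
def apply(form, left, right):
    """Rewrite the leftmost occurrence of left in form; fail if the rule does not apply."""
    if left not in form:
        raise ValueError(f"rule {left} -> {right} is not applicable to {form}")
    return form.replace(left, right, 1)

def apply_while(form, left, right, trace):
    """Apply one rule repeatedly until it is no longer applicable."""
    while left in form:
        form = apply(form, left, right)
        trace.append(form)
    return form

def derive_g4(passes):
    """Return the derivation of a^(2^passes) in G4 as a list of sentential forms (passes >= 1)."""
    form = apply("S", "S", "ACaB")                          # rule (1)
    trace = ["S", form]
    for i in range(passes):
        form = apply_while(form, "Ca", "aaC", trace)        # rule (3): C doubles the a's
        if i < passes - 1:
            form = apply(form, "CB", "DB")                  # rule (4)
            trace.append(form)
            form = apply_while(form, "aD", "Da", trace)     # rule (6): D moves left
            form = apply(form, "AD", "AC")                  # rule (7)
            trace.append(form)
    form = apply(form, "CB", "E")                           # rule (5)
    trace.append(form)
    form = apply_while(form, "aE", "Ea", trace)             # rule (8): E moves left
    form = apply(form, "AE", "")                            # rule (9)
    trace.append(form)
    return trace

print(" => ".join(derive_g4(1)))   # S => ACaB => AaaCB => AaaE => AaEa => AEaa => aa
print(len(derive_g4(3)[-1]))       # 8
```

Each additional pass of C through the string doubles the number of a's once more, which matches the powers of 2 in the claim.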
2.3 Definitions: Formal Languages and Grammars
To define a grammar we need the following information.

(1) The language alphabet, i.e., the set of terminal symbols, which appear in the
language. Conventionally, we use the lower case English alphabet together with
special characters if needed.
(2) The set of nonterminal symbols (the upper case letters by convention) used to
construct the rules to derive the language.
(3) The set of rules.
(4) The start symbol (S by convention).
In the previous examples we saw various kinds of rules. The rules in the first three
examples have only one symbol on their left side. In general there can be a string of
finite length. We will later see that such variations may affect the characteristics of
the language generated by the grammar.
In this book we will study the following four types of grammars that have been
extensively investigated.
Definition
Type 0 (phrase structured) grammar: a type 0 grammar G is the following quadruple,
G = < V_T, V_N, P, S >, where
• V_T: the terminal alphabet (called morphemes by linguists). Following the
convention, we will use lower case letters.
• V_N: the nonterminal alphabet (also called variables or syntactic
categories). Again, following the convention, we will use upper case
letters.
• V = V_T ∪ V_N, i.e., the set of symbols used in the grammar. This set is
called the total alphabet.
• S ∈ V_N: the start symbol.
• P: a finite set of rules, each denoted by α → β and read “α generates
β,” where α ∈ V* V_N V* and β ∈ V*.
Notice that in a rule α → β, the left side must be a string α ∈ V* V_N V*
and the right side must be a string β ∈ V*. More specifically, α is a string in
V* with at least one symbol in V_N (i.e., a nonterminal symbol), and β can
be any string in V*, including the null string ε.
For two strings w1 and w2, by w1 ⇒ w2 we mean that w1 derives w2 by
applying a rule once. We write w1 ⇒* w2 if w1 derives w2 by applying rules an
arbitrary number of times (i.e., zero or more). Any string w ∈ V* that can be
derived from the start symbol (i.e., S ⇒* w) is called a sentential form.
The language of a grammar G, denoted by L(G), is the set of all strings of
terminal symbols that can be derived from S by the rules of G. That is,
L(G) = { x | x ∈ (V_T)* and S ⇒* x }
All the grammars in Examples 1 – 4 are type 0. The languages of type 0
grammars are also called recursively enumerable (R.E.) languages.
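The relation ⇒ can be made concrete with a few lines of code. The sketch below assumes rules are (left, right) string pairs with the null string written as an empty string; the names one_step and is_derivation are illustrative. The function one_step returns every string obtainable from a sentential form by one rule application, and is_derivation checks a proposed derivation step by step.

```python
def one_step(form, rules):
    """Return the set of all strings w2 such that form => w2 by one rule application."""
    successors = set()
    for left, right in rules:
        pos = form.find(left)
        while pos != -1:
            # Replace this single occurrence of the left side by the right side.
            successors.add(form[:pos] + right + form[pos + len(left):])
            pos = form.find(left, pos + 1)
    return successors

def is_derivation(forms, rules):
    """Check that every string in forms derives the next one in a single step."""
    return all(b in one_step(a, rules) for a, b in zip(forms, forms[1:]))

# Grammar G4 of Example 4, with "" standing for the null string.
G4 = [("S", "ACaB"), ("S", "a"), ("Ca", "aaC"), ("CB", "DB"), ("CB", "E"),
      ("aD", "Da"), ("AD", "AC"), ("aE", "Ea"), ("AE", "")]

steps = ["S", "ACaB", "AaaCB", "AaaE", "AaEa", "AEaa", "aa"]
print(is_derivation(steps, G4))    # True, so S =>* aa
print(one_step("ACaB", G4))        # {'AaaCB'}
```

The second output agrees with the earlier observation that rule (3) is the only rule applicable to ACaB.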
The following three types of grammars, type 1, 2 and 3, are defined by placing
additional restrictions on the rule form of type 0. Since those restrictions do not
violate the type 0 form, all these restricted grammar types are also type 0.
Type 1 (context-sensitive) grammars: Grammars of this type have rules α → β,
where α ∈ V* V_N V* and β ∈ V*, with the restriction that |α| ≤ |β|, i.e., the left
side of a rule cannot be longer than its right side. The only exception is the rule
S → ε, which is allowed under the condition that no rule in the grammar has S on its
right side. (Notice that the compositions of α and β are the same as those in type 0
rules.)
Type 2 (context-free) grammars: These grammars have rules α → β, where
α ∈ V* V_N V* and β ∈ V*, with the restriction that |α| = 1. Since α must have at
least one nonterminal symbol, this restriction implies that the left side must be a
single nonterminal symbol (i.e., an upper case letter).
Type 3 (regular) grammars: These grammars have type 2 grammar rules with
the following restrictions on their right side:
• The right side of each rule can have no more than one nonterminal symbol.
• The nonterminal symbol on the right side of a rule, if any, must be at the
end of the string. That is, in rule α → β it is required that β = xA or β =
x, where A ∈ V_N and x ∈ (V_T)*.
For example, A → abB, B → aa | D and D → ε are all legal type 3 grammar
rules. However, rules A → aBa and B → aaAD cannot be in a type 3
grammar, because in the former the nonterminal symbol B is not at the end of
the right side string, and in the latter there is more than one nonterminal symbol.
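The four definitions can be turned into a small checker. The sketch below assumes the usual convention (upper case letters are nonterminal symbols, everything else is a terminal symbol, and the null string is written as an empty string); the function name grammar_types is illustrative. It returns every type number whose restrictions all rules of the grammar satisfy, following the definitions as stated in this section.

```python
def grammar_types(rules, start="S"):
    """Return the set of type numbers (0 to 3) whose rule restrictions the grammar satisfies.

    rules -- list of (left, right) strings; upper case letters are nonterminals,
             and "" stands for the null string
    """
    def is_nonterminal(ch):
        return ch.isupper()

    # Type 0: every left side contains at least one nonterminal symbol.
    if not all(any(is_nonterminal(ch) for ch in left) for left, _ in rules):
        return set()
    types = {0}

    # Type 1: no rule shrinks, except S -> "" when S is on no right side.
    start_on_right = any(start in right for _, right in rules)
    if all(len(left) <= len(right)
           or (left == start and right == "" and not start_on_right)
           for left, right in rules):
        types.add(1)

    # Type 2: every left side is a single (nonterminal) symbol.
    if all(len(left) == 1 for left, _ in rules):
        types.add(2)
        # Type 3: no nonterminal on the right side except possibly the last symbol.
        if all(not any(is_nonterminal(ch) for ch in right[:-1]) for _, right in rules):
            types.add(3)
    return types

legal = [("A", "abB"), ("B", "aa"), ("B", "D"), ("D", "")]
print(grammar_types(legal, start="A"))            # {0, 2, 3}
print(grammar_types([("A", "aBa")], start="A"))   # {0, 1, 2}
```

The first grammar is not reported as type 1 because of the contracting rule D → ε, which shows that a regular grammar, as defined here, need not satisfy the type 1 restriction.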
2.4 Grammars and their language: Examples
Here is a typical example for each of the four types of grammars and their
languages that we will often use for further discussions in the following chapters.
Type 0 (phrase structured): G = < {a}, {S,A,B,C,D,E}, P, S >

P = { S → ACaB | a Ca → aaC CB → DB | E
aD → Da AD → AC aE → Ea AE → ε }
L(G) = { a^m | m = 2^n, n ≥ 0 }
The above grammar is the one that we examined in Section 2.2 (Example 4). If you
understand how this grammar generates its language, it will not be hard to see how the
following type 1 grammar generates its language shown below.
Type 1 (context-sensitive): G = < {a,b,c}, {S,B,C}, P, S >

P = { S → aSBC | aBC CB → BC bB → bb
aB → ab bC → bc cC → cc }
L(G) = { a^i b^i c^i | i ≥ 1 }
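One way to see how the type 1 grammar above works is to follow a fixed strategy: first generate all the symbols, then sort the B's in front of the C's with CB → BC, and finally convert B's and C's to terminals from left to right. The sketch below traces that strategy; the function name derive_abc and the order of rule applications are choices made for this illustration, and other orders are possible.

```python
def derive_abc(i):
    """Return a derivation of a^i b^i c^i (i >= 1) as a list of sentential forms."""
    form = "S"
    trace = [form]

    def rewrite(left, right):
        # Apply the rule left -> right to the leftmost occurrence of left.
        nonlocal form
        if left not in form:
            raise ValueError(f"rule {left} -> {right} is not applicable to {form}")
        form = form.replace(left, right, 1)
        trace.append(form)

    for _ in range(i - 1):
        rewrite("S", "aSBC")        # generate one a, one B and one C per step
    rewrite("S", "aBC")
    while "CB" in form:
        rewrite("CB", "BC")         # move every B in front of every C
    rewrite("aB", "ab")             # the first B becomes b next to the last a
    while "bB" in form:
        rewrite("bB", "bb")
    rewrite("bC", "bc")             # the first C becomes c next to the last b
    while "cC" in form:
        rewrite("cC", "cc")
    return trace

print(" => ".join(derive_abc(2)))
# S => aSBC => aaBCBC => aaBBCC => aabBCC => aabbCC => aabbcC => aabbcc
```

Notice that no sentential form in the trace is shorter than the one before it, as expected for a grammar with non-contracting rules.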
Type 2 (context-free): G = < {0,1}, {S,A,B}, P, S >
P = { S → ASB | ε    A → 0    B → 1 }
L(G) = { 0^i 1^i | i ≥ 0 }
Type 3 (regular): G = < {0,1}, {S,A}, P, S >
P = { S → 0S | A    A → 1A | ε }
L(G) = { 0^i 1^j | i, j ≥ 0 }
From now on, to present a grammar we shall only show the set of rewriting (or
production) rules written according to the convention, i.e., lower case letters for
terminal symbols, upper case letters for nonterminal symbols, and S for the start
symbol.
Rumination (1): Grammars and production rules
The following remarks summarize subtle conceptual aspects concerning formal grammars and their languages that we
have defined. Let G = < V_T, V_N, P, S > be a grammar.
(a) Recall that the language L(G) is the set of terminal strings that can be generated by applying a finite sequence of
production rules. The grammar imposes no order on its rules when a string is derived. However, depending on the order in
which the rules are applied, we may end up with a string containing a nonterminal symbol from which no terminal (or
null) string can be derived. For example, consider the following type 1 grammar with four rules.
(1) S → ABC (2) AB → ab (3) BC → bc (4) bC → bc
Clearly, only rules (1), (2) and (4), applied in this order, derive the terminal string abc, which is the only member of the
language of the grammar. If you apply rule (1) followed by rule (3), you will be stuck with Abc, as shown below, which
cannot be a member of the language because it contains a nonterminal symbol.
S ⇒(1) ABC ⇒(3) Abc ⇒ ??
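The dead end in remark (a) can be found by exhaustively exploring the grammar. The sketch below assumes the grammar has only finitely many sentential forms, which is true for this four-rule example but not in general; the names one_step and explore are illustrative. It separates the forms to which no rule applies into terminal strings and stuck forms.

```python
def one_step(form, rules):
    """Return all strings obtainable from form by applying one rule once."""
    result = set()
    for left, right in rules:
        pos = form.find(left)
        while pos != -1:
            result.add(form[:pos] + right + form[pos + len(left):])
            pos = form.find(left, pos + 1)
    return result

def explore(start, rules):
    """Visit every sentential form reachable from start (assumes there are finitely many).

    Returns (terminal strings, stuck forms), where a stuck form still contains
    a nonterminal symbol (an upper case letter) but no rule applies to it.
    """
    seen, stack = {start}, [start]
    terminal, stuck = set(), set()
    while stack:
        form = stack.pop()
        successors = one_step(form, rules)
        if not successors:
            # No rule applies: either a member of the language or a dead end.
            if any(ch.isupper() for ch in form):
                stuck.add(form)
            else:
                terminal.add(form)
        for nxt in successors - seen:
            seen.add(nxt)
            stack.append(nxt)
    return terminal, stuck

RULES = [("S", "ABC"), ("AB", "ab"), ("BC", "bc"), ("bC", "bc")]
print(explore("S", RULES))   # ({'abc'}, {'Abc'})
```

The only terminal string found is abc, and the only dead end is Abc, in agreement with the discussion above.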
(b) Rule (3) of the grammar above is useless in the sense that it does not contribute to the generation of the language. We
can delete this rule from the grammar without affecting its language. In general, the decision problem of whether a rule in
an arbitrary grammar is useless or not is unsolvable. However, if we restrict the problem to the class of context-free
grammars (type 2), we can effectively clean up useless rules, if any. We will learn such an algorithm in Chapter 9.
(c) In this chapter we defined a type 0 (or phrase structured) grammar as a grammar having production rules of the form
α → β, where α ∈ V* V_N V* and β ∈ V*. The left side of each rule must have at least one nonterminal symbol. In other
texts we see some variations of this definition, in particular on the left side of the rules. For example, unrestricted
grammars are defined as having rules with α ∈ V+, and type 0 grammars as having rules with α ∈ (V_N)+. These variations
are equivalent in the sense that they generate the same class of languages as the phrase structured grammars. In this text
we will use the generic names and type numbers to refer to the grammars and their languages defined in this chapter.
(d) The grammars that we have defined are sequential in the sense that only one rule is allowed to apply at a time. Notice that
with the grammar below, if we apply rules (2) AB → ab and (3) BC → bc simultaneously on the string ABC, which is derived
with rule (1), we will get the terminal string abbc, which is not a member of the language according to our definition. There is a
class of grammars where more than one rule can be applied simultaneously. We call such rules parallel rewriting rules. (In
Chapter 3, we will see a class of grammars using parallel rewriting rules.) In general it is not easy to figure out the language of
a parallel rewriting grammar.
(1) S → ABC (2) AB → ab (3) BC → bc (4) bC → bc
(e) For context-free grammars we get the same language independent of the mode of rule application, either sequential or
parallel. Why? The answer is left for the reader.
(f) A context-sensitive grammar is defined as a phrase structured grammar with non-contracting rules, except for S → ε under
the condition that no rule has S on its right side. On account of the non-contracting rules, the sentential forms that we get along
a derivation never shrink, which is a typical property of context-sensitive grammars. However, we need the exceptional case, S
→ ε , which is contracting, to include the null string in the language. Including this rule in the grammar, we need the condition
of not allowing S on the right side of any rule because, otherwise (e.g., S → SSS | ε), the non-contracting property of
derivations will be violated. Sometimes in a text we see context-sensitive grammars defined without this exceptional case, thus,
excluding ε from the language.
(g) A context-free grammar is defined as a type 0 (not type 1) grammar with the restriction |α| = 1. It follows that a
context-free grammar can have a contracting rule, like A → ε (called an ε-production rule), while type 1 grammars are not
allowed to have such rules except for S → ε. We will learn in Chapter 10 that all ε-production rules in a context-free
grammar, if any, can be eliminated, leaving S → ε only when the null string is in the language.
(h) A context-free grammar is called linear if each rule is restricted to the following form.
A → xBy or A → x, where A, B ∈ V_N and x, y ∈ (V_T)*
Recall that each rule of a regular grammar is restricted to the form A → xB or A → x. Such grammars are also called right
linear. By this definition, none of the following rules can be a regular grammar rule.
A → bBC A → abBa A → Ba
(i) Following the same notion, a left linear grammar can be defined as a context-free grammar having its rules restricted to the form
of A → Bx or A → x. We will show in Chapter 9 that left linear grammars and right linear grammars are equivalent in the sense
that they both generate the same class of regular languages.
(j) Notice that a regular grammar cannot have both right linear and left linear rules. For example, the following linear grammar,
which mixes both kinds of rules, generates the language L = { a^i b^i | i ≥ 1 }. In Chapter 12 we will show that no regular
grammar can generate L.
S → aB B → Sb | b
(k) Formal grammars were defined in terms of the quadruple G = < V_T, V_N, P, S >, which shows all the information that
is required to specify a grammar. This is a very general way of defining a system, not only in computer science but also in other
fields. We commonly define a system by first showing the list of the system’s entities that are required for the definition, and
then specifying each entity. We will follow the same approach when we define automata in Chapter 4.
Defining a system S:
(1) S = < E1, E2, . . ., Ek > // a list of system entities
(2) Specify each of entities E1, E2, . . ., Ek.
Exercises
2.1 For each of the grammars G1 – G4 below, answer the following questions.
(a) Which type is the grammar (phrase structured, context-sensitive, context-free, or regular)? (Recall that a grammar
can be of more than one type.)
(b) What is the language?
G1: S → aS | abcB B → bbB | ε
G2: S → ABC | ε AB → ab bC → bc BC → cd Ac → bc
G3: S → AAA A →a | ε
G4: S → ABC AB → a aC → c BC → b
2.2 Using the set property notation, show the language of each of the following grammars. (Review Example 4 in Section 2.2
before you figure out the answers for parts (g) and (h).)
(a) S → aS | bS | ε (b) S → aS | Saa | ε

(c) S → aSa | bSb | aa | bb (d) S → aA | bB A → Sa |a B → Sb | b
(e) S → aA | ε A → Sb (f) S → aSb | A A → aA | a
(g) S → DG G → bAGc | bEc bA → Ab bE → Eb DA → aaD DE → aa
(h) S → DG G → bbAGcc | bbEcc bA → Ab bE → Eb DA → aaD DE → aa
2.3 For each of the following languages, show a regular (type 3) grammar that generates it.
(a) L1 = {a}* (b) L2 = {ab, abb, abbb} (c) L3 = { axb | x ∈ {0, 1}* }
(d) L4 = L2 ∪ L3 (e) L4 = L2L3 (f) L4 = { xy | x ∈{0, 1}+, y ∈{a, b}+}
(g) L3 = { x | x is a decimal number. i.e., x∈{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}+ }
(h) L5 = { x | x ∈ {a, b}* and x has at least one a. }
2.4 For each of the following languages, construct a context-free (type 2) grammar that generates the language and briefly
explain how your grammar generates it.
(a) L1 = { a^i b^j | i > j > 0 } (b) L2 = { a^i b^j | i ≥ j > 0 } (c) L3 = { a^i b^j | i ≥ j ≥ 0 }
(d) L4 = { a^i b^j | j > i > 0 } (e) L5 = { a^i b^j | i, j > 0 and i ≠ j } (f) L6 = { xx^R | x ∈ {a, b}* }
2.5 For each of the following languages, construct a grammar (of any type) that generates the language, and briefly explain how
your grammar generates it. (Hint: try to apply the idea of the type 1 grammar given as an example in Section 2.4.)
(a) L1 = { a^{2n} b^n c^{2n} d^m | n, m ≥ 1 } (b) L2 = { a^n b^n c^n d^n | n ≥ 1 }