
**Chapter 3. Other Models for Language Representation**

In Chapter 2 we learned how to define a language in terms of a grammar, and saw several examples. Numerous other models for language representation have been introduced in the literature, including graph grammars, picture language grammars, Lindenmayer systems (L-systems), syntax flow graphs, and regular expressions. This chapter briefly introduces L-systems, syntax flow graphs and regular expressions, and points to their applications.

3.1 L-systems
    Definitions and examples
    Application
3.2 Syntax flow graph
    Definition and examples
3.3 Regular Expressions
    Definition and examples
    Algebraic properties of regular expressions
Rumination
Exercises


Other Language Models

3.1 L-systems

An L-system is a grammar model introduced by Lindenmayer to describe biological development, such as the growth of plants and cells. L-systems differ from the grammars that we studied in the last chapter. In an L-system, to derive the next string from the current string, all applicable rules are applied simultaneously (i.e., in parallel), and every string that can be derived belongs to the language of the system. There are several variations of L-systems. Zero-sided L-systems correspond to context-free grammars in the sense that the production rules are not context dependent, i.e., there is only one symbol on the left side of each rule. Depending on whether the rules are sensitive to the context on the left, on the right, or on both sides, L-systems are called, respectively, left-sided, right-sided and two-sided L-systems. Because the rewriting rules are applied in parallel, it is in general hard to identify the language of an L-system. However, because the languages generated by L-systems have unique characteristics that can be used in computer imagery, L-systems have been widely used as a graphical tool in computer science. The following book contains a rich collection of various L-systems and their applications.

H. Ehrig, M. Nagl, G. Rozenberg, and A. Rosenfeld (Eds.), “Graph Grammars and Their Application to Computer Science,” Lecture Notes in Computer Science #291, Springer-Verlag, 1986.


In an L-system there is no difference between terminal symbols and nonterminal symbols. Every string that can be derived by applying the rules belongs to the language.

Definition: An L-system G is a triple G = (Σ, h, ω), where Σ is a finite alphabet, h is a set of rewriting rules, and ω, called the axiom, is the start string.

Conventionally the rules are expressed as a function h. For example, h(α) = β and h(α) = {β, γ}, where α, β, γ ∈ Σ*, imply, respectively, the rules α → β and α → β | γ in the formal grammar notation. Let h^i(ω) denote a string derivable by applying the rules in h i times, i.e.,

h^0(ω) = ω, h^1(ω) = h(ω), h^2(ω) = h(h^1(ω)), ..., h^i(ω) = h(h^(i-1)(ω)).

The language of an L-system G = (Σ, h, ω) is defined as follows.

L(G) = { h^i(ω) | i ≥ 0 }
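The definition of h^i(ω) can be tried out with a few lines of Python. This is only a minimal sketch for the deterministic, zero-sided case: the function names `rewrite` and `derive` and the sample rules h(a) = ab, h(b) = a are illustrative assumptions, not part of the chapter.

```python
def rewrite(s, h):
    """Apply the rules in h to every symbol of s simultaneously (one parallel step).

    h maps a single symbol to its replacement string (a deterministic 0L-system);
    symbols without a rule are copied unchanged (an assumption of this sketch).
    """
    return "".join(h.get(ch, ch) for ch in s)


def derive(axiom, h, steps):
    """Return the list [h^0(axiom), h^1(axiom), ..., h^steps(axiom)]."""
    strings = [axiom]
    for _ in range(steps):
        strings.append(rewrite(strings[-1], h))
    return strings


# Hypothetical rules chosen only for illustration: h(a) = ab, h(b) = a.
h = {"a": "ab", "b": "a"}
for i, s in enumerate(derive("a", h, 5)):
    print(f"h^{i}(a) = {s}")
```

Every string in the returned list belongs to L(G); the whole language is obtained by letting the number of steps grow without bound.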


Below are two simple L-systems. Notice that in both examples, there is only one symbol at the left side of the rules. That is, the rules are context-free. Such L-systems are called 0L (zero-sided Lindenmayer) systems. We can also define L-systems with rules whose left side has a string of length greater than 1.

Example 1. G1 = ({a}, h, a^2), h(a) = {a, a^2}. L(G1) = {a^n | n ≥ 2}.

Example 2. G2 = ({a, b}, h, ab), h(a) = aa, h(b) = ab. L(G2) = {a^m b | m = 2^n - 1, n ≥ 1}.
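The two examples can be checked experimentally for a few derivation steps. The sketch below assumes set-valued rules (each symbol maps to a set of possible replacements) and enumerates every combination of choices in one parallel step; the names `step` and `language_prefix` are hypothetical helpers.

```python
from itertools import product


def step(strings, h):
    """One parallel rewriting step; h maps each symbol to a set of replacements."""
    result = set()
    for s in strings:
        # Every symbol is rewritten at once; try each combination of choices.
        for choice in product(*(sorted(h[ch]) for ch in s)):
            result.add("".join(choice))
    return result


def language_prefix(axiom, h, steps):
    """Collect all strings derivable from the axiom in at most `steps` steps."""
    current, seen = {axiom}, {axiom}
    for _ in range(steps):
        current = step(current, h)
        seen |= current
    return seen


# Example 1: axiom a^2, h(a) = {a, a^2}.
g1 = language_prefix("aa", {"a": {"a", "aa"}}, 3)
print(sorted(len(s) for s in g1))

# Example 2: axiom ab, h(a) = aa, h(b) = ab.
g2 = language_prefix("ab", {"a": {"aa"}, "b": {"ab"}}, 4)
print(sorted(s.count("a") for s in g2))
```

A finite number of steps only shows a finite part of each language, so this is a sanity check of the closed forms, not a proof.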

Break Time When one door of happiness closes, another opens. But often we look so long at the closed door that we don’t see the one which has been open for us. It’s true that we don’t know what we got until we lose it, but it’s also true that we don’t know what we’ve been missing until it arrives. - Anonymous -


The following figures show how an L-system can be used to draw a plant, where the lines correspond to the branches of a tree, and the matching brackets indicate the start and the end points of a branch. Nested matching brackets indicate branching on their “main branch” corresponding to the outer brackets. When we draw a tree in a two-dimensional space with such an expression, we let the branches grow in alternate directions, to the right and then to the left, at a fixed angle, say 30° from the current direction.

[Figure: bracket expressions such as [ ] and [ [ ] [ ] ], and the branches drawn from them; panels (a), (b) and (c)]

To draw a tree we first let the L-system generate a string of symbols (each corresponding to a line segment) with matching brackets, and then translate the string into a tree using the same idea shown above. The next slide shows an example, where every digit is translated into a line segment of the same length.

Drawing a branching plant using an L-system

L-system rules (1 is the axiom):

1 → 23      4 → 25      7 → 8
2 → 2       5 → 65      8 → 9[3]
3 → 24      6 → 7       9 → 9

Generate a string:

1 ⇒ 23 ⇒ 224 ⇒ ... ⇒ 2229[229[24]9[3]8765]9[229[3]8765]9[228765]9[22765]9[2265]9[225]9[24]9[3]8765

Translate:

[Figure: the branching plant drawn from the final string, with each line segment labeled by its digit]
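The generate-then-translate procedure for the branching plant can be sketched in Python as follows. The rules are the ones listed for this example; the drawing policy (start at the origin growing upward, branches on the same parent turning right, then left, by 30°) and the names `generate` and `to_segments` are assumptions made for this sketch, since the slides do not fix these details.

```python
import math

RULES = {"1": "23", "2": "2", "3": "24", "4": "25", "5": "65",
         "6": "7", "7": "8", "8": "9[3]", "9": "9"}


def generate(axiom, rules, steps):
    """Rewrite every symbol in parallel `steps` times; brackets are copied as is."""
    s = axiom
    for _ in range(steps):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s


def to_segments(s, angle_deg=30.0, length=1.0):
    """Translate a bracketed string into line segments ((x0, y0), (x1, y1)).

    Assumed drawing policy: growth starts upward from the origin, and the
    branches opened on the same parent branch turn right, left, right, ...
    """
    x, y, heading = 0.0, 0.0, 90.0
    side = 1            # +1: next branch turns right, -1: turns left
    stack, segments = [], []
    for ch in s:
        if ch == "[":
            stack.append((x, y, heading, -side))   # resume with the opposite side
            heading -= side * angle_deg
            side = 1
        elif ch == "]":
            x, y, heading, side = stack.pop()
        else:  # every digit is one line segment of the same length
            nx = x + length * math.cos(math.radians(heading))
            ny = y + length * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
    return segments


plant = generate("1", RULES, 14)
print(plant)
print(len(to_segments(plant)), "line segments")
```

The returned segments can be passed to any plotting tool; with these rules, 14 rewriting steps from the axiom 1 yield the bracketed string of the example.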

3.2 Syntax Flow Graph


A syntax flow graph (also called a syntax diagram) is a graphical representation of context-free grammar rules. (This model is applicable only to context-free grammar rules.) This model helps us understand how the grammar rules put symbols together to construct a string. The figure below shows an example.

For each of the nonterminal symbols (S, A and B in the example below) in the grammar, we draw a directed graph under the title of that nonterminal with one entry and one exit edge as follows. For each right side of the rule corresponding to the nonterminal symbol, we draw a path from the entry to the exit with the sequence of nodes, each labeled with the symbol in the order appearing on the right side. Conventionally, terminal symbols are put in a circle and nonterminal symbols in a rectangle. The null string is represented with a line.

S → aS | A
A → bAc | B
B → d | f | ε

[Figure: syntax flow graphs for S, A and B]
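One way to see how a syntax flow graph "puts symbols together" is to store each graph as its list of entry-to-exit paths and follow the paths against an input string. The sketch below does this for the grammar S → aS | A, A → bAc | B, B → d | f | ε; the dictionary encoding and the function names are assumptions of the sketch, and the simple backtracking walk only terminates for grammars without left recursion, such as this one.

```python
# Each nonterminal maps to its list of entry-to-exit paths; a path is the
# sequence of node labels along it. The empty path [] is the plain line for ε.
GRAPH = {
    "S": [["a", "S"], ["A"]],
    "A": [["b", "A", "c"], ["B"]],
    "B": [["d"], ["f"], []],
}


def walk(symbol, text, pos):
    """Yield every position reachable by traversing the graph of `symbol`
    from entry to exit, starting at text[pos]."""
    for path in GRAPH[symbol]:
        yield from walk_path(path, 0, text, pos)


def walk_path(path, i, text, pos):
    if i == len(path):                      # reached the exit edge
        yield pos
    elif path[i] in GRAPH:                  # rectangle: nonterminal node
        for nxt in walk(path[i], text, pos):
            yield from walk_path(path, i + 1, text, nxt)
    elif text.startswith(path[i], pos):     # circle: terminal node
        yield from walk_path(path, i + 1, text, pos + 1)


def accepts(text):
    return any(end == len(text) for end in walk("S", text, 0))


for w in ["aabdc", "abc", "f", "", "abcc", "ba"]:
    print(repr(w), accepts(w))
```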


Example 1. Here is a syntax flow graph (and the rewriting rules) corresponding to a simplified form of the identifier defined in the C programming language (uppercase letters and the underscore are omitted).

[Figure: syntax flow graphs for <letter>, <digit> and <identifier>]

<letter> → a | b | … | z
<digit> → 0 | 1 | 2 | … | 9
<identifier> → <letter> | <identifier><letter> | <identifier><digit>
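The left-recursive rule for <identifier> says, in effect, "one letter, followed by any number of letters or digits", which is the loop one would see in the flow graph. A minimal Python sketch of that reading, with `is_identifier` as a hypothetical helper name and the alphabet limited to the symbols in the rules above:

```python
import string

LETTERS = set(string.ascii_lowercase)   # <letter> -> a | b | ... | z
DIGITS = set(string.digits)             # <digit>  -> 0 | 1 | ... | 9


def is_identifier(text):
    """Follow the <identifier> graph: one letter, then loop over letters/digits."""
    if not text or text[0] not in LETTERS:
        return False
    return all(ch in LETTERS or ch in DIGITS for ch in text[1:])


for w in ["x", "x1", "count2b", "1x", "", "a_b"]:
    print(repr(w), is_identifier(w))
```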


Example 2. The following syntax flow graph (and the rewriting rules) represents the unsigned integers and unsigned doubles (i.e., real numbers) defined in C.

[Figure: syntax flow graphs for <unsigned integer> and <unsigned double>]

<unsigned integer> → <digit> | <unsigned integer><digit>
<unsigned double> → <unsigned integer> . <unsigned integer>
    | <unsigned integer> . <unsigned integer><exponent>
    | <unsigned integer><exponent>
    | <unsigned integer>
<exponent> → E<sign><unsigned integer>
<sign> → + | - | ε
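The four alternatives for <unsigned double> amount to an unsigned integer, an optional fraction part, and an optional exponent. The following sketch checks a string against the rules of Example 2 by scanning it left to right; the helper names are hypothetical, and only the uppercase E of the rules is accepted.

```python
def scan_unsigned_integer(text, pos):
    """Return the position after the longest run of digits at pos, or None."""
    end = pos
    while end < len(text) and text[end] in "0123456789":
        end += 1
    return end if end > pos else None


def is_unsigned_double(text):
    """Check text against the <unsigned double> rules of Example 2."""
    pos = scan_unsigned_integer(text, 0)
    if pos is None:
        return False
    if pos < len(text) and text[pos] == ".":          # optional fraction part
        pos = scan_unsigned_integer(text, pos + 1)
        if pos is None:
            return False
    if pos < len(text) and text[pos] == "E":          # optional <exponent>
        pos += 1
        if pos < len(text) and text[pos] in "+-":     # <sign> -> + | - | ε
            pos += 1
        pos = scan_unsigned_integer(text, pos)
        if pos is None:
            return False
    return pos == len(text)


for w in ["12", "3.14", "2E10", "6.02E+23", ".5", "1.", "1E"]:
    print(repr(w), is_unsigned_double(w))
```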


3.3 Regular Expressions

When searching a file or browsing the Web, we enter a query in terms of strings (or a pattern) representing the information that we want to find. We learned how to represent a set of strings (i.e., a language) in terms of a set property or a grammar. However, neither the set property nor the grammar model is practical for interactive online search. We need a convenient way to succinctly express and efficiently find what we are looking for. This section introduces regular expressions, an algebraic model that denotes regular languages. Since a regular expression can be transformed into a machine that recognizes the language it denotes (we will learn how in Chapter 7), regular expressions are widely used in Web browsing, word processing, compilers, etc. (Unfortunately no such practical model is available for denoting other types of formal languages.)

Break Time

**Interesting Warning Labels**

On a knife: Warning. Keep out of children.
On a string of Christmas lights: For indoor or outdoor only.
On a food processor: Not to be used for the other use.
On a chainsaw: Do not attempt to stop chain with your hands.
On a child’s superman costume: Wearing of this garment does not enable you fly.
- Anonymous -


**Definition: Regular expression**

Let Σ be a finite alphabet. The regular expressions over Σ and the sets of strings (i.e., languages) they denote are defined as follows.

(1) **Ø** is a regular expression which denotes the empty set.
(2) **ε** is a regular expression which denotes the set {ε}.
(3) For each symbol a ∈ Σ, **a** is a regular expression which denotes the set {a}.
(4) If **r** and **s** are regular expressions which denote the sets of strings R and S, respectively, then **r + s**, **rs** and **r\*** are regular expressions that, respectively, denote the sets R ∪ S, RS and R*.

In the above definition boldface symbols are used to help the reader distinguish regular expressions from strings. From now on, whenever it is clear from the context, we shall use normal fonts in a regular expression.
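The four clauses of the definition translate directly into a recursive function that computes the set a regular expression denotes. Because R* is infinite in general, the sketch below only collects strings up to a given length; the nested-tuple encoding of expressions and the name `language` are assumptions made for illustration.

```python
def language(r, max_len):
    """Return the strings of length <= max_len in the set denoted by r.

    r is a nested tuple (a hypothetical encoding used only in this sketch):
    ("empty",), ("eps",), ("sym", a), ("union", r, s), ("cat", r, s), ("star", r).
    """
    kind = r[0]
    if kind == "empty":
        return set()
    if kind == "eps":
        return {""}
    if kind == "sym":
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "cat":
        R, S = language(r[1], max_len), language(r[2], max_len)
        return {x + y for x in R for y in S if len(x + y) <= max_len}
    if kind == "star":
        R = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                       # add R^1, R^2, ... until nothing new
            frontier = {x + y for x in frontier for y in R
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown regular expression node: {kind}")


a, b = ("sym", "a"), ("sym", "b")
print(sorted(language(("star", ("union", a, b)), 2)))
print(sorted(language(("cat", ("star", a), ("star", b)), 2)))
```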


In the definition of the regular expression, operators +, concatenation and *, respectively, correspond to the set union, set product and set closure. The precedence of these operators in a regular expression is in the order (from the highest) of *, concatenation, +. However, we are free to use parentheses in a regular expression, as we do in an arithmetic expression, wherever there is ambiguity. For a regular expression r, by L(r) we denote the language expressed by r. Here are some examples showing regular languages and their representation in terms of a regular expression.

• L(r) = {a, b}*; r = (a + b)*
• L(r) = {a^i | i ≥ 0}; r = a*
• L(r) = {a^i b^j | i, j ≥ 0}; r = a*b*
• L(r) = {xaaybbz | x, y, z ∈ {a, b}*}; r = (a+b)*aa(a+b)*bb(a+b)*
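The example expressions can be tried with Python's `re` module, which writes union as `|` rather than `+` but otherwise uses the same `*` and concatenation. This is a small sketch; `in_language` is a hypothetical helper, and `re.fullmatch` is used so that the whole string, not just a prefix, must match.

```python
import re

# Python's re module writes union as | instead of +; * and concatenation are the same.
EXAMPLES = {
    "(a + b)*": r"(a|b)*",
    "a*": r"a*",
    "a*b*": r"a*b*",
    "(a+b)*aa(a+b)*bb(a+b)*": r"(a|b)*aa(a|b)*bb(a|b)*",
}


def in_language(pattern, text):
    """True if the whole string `text` belongs to the language of `pattern`."""
    return re.fullmatch(pattern, text) is not None


for r, pattern in EXAMPLES.items():
    members = [w for w in ["", "aab", "ba", "baabb", "bbaa"] if in_language(pattern, w)]
    print(f"{r:26} contains {members}")
```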


Algebraic Properties of Regular expressions

For two regular expressions E and F, define the identity, written E = F, if and only if their languages are identical, i.e., L(E) = L(F). Regular expressions with operator + or concatenation have identity laws similar to those of arithmetic algebra with operator + (plus) and multiplication, as shown below. Notice that in a regular expression the commutative law does not hold for concatenation. Let r, s and t be arbitrary regular expressions.

• Commutative law: r + s = s + r. (But for concatenation, rs ≠ sr in general.)
• Associative law: r + (s + t) = (r + s) + t, (rs)t = r(st).
• Distributive law: r(s + t) = rs + rt.
• For the regular expressions Ø and ε, we have the following identities: Ø + r = r + Ø = r, Ør = rØ = Ø, εr = rε = r.

Notice that Ø and ε, respectively, correspond to 0 and 1 in arithmetic algebra. We leave the proof for the reader.
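Since E = F means L(E) = L(F), the laws above can be spot-checked by working with the languages themselves. The sketch below uses small finite sample languages for r, s and t (arbitrary choices, not from the text) and compares both sides on strings up to a fixed length; it illustrates the laws but does not prove them.

```python
MAX_LEN = 4  # compare languages only on strings up to this length (a sketch, not a proof)


def union(R, S):
    return R | S


def concat(R, S):
    return {x + y for x in R for y in S if len(x + y) <= MAX_LEN}


EMPTY = set()        # language of Ø
EPSILON = {""}       # language of ε
r, s, t = {"a", "ab"}, {"b"}, {"", "ba"}   # arbitrary sample languages

print(union(r, s) == union(s, r))                                    # commutative law for +
print(concat(concat(r, s), t) == concat(r, concat(s, t)))            # associative law
print(concat(r, union(s, t)) == union(concat(r, s), concat(r, t)))   # distributive law
print(concat(EMPTY, r) == EMPTY, concat(EPSILON, r) == r)            # Ør = Ø, εr = r
print(concat(r, s) == concat(s, r))                                  # False here: rs ≠ sr
```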

**Rumination (1): syntax flow graph**

• Occasionally, we hear that programming languages belong to the class of context-free languages. That is true except for some special cases, such as input/output statements. However, in a textbook we hardly ever come across the whole grammar of a programming language. Usually the grammar is informally described, focusing on how to use the statements. However, to develop a system component such as a compiler or a debugger for a programming language, we need a formal definition of the language. Appendix A shows the complete syntax flow graphs for Pascal. (Pascal is a contemporary of C, introduced shortly before it.)

**Rumination (2): Regular Expression**

• There are quite a few variations in the form of regular expressions. Here are some examples together with the standard regular expressions.

(a | b)* = (a+b)*
(a | b)+ = (a+b)(a+b)*
[ab] = (a+b)
[a-z] = a + b + c + ... + z
abc? = (ab + abc)
(abc)? = abc + ε

With these variations, integers and real numbers in C can be expressed as follows.

digits [0-9]
int {digits}+
real {int}"."{int}([Ee][+-]?{int})?

• For a given language, there are infinitely many regular expressions that denote the same language. For example, r, r + r, r + r + r, etc. are all equivalent. As shown in the preceding section on algebraic properties of regular expressions, for such simple regular expressions it is possible to prove their equivalence. However, although the equivalence of two arbitrary regular expressions is decidable, no efficient algorithm is known for it. For example, it is not trivial to prove the following equivalences. In Chapter 7, we will present an idea of how to prove them.

(a) (r*)* = r*
(b) r*(r + s)* = (r + s)*
(c) (r + s)* = (r*s*)*
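The named definitions digits, int and real given in this Rumination carry over almost verbatim to Python's `re` module, where a name like {int} can be spliced in with an f-string. The sketch below assumes that a whole input string is to be classified; `classify` is a hypothetical helper.

```python
import re

DIGITS = r"[0-9]"
INT = rf"{DIGITS}+"                              # int   {digits}+
REAL = rf"{INT}\.{INT}([Ee][+-]?{INT})?"         # real  {int}"."{int}([Ee][+-]?{int})?


def classify(text):
    """Return 'int', 'real', or None for a whole input string."""
    if re.fullmatch(INT, text):
        return "int"
    if re.fullmatch(REAL, text):
        return "real"
    return None


for w in ["42", "3.14", "6.02e+23", "1.", "1e5"]:
    print(repr(w), classify(w))
```

Note that the real pattern, as written in the text, requires a decimal point, so a string such as 1e5 is classified as neither.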


Exercises


3.1 What is the language of L-system G = ({a, b, c}, h, acb), where the rules are defined as follows?

h(a) = aa
h(b) = cb
h(c) = a

3.2 Draw a syntax flow graph for the following context-free grammar.

S → aSbB | A
A → bSa | ba | ab
B → bB | ε

3.3 The following syntax flow graph is a simplified version of the syntax flow graph for <expression> of Pascal programming language. (a) Write a grammar which generates the same language defined by this syntax flow graph. (b) Show how expression a1 + (a + b)*a – a/b in Pascal can be derived with the rewriting rules of your grammar.

[Figure: syntax flow graphs for <expression>, <term>, <factor>, <variable>, <letter> and <digit>. Surviving node labels: term, +, factor, *, /, variable, ( expression ), letter, digit, a, b, 0, 1.]


3.4 Briefly describe the language expressed by each of the following regular expressions.

(a) (0 + 1)*00

(b) (1 + 0 )*111( 0 + 1)*

(c) (1 + 0)*(111 + 000)(0 + 1)*

Break Time

Actual Signs

Outside a muffler shop: “No appointment necessary. We’ll hear you coming.”
In a veterinarian’s waiting room: “Be back in 5 minutes. Sit! Stay!”
At the electric company: “We would be delighted if you send in your bill. However, if you don’t, you will be (de-lighted).”
In a beauty shop: “Dye now!”
In a cafeteria: “Shoes are required to eat in the cafeteria. Socks can eat any place else they want.”
- Anonymous -
