Outlines
Introduction to the Theory of Computation
Mathematical Preliminaries and Notation
Languages
Grammars
Automata
Regular languages
Regular grammars
4
Introduction to the theory of computation…
The first answer is that theory provides concepts and principles that
help us understand the general nature of the discipline. The field of
computer science includes a wide range of special topics, from
machine design to programming.
A second, and perhaps not so obvious, answer is that the ideas we
will discuss have some immediate and important applications. The
fields of digital design, programming languages, and compilers are
the most obvious examples, but there are many others. The
concepts we study here run like a thread through much of computer
science, from operating systems to pattern recognition.
The third answer is one of which we hope to convince the reader.
The subject matter is intellectually stimulating and fun. It provides
many challenging, puzzle-like problems that can lead to some
sleepless nights. This is problem solving in its pure essence.
5
Mathematical Preliminaries and Notation
Sets
Set Operations
The usual set operations are union (∪), intersection
(∩), and difference (−), defined as
S1 ∪ S2 = {x : x ∈ S1 or x ∈ S2},
S1 ∩ S2 = {x : x ∈ S1 and x ∈ S2},
S1 − S2 = {x : x ∈ S1 and x ∉ S2}.
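The three set operations can be sketched with Python's built-in set type (a minimal assumed example; the sets s1 and s2 are illustrative, not from the slides):

```python
# Union, intersection, and difference on small example sets.
s1 = {0, 1, 2, 3}
s2 = {2, 3, 4}

union = s1 | s2          # {x : x in s1 or x in s2}
intersection = s1 & s2   # {x : x in s1 and x in s2}
difference = s1 - s2     # {x : x in s1 and x not in s2}

print(union)         # {0, 1, 2, 3, 4}
print(intersection)  # {2, 3}
print(difference)    # {0, 1}
```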
Relations and Functions
Properties of Relations
Graphs and Trees
Proof techniques
6
Languages and strings
We are all familiar with the notion of natural languages,
such as English and French. Still, most of us would
probably find it difficult to say exactly what the word
“language” means.
Dictionaries define the term informally as a system
suitable for the expression of certain ideas, facts, or
concepts, including a set of symbols and rules for their
manipulation. While this gives us an intuitive idea of
what a language is, it is not sufficient as a definition for
the study of formal languages. We need a precise
definition for the term.
7
Languages and Strings
Symbol: any single object such as a, b, c, 0, 1, 2, etc.
Alphabet: a collection of symbols; it must be finite.
An alphabet is denoted by sigma (Σ).
Examples: {a, b}, {d, e, f, g}, {0, 1, 2}
String: a finite sequence of symbols from an alphabet.
Examples: a, b, 0, 1, ab, ba, bb, 01
Language: a set of strings over an alphabet.
Example: Σ = {0, 1}
L1 = the set of all strings of length 2
   = {00, 01, 10, 11}
This is a finite language.
8
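The language L1 above can be enumerated mechanically; a minimal sketch (assumed example) builds every string of length 2 over Σ = {0, 1}:

```python
# Enumerate L1, the language of all strings of length 2 over {0, 1}.
from itertools import product

sigma = ["0", "1"]

# Every string of length 2 is a pair of symbols from sigma.
L1 = {"".join(p) for p in product(sigma, repeat=2)}

print(sorted(L1))  # ['00', '01', '10', '11']
```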
Chapter two: Grammars
Outlines
Introduction to grammar
Types of grammar
Regular grammar
Context free grammar
Derivation
11
Grammar
To study languages mathematically, we need a
mechanism to describe them.
Informal descriptions in English are often imprecise and
inadequate.
A grammar for the English language tells us whether a
particular sentence is well formed or not.
12
Grammar
A grammar G is defined as a quadruple
G =(V, T, S, P),
Where
V is a finite set of objects called variables,
T is a finite set of objects called terminal symbols,
S ∈ V is a special symbol called the start variable,
P is a finite set of productions.
13
Grammar
It will be assumed without further mention that the sets
V and T are nonempty and disjoint.
The production rules are the heart of a grammar; they
specify how the grammar transforms one string into
another, and through this they define a language
associated with the grammar.
In our discussion we will assume that all production
rules are of the form x→y
where x is an element of (V ∪ T)+ and y is in (V ∪ T)*.
The productions are applied in the following manner:
Given a string w of the form w = uxv,
we say the production x → y is applicable to this string;
applying it replaces x with y, giving the new string z = uyv.
We write this as w ⇒ z.
14
Grammar
Example 1
Consider the grammar G = ({S}, {a, b}, S, P), with P
given by
S→aSb
S→λ then
S ⇒ aSb ⇒ aaSbb ⇒ aabb, so we can write
S ⇒* aabb
The string aabb is a sentence in the language generated by G,
15
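The derivation in Example 1 can be mechanized; a minimal sketch (the function name `derive` is illustrative) applies S → aSb a chosen number of times and then S → λ, producing strings of the language {aⁿbⁿ : n ≥ 0}:

```python
# Simulate derivations of the grammar S -> aSb | λ.

def derive(n):
    """Apply S -> aSb n times, then S -> λ; return the sentence."""
    s = "S"
    for _ in range(n):
        s = s.replace("S", "aSb")  # apply S -> aSb
    return s.replace("S", "")      # apply S -> λ

print(derive(2))  # aabb, matching S => aSb => aaSbb => aabb
print(derive(3))  # aaabbb
```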
Grammar
Example 2
G = ({S, A}, {a, b}, S, P), with productions
S→Ab
A→aAb
A→λ
1.What are the tuples?
2. What is the string generated from the given production?
16
Types of grammar
There are different types of Grammars, examples:
Linear Grammars
Nonlinear Grammars
17
Types of grammars
Linear grammars are grammars that have at most one
variable on the right side of any production.
There are different types of linear grammars, based on
the position of the variable on the right side of the
production.
Nonlinear grammars are grammars in which some
production has more than one variable on the right side.
18
Right-Linear and Left-Linear Grammars
Definition:
A grammar G =(V, T, S, P) is said to be right-linear if
all productions are of the form
A → xB,
A → x,
where A, B ∈ V, and x ∈ T*.
A grammar is said to be left-linear if all productions are
of the form
A → Bx,
or
A → x.
19
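The right-linear restriction is easy to check by machine; a minimal sketch (the function name `is_right_linear` and the grammar encoding are assumptions for illustration) verifies that variables appear only as the last symbol of a production body:

```python
# Check whether every production is right-linear, i.e. of the form
# A -> xB or A -> x, with x a string of terminals and B a variable.

def is_right_linear(productions, variables):
    """productions: list of (head, body) pairs; body is a string."""
    for head, body in productions:
        # a variable may appear only as the final symbol of the body
        for sym in body[:-1]:
            if sym in variables:
                return False
    return True

# Grammar S -> abS | a  (right-linear)
P = [("S", "abS"), ("S", "a")]
print(is_right_linear(P, {"S"}))               # True
print(is_right_linear([("S", "Sab")], {"S"}))  # False: left-linear form
```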
Noam Chomsky types of grammar
Noam Chomsky gave a mathematical model of grammar which
is effective for writing computer languages.
The four types of grammar according to Noam Chomsky are:
Type 0: unrestricted grammars,
Type 1: context-sensitive grammars,
Type 2: context-free grammars,
Type 3: regular grammars.
21
Chapter three: Regular Languages
Outlines
Regular languages
Finite automata
Types of finite automata
Regular language operations
Conversion from NFA to DFA
23
Automata
An automaton is an abstract model of a digital computer.
As such, every automaton includes some essential features.
It has a mechanism for reading input. It will be assumed that the
input is a string over a given alphabet, written on an input file,
which the automaton can read but not change.
The input file is divided into cells, each of which can hold one
symbol.
The input mechanism can read the input file from left to right, one
symbol at a time.
The input mechanism can also detect the end of the input string (by
sensing an end-of-file condition).
24
Finite automata
Our introduction to automata in the section above is
brief and informal.
At this point, we have only a general understanding of
what an automaton is and how it can be represented by
a graph.
To progress, we must be more precise, provide formal
definitions, and start to develop rigorous results.
We begin with finite accepters, which are a simple,
special case of the general scheme introduced in the
above section.
25
Finite automata
This type of automaton is characterized by having no
temporary storage.
Since an input file cannot be rewritten, a finite
automaton is severely limited in its capacity to
“remember” things during the computation.
A finite amount of information can be retained in the
control unit by placing the unit into a specific state.
But since the number of such states is finite, a finite
automaton can only deal with situations in which the
information to be stored at any time is strictly
bounded.
26
Deterministic Finite Accepters (DFA)
The first types of automaton that we are going to
study in detail are finite accepters that are deterministic
in their operation.
We start with a precise formal definition of
deterministic accepters.
28
Deterministic Finite Accepters (DFA)
Definition : A deterministic finite accepter or dfa is
defined by the quintuple(5 tuples)
M = (Q, Σ, δ, q0, F), where
Q is a finite set of internal states,
Σ is a finite set of symbols called the input alphabet,
δ : Q × Σ → Q is the transition function,
q0 ∈ Q is the initial state,
F ⊆ Q is the set of final states.
29
Deterministic Finite Accepters (DFA)
Example 1
M = ({q0, q1, q2}, {0, 1}, δ, q0, {q1}),
where δ is given by
δ(q0, 0) = q0,  δ(q0, 1) = q1,
δ(q1, 0) = q0,  δ(q1, 1) = q2,
δ(q2, 0) = q2,  δ(q2, 1) = q1.
30
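Running a dfa is just repeated application of δ; a minimal sketch (the function name `accepts` is illustrative) simulates the dfa of Example 1:

```python
# Simulate the dfa of Example 1 on an input string.

delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
final = {"q1"}

def accepts(w):
    """Run the dfa on w, starting in q0; accept iff we end in F."""
    state = "q0"
    for symbol in w:
        state = delta[(state, symbol)]
    return state in final

print(accepts("01"))   # True: state path q0 -> q0 -> q1
print(accepts("011"))  # False: ends in q2
```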
Deterministic Finite Accepters (DFA)
Example 2: Consider the dfa below
31
Deterministic Finite Accepters (DFA)
Example 3: Find a deterministic finite accepter that
recognizes the set of all strings on Σ= {a,b} starting with
the prefix ab.
32
Nondeterministic Finite Accepters (NFA)
Finite accepters are more complicated if we allow them
to act non-deterministically.
Non-determinism is a powerful but, at first sight,
unusual idea.
We normally think of computers as completely
deterministic, and the element of choice seems out of
place. Nevertheless, non-determinism is a useful
notion, as we shall see as we proceed.
33
Definition of a Nondeterministic Accepter
Definition
A nondeterministic finite accepter or nfa is defined by
the quintuple
M = (Q, Σ, δ, q0, F),
where Q, Σ, q0, and F are defined as for a dfa, but
δ : Q × (Σ ∪ {λ}) → 2^Q.
35
Definition of a Nondeterministic Accepter
Note
There are three major differences between this
definition and the definition of a dfa.
In a nondeterministic accepter, the range of δ is in the
power set 2^Q, so that its value is not a single element of
Q but a subset of it.
This subset defines the set of all possible states that can
be reached by the transition. If, for instance, the current
state is q1, the symbol a is read, and δ(q1, a) = {q0, q2},
then either q0 or q2 could be the next state of the nfa.
Also, we allow λ as the second argument of δ.
This means that the nfa can make a transition without
consuming an input symbol.
Although we still assume that the input mechanism can
only travel to the right, it is possible that it is stationary
on some moves.
Finally, in an nfa, the set δ(q, a) may be empty,
meaning that there is no transition defined for this
specific situation.
37
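The three differences show up directly when an nfa is simulated; a minimal sketch (the example nfa, and the use of "" to stand for λ, are assumptions for illustration) computes the set of states reachable after one input symbol:

```python
# One step of an nfa: delta maps (state, symbol) to a *set* of states;
# missing entries mean the empty set; "" plays the role of λ.

delta = {
    ("q0", "a"): {"q0", "q1"},  # nondeterministic choice
    ("q1", ""): {"q2"},         # λ-move: consumes no input symbol
    ("q2", "b"): {"q2"},
}

def closure(states):
    """All states reachable from states via λ-moves alone."""
    result, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in result:
                result.add(r)
                stack.append(r)
    return result

def step(states, symbol):
    """Set of possible states after reading symbol in any of states."""
    moved = set()
    for q in closure(states):
        moved |= delta.get((q, symbol), set())
    return closure(moved)

print(sorted(step({"q0"}, "a")))  # ['q0', 'q1', 'q2']
```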
Equivalence of Deterministic and
Nondeterministic Finite Accepters
We now come to a fundamental question. In what sense are dfa's
and nfa's different? Obviously, there is a difference in their
definition, but this does not imply that there is any essential
distinction between them.
To explore this question, we introduce the concept of
equivalence between automata.
Definition: Two finite accepters, M1 and M2, are said to be
equivalent if
L(M1) = L(M2),
that is, if they both accept the same language.
As mentioned, there are generally many accepters for a given
language, so any dfa or nfa has many equivalent accepters.
39
Conversion of NFA to DFA
40
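The standard way to convert is the subset construction: each dfa state is a set of nfa states. A minimal sketch (the example nfa, which accepts strings over {a, b} ending in "ab", and the helper names are assumptions; λ-moves are omitted for brevity):

```python
# Subset construction: build a dfa whose states are frozensets of
# nfa states, then run it on an input string.

nfa = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
nfa_final = {"q2"}
sigma = ["a", "b"]

def nfa_to_dfa(start_state):
    """Return dfa transitions, final states, and start state."""
    start = frozenset([start_state])
    seen = {start}
    transitions, finals, todo = {}, set(), [start]
    while todo:
        S = todo.pop()
        if S & nfa_final:               # any nfa final state inside?
            finals.add(S)
        for c in sigma:
            T = frozenset(r for q in S for r in nfa.get((q, c), set()))
            transitions[(S, c)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return transitions, finals, start

def run(w):
    transitions, finals, state = nfa_to_dfa("q0")
    for c in w:
        state = transitions[(state, c)]
    return state in finals

print(run("aab"))  # True: ends in "ab"
print(run("aba"))  # False
```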
Context-Free Languages
Context-free language
In the last chapter, we discovered that not all languages are
regular. While regular languages are effective in describing
certain simple patterns, one does not need to look very far for
examples of nonregular languages.
The topic of context-free languages is perhaps the most
important aspect of formal language theory as it applies to
programming languages. Actual programming languages have
many features that can be described elegantly by means of
context-free languages.
What formal language theory tells us about context-free
languages has important applications in the design of
programming languages as well as in the construction of
efficient compilers.
42
Context-free language
In formal language theory, a context-free language is a
language generated by a context-free grammar.
The set of context-free languages is identical to the set
of languages accepted by pushdown automata.
43
Context-Free Grammars
The productions in a regular grammar are restricted in
two ways:
The left side must be a single variable, while the right
side has a special form.
To create grammars that are more powerful, we must
relax some of these restrictions.
By retaining the restriction on the left side, but
permitting anything on the right, we get context-free
grammars.
44
Context free grammar
Definition
A context-free grammar is defined by 4 tuples as G = (V, T, S, P), where
V = Set of variables or non-terminals
T = Set of terminals
S = Start symbol
P = a set of production rules, each of the form
A → x,
where A ∈ V and x ∈ (V ∪ T)*.
A language L is said to be context-free if and only if there
is a context free grammar G such that L= L (G).
45
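Context-free derivations can be explored exhaustively for short strings; a minimal sketch (the grammar S → (S)S | λ for balanced parentheses and the function name `sentences` are assumptions for illustration) expands the leftmost variable breadth-first:

```python
# Generate all sentences of length <= max_len of the context-free
# grammar S -> (S)S | λ (the balanced parenthesis strings).
from collections import deque

productions = {"S": ["(S)S", ""]}  # "" stands for λ

def sentences(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    found, queue, seen = set(), deque(["S"]), {"S"}
    while queue:
        form = queue.popleft()
        i = form.find("S")          # leftmost variable
        if i == -1:
            found.add(form)         # no variables left: a sentence
            continue
        for body in productions["S"]:
            new = form[:i] + body + form[i + 1:]
            # prune forms whose terminals already exceed the bound
            if len(new.replace("S", "")) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

print(sorted(sentences(4)))  # ['', '(())', '()', '()()']
```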
Pumping lemma
47
Derivation
48
Ambiguity
A grammar that produces more than one parse tree for some
sentence is called an ambiguous grammar.
Equivalently, an ambiguous grammar produces more than one
leftmost derivation or more than one rightmost derivation
for the same sentence (input).
49
Ambiguity: Example
Example: The arithmetic expression grammar
E → E + E | E * E | ( E ) | id
permits two distinct leftmost derivations for the
sentence id + id * id:
(a) E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id
(b) E ⇒ E * E ⇒ E + E * E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id
50
Ambiguity: example
E → E + E | E * E | ( E ) | - E | id
Construct the parse trees for the expression: id + id * id
[Figure: two parse trees for id + id * id, one grouping the
expression as id + (id * id), the other as (id + id) * id]
Which parse tree is correct?
51
Ambiguity: example…
E → E + E | E * E | ( E ) | - E | id
A grammar that produces more than one parse tree for some
input sentence is said to be an ambiguous grammar.
[Figure: the two distinct parse trees for id + id * id]
52
Parsing
What is parsing?
Parsing: To break a sentence down into its component
parts with an explanation of the form, function, and
syntactical relationship of each part.
The syntax of a programming language is usually given
by the grammar rules of a context free grammar (CFG).
53
Parsing…
The parser can be categorized into two groups:
Top-down parser
The parse tree is created top to bottom, starting from the
root to leaves.
Bottom-up parser
The parse tree is created bottom to top, starting from the
leaves to root.
Both top-down and bottom-up parser scan the input from
left to right (one symbol at a time).
Efficient top-down and bottom-up parsers can be
implemented by making use of context-free grammars:
LL for top-down parsing,
LR for bottom-up parsing.
54
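A minimal sketch of a top-down (LL-style) parser: a recursive-descent recognizer for an unambiguous expression grammar. The grammar and the helper names are assumptions for illustration; the left recursion of E → E + T and T → T * F is rewritten as iteration, as recursive descent requires:

```python
# Recursive-descent recognizer for the grammar
#   E -> T { + T }     T -> F { * F }     F -> ( E ) | id

def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def E():                       # E -> T { + T }
        T()
        while peek() == "+":
            eat("+"); T()

    def T():                       # T -> F { * F }
        F()
        while peek() == "*":
            eat("*"); F()

    def F():                       # F -> ( E ) | id
        if peek() == "(":
            eat("("); E(); eat(")")
        else:
            eat("id")

    E()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return True

print(parse(["id", "+", "id", "*", "id"]))  # True: accepted
```

Because * sits one level deeper in the grammar than +, the parser is forced to recognize id * id as a unit first, which is exactly the precedence idea discussed in the ambiguity-elimination slides.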
Derivation
A derivation is a sequence of replacements of structure
names by choices on the right hand sides of grammar
rules.
Example: E → E + E | E – E | E * E | E / E | -E
E→(E)
E → id
E ⇒ E + E means that E + E is derived from E:
- we can replace E by E + E;
- to do so, we must have the production E → E + E in our
grammar.
E ⇒ E + E ⇒ id + E ⇒ id + id: such a sequence of
replacements of non-terminal symbols is called a
derivation of id + id from E.
55
Parse tree
A parse tree is a graphical representation of a derivation.
It filters out the order in which productions are applied to
replace non-terminals.
A parse tree corresponding to a derivation is a labeled
tree in which:
the interior nodes are labeled by non-terminals,
the leaf nodes are labeled by terminals, and
the children of each interior node represent, from left to
right, the right side of the production applied at that node.
56
Parse tree and Derivation
Grammar: E → E + E | E * E | ( E ) | - E | id
Let's examine this derivation:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + id)
[Figure: the parse tree grows from the root E as each
production is applied]
This is a top-down derivation because we start building the
parse tree at the top.
Elimination of ambiguity
These two derivations point out a problem with the grammar:
the grammar has no notion of precedence, or implied order of
evaluation.
To add precedence:
Create a non-terminal for each level of precedence.
Isolate the corresponding part of the grammar.
Force the parser to recognize high-precedence subexpressions first.
For algebraic expressions:
multiplication and division first (level one),
subtraction and addition next (level two).
Associativity:
Left-associative: the next-level (higher-precedence) non-terminal
is placed at the end of a production.
58
Elimination of ambiguity
Therefore, we can remove all undesirable productions
using the following sequence of steps:
1. Remove λ-productions.
2. Remove unit-productions.
3. Remove useless productions.
59
Two Important Normal Forms
60
Chomsky Normal Form
One kind of normal form we can look for is one in which the
number of symbols on the right of a production is strictly limited.
In particular, we can ask that the string on the right of a
production consist of no more than two symbols. One instance of
this is the Chomsky normal form.
Definition: A context-free grammar is in Chomsky normal form
if all productions are of the form
A → BC
Or
A → a,
where A, B, C are in V, and a is in T.
61
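Whether a grammar is in Chomsky normal form is a purely syntactic check; a minimal sketch (the function name `is_cnf` and the grammar encoding are assumptions for illustration):

```python
# Check whether every production is of the form A -> BC (B, C
# variables) or A -> a (a a single terminal).

def is_cnf(productions, variables, terminals):
    """productions: list of (head, body); body is a tuple of symbols."""
    for head, body in productions:
        if head not in variables:
            return False
        if len(body) == 1 and body[0] in terminals:
            continue                      # A -> a
        if len(body) == 2 and all(s in variables for s in body):
            continue                      # A -> BC
        return False
    return True

# S -> AB | a,  A -> a,  B -> b   (in Chomsky normal form)
P = [("S", ("A", "B")), ("S", ("a",)), ("A", ("a",)), ("B", ("b",))]
print(is_cnf(P, {"S", "A", "B"}, {"a", "b"}))          # True
print(is_cnf([("S", ("a", "B"))], {"S", "B"}, {"a"}))  # False
```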
Greibach Normal Form
Another useful grammatical form is the Greibach normal form.
Here we put restrictions not on the length of the right sides of a
production, but on the positions in which terminals and variables
can appear.
Arguments justifying Greibach normal form are a little
complicated and not very transparent.
Similarly, constructing a grammar in Greibach normal form
equivalent to a given context-free grammar is tedious. We
therefore deal with this matter very briefly. Nevertheless,
Greibach normal form has many theoretical and practical
consequences.
62
Greibach Normal Form
Definition
A context-free grammar is said to be in Greibach
normal form if all productions have the form
A → ax,
where a ∈ T and x ∈ V*
If we compare this with the definition of s-grammars, we see
that the form A → ax is common to both Greibach
normal form and s-grammars, but Greibach normal
form does not carry the restriction that the pair (A, a)
occur at most once.
This additional freedom gives Greibach normal form a
generality not possessed by s-grammars.
63
Pushdown Automata
Introduction
What is a Pushdown Automaton?
A pushdown automaton (PDA) is a way to implement a
context-free grammar.
It is more powerful than a finite state machine (FSM).
Stack operations
A stack is a way we arrange elements one on top of another.
A stack has two basic operations:
PUSH: a new element is added on top of the stack.
POP: the top element of the stack is read and removed.
65
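The two stack operations map directly onto a Python list (a minimal assumed example, with the end of the list acting as the top of the stack):

```python
# PUSH and POP on a list-based stack.
stack = []

stack.append("z")  # PUSH z
stack.append("a")  # PUSH a: a is now on top

top = stack.pop()  # POP: reads and removes the top element
print(top)    # a
print(stack)  # ['z']
```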
Pushdown Automata
A pushdown automaton has three components:
An input tape
A finite control unit
A stack with infinite size
66
Pushdown automata
Formal definition
A pushdown automaton is formally defined by 7 tuples:
M = (Q, Σ, Γ, δ, q0, z, F),
where
Q is a finite set of internal states,
Σ is the input alphabet,
Γ is the stack alphabet,
δ : Q × (Σ ∪ {λ}) × Γ → finite subsets of Q × Γ* is the
transition function,
q0 ∈ Q is the initial state,
z ∈ Γ is the stack start symbol,
F ⊆ Q is the set of final states.
67
Pushdown automata…
The output of δ is a finite set of pairs (p, γ), where
p is the new state, and
γ is a string of stack symbols that replaces X at the top of
the stack.
Examples:
If γ = λ, then X is popped.
If γ = X, then the stack is unchanged.
If γ = YZ, then X is replaced by Z, and Y is pushed onto the
stack.
68
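The push/pop discipline is enough to recognize languages that no finite accepter can handle; a minimal sketch (the function name `accepts` and the state names are assumptions for illustration) simulates a deterministic pda for {aⁿbⁿ : n ≥ 1}:

```python
# A deterministic pda sketch for {a^n b^n : n >= 1}: each a pushes
# a marker, each b pops one; accept when every marker was matched.

def accepts(w):
    stack = ["z"]          # z plays the role of the stack start symbol
    state = "reading_a"
    for c in w:
        if state == "reading_a" and c == "a":
            stack.append("A")            # PUSH one marker per a
        elif c == "b" and stack[-1] == "A":
            state = "reading_b"          # switch phase on the first b
            stack.pop()                  # POP: match this b with one a
        else:
            return False                 # a after b, or unmatched b
    return state == "reading_b" and stack == ["z"]

print(accepts("aabb"))  # True
print(accepts("aab"))   # False: one b is missing
```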
Nondeterministic Pushdown Automata
69