
Formal Language and Automata Theory

Chapter -2: Finite Automata

Compiled By Abdella K.(MSc)


07/10/2021 Formal Language and Automata Theory 1
Introduction
• At this point, we have only a general understanding of what an automaton is. To
progress, let us provide more formal and precise definitions, and start to develop
rigorous results.
• We begin with finite acceptors, which are a simple case of the general scheme
discussed already in the previous chapter.
• This type of automaton is characterized by having no temporary storage. Since an
input file cannot be rewritten, a finite automaton is severely limited in its capacity
to “remember” things during the computation.
• A finite amount of information can be retained in the control unit by placing the
unit into a specific state.
• But since the number of such states is finite, a finite automaton can only deal with
situations in which the information to be stored at any time is strictly bounded.
• The figure given below is an example of a finite automaton that accepts all legal
Pascal identifiers.



Introduction(cont..)
• Some interpretation is necessary:
– We assume that the automaton starts in its initial state; we indicate this by
drawing an arrow to this state.
– When the automaton has read the whole input string and accepts it, it is
said to be in one of the final states; we indicate final states by drawing two
concentric circles.
• As always, the string to be examined is read from left to right, one
character at each step; when the first symbol is a letter, the
automaton goes into state 2, after which each remaining symbol may be
either a letter or a digit.
• Finally, when the entire string is processed the state 2 “accepts” the
string or represents the “yes” state of the automaton.
• Conversely, if the first symbol is a digit, the automaton will go into
state 3, the “reject” or “no” state, and remain there.
• In our solution we assume that no input other than letters or digits is
possible.
Deterministic Finite Automata
• The first type of automaton we study in detail is the finite
automaton that is deterministic in its operation.
• A deterministic finite automaton (DFA) is defined by the
quintuple
M = (Q, Σ, δ, q0, F)
where Q is a finite set of internal states,
Σ is a finite set of symbols called the input alphabet,
δ : Q × Σ → Q is a total function called the transition
function,
q0 ∈ Q is the initial state,
F ⊆ Q is a set of final states.



Deterministic Finite Automata(cont..)
• A deterministic finite automaton operates in the
following manner:
– At the initial time, it is assumed to be in the initial state
q0, with its input mechanism on the leftmost symbol of
the input string.
– During each move of the automaton, the input
mechanism advances one position to the right, so each
move consumes one input symbol.
– When the end of the string is reached, the string is
accepted if the automaton is in one of its final states.
Otherwise the string is rejected.
– The input mechanism can move only from left to right
and reads exactly one symbol on each step.
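The operation described above can be sketched in a few lines of Python. This is only an illustration: the function name run_dfa, the dictionary encoding of δ, and the two-state "even number of 1s" machine are assumptions made for the example, not part of the slides.

```python
from typing import Dict, Set, Tuple

State = str
Symbol = str


def run_dfa(delta: Dict[Tuple[State, Symbol], State],
            start: State,
            finals: Set[State],
            word: str) -> bool:
    """Return True if the DFA accepts `word`."""
    state = start                        # begin in the initial state q0
    for symbol in word:                  # read left to right, one symbol per move
        state = delta[(state, symbol)]   # delta is total, so the lookup succeeds
    return state in finals               # accept iff we stop in a final state


# Hypothetical DFA over {0, 1}: accepts strings with an even number of 1s.
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}

print(run_dfa(delta, "even", {"even"}, "1001"))  # True  (two 1s)
print(run_dfa(delta, "even", {"even"}, "10"))    # False (one 1)
```

Storing δ as a dictionary keyed by (state, symbol) mirrors the definition δ : Q × Σ → Q directly; a missing key would signal that the function is not total.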
Deterministic Finite Automata(cont..)
• To visualize and represent finite automata, we use transition
graphs, in which the vertices or nodes represent states and the
edges represent transitions.
• The labels on the vertices are the names of the states, while the
labels on the edges are the current values of the input symbols.
• For example, if q0 and q1 are internal states of some DFA M,
then the graph associated with M will have one vertex labeled q0
and another labeled q1.
• An edge (q0, q1) labeled ‘a’ represents the transition δ(q0, a) = q1.
• The initial state is identified by an incoming unlabeled
arrow not originating at any vertex, and final states are drawn
with a double circle.
Deterministic Finite Automata(cont..)
• More formally, if M = (Q, Σ, δ, q0, F) is a DFA, then its associated transition
graph Gm has exactly |Q| vertices, each one labeled with a different qi ∈ Q.
• For every transition rule δ(qi, a) = qj, the graph has an edge (qi, qj) labeled ‘a’.
• The vertex associated with q0 is called the initial vertex, while those labeled
with qf ∈ F are the final vertices.
• The transition graph given below represents the DFA
M = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})
where δ is given by δ(q0, 0) = q0, δ(q0, 1) = q1, δ(q1, 0) = q0, δ(q1, 1) = q2,
δ(q2 , 0) = q2 , δ(q2 , 1) = q1
• This DFA accepts the input string 01. It does not accept the string 00,
since after reading two consecutive 0’s it will be in state q0, which is not final.
• By similar reasoning, we see that the automaton will accept the strings 101,
0111, and 11001, but not 100 or 1100.
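The claims about M can be checked mechanically. The sketch below encodes the δ given above as a Python dictionary and folds it over each input; the names DELTA, FINALS and accepts are illustrative choices.

```python
from functools import reduce

# Transition function of M = ({q0, q1, q2}, {0, 1}, δ, q0, {q1}) as listed above.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
FINALS = {"q1"}


def accepts(word: str) -> bool:
    # Apply δ once per input symbol, starting from the initial state q0.
    end_state = reduce(lambda q, a: DELTA[(q, a)], word, "q0")
    return end_state in FINALS


for w in ["01", "101", "0111", "11001", "00", "100", "1100"]:
    print(w, accepts(w))
```

The first four strings end in q1 and are accepted; 00 and 100 end in q0, and 1100 ends in q2, so they are rejected.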



Deterministic Finite Automata(cont..)
• It is convenient to introduce the extended transition function δ* : Q × Σ* → Q.
• The second argument of δ* is a string rather than a single symbol, and its
value gives the state the automaton will be in after reading that string.
• For example, if δ(q0, a) = q1 and δ(q1, b) = q2, then δ*(q0, ab) = q2.
• Formally, we can define δ* recursively by
δ*(q, ε) = q Equ. (2.1)
δ*(q, wa) = δ(δ*(q, w), a) Equ. (2.2)
for all q ∈ Q, w ∈ Σ*, a ∈ Σ.
• To see why this is appropriate, let us apply these definitions to the simple case
above:
δ*(q0, ab) = δ(δ*(q0, a), b) Equ. (2.3)
But
δ*(q0, a) = δ(δ*(q0, ε), a) = δ(q0, a)
δ*(q0, a) = q1 Equ. (2.4)
• Now, substituting Equ. (2.4) in Equ. (2.3), we get δ*(q0, ab) = δ(q1, b) = q2, as
expected.
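The recursive definition translates almost literally into code. In this sketch the function name delta_star is an assumption, and the dictionary holds only the two transitions mentioned in the example, so it is a partial table used purely for illustration.

```python
def delta_star(delta, q, w):
    """Extended transition function, following Equ. (2.1) and Equ. (2.2)."""
    if w == "":                      # Equ. (2.1): δ*(q, ε) = q
        return q
    prefix, last = w[:-1], w[-1]     # write the string as (prefix)(last symbol)
    # Equ. (2.2): δ*(q, wa) = δ(δ*(q, w), a)
    return delta[(delta_star(delta, q, prefix), last)]


# Only the two transitions used in the example above.
delta = {("q0", "a"): "q1", ("q1", "b"): "q2"}

print(delta_star(delta, "q0", "ab"))  # q2
```

The recursion peels symbols off the right end of the string, exactly as Equ. (2.2) does; an iterative left-to-right loop computes the same value.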
Languages and DFAs
• Having made a precise definition of an automaton, we are now ready to define
formally what we mean by an associated language.
• The association is obvious: the language is the set of all strings recognized by
the automaton.
• The language accepted by a DFA M = (Q, Σ, δ, q0, F) is the set of all strings on
Σ accepted by M.
• In formal notation, L(M) = {w ∈ Σ* : δ*(q0, w) ∈ F}
• A DFA will process every string in Σ* and either accept it or not accept it.
• Non-acceptance means that the DFA stops in a non-final state, so the complement of
L(M) is {w ∈ Σ* : δ*(q0, w) ∉ F}.
• Example: Consider the DFA shown in the figure below:



Languages and DFAs(cont..)
• In the above figure, two labels were allowed on a single edge. Such
multiply labeled edges are shorthand for two or more distinct transitions:
the transition is taken whenever the input symbol matches any of the
edge labels.
• For example, the automaton in the above figure remains in its initial state
q0 until the first ‘b’ is encountered.
• If this is also the last symbol of the input, then the string is accepted.
• If not, the DFA goes into state q2, from which it can never escape. Such a
state is called a trap state.
• We can see clearly from the graph that the automaton accepts all strings
which have exactly one b, located at the rightmost position of the string.
• All other input strings are rejected. In set notation, the language
accepted by this automaton is
L = {a^n b : n ≥ 0}



Theorem
• Let M = (Q, Σ, δ, q0, F) be a deterministic finite
acceptor, and let Gm be its associated transition
graph. Then for every qi, qj ∈ Q and w ∈ Σ+,
δ*(qi, w) = qj if and only if there is in Gm a walk
with label w from qi to qj.



Proof of the Theorem
• The claim can be proved rigorously by induction on the length of w.
• Assume that the claim is true for all strings v with |v| ≤ n.
• Consider then any string w of length n+1 and write it as w = va.
• Suppose now that δ*(qi, v) = qk. Since |v| = n, there must be a walk in
Gm labeled v from qi to qk.
• But if δ*(qi, w) = qj, then M must have a transition δ(qk, a) = qj, so that by
construction Gm has an edge (qk, qj) with label ‘a’.
• Thus, there is a walk in Gm labeled va = w from qi to qj.
• Since the result is obviously true for n = 1, we can claim by induction
that, for every w ∈ Σ+, δ*(qi, w) = qj [Equ. (2.5)] implies that there is a
walk in Gm from qi to qj labeled w.
• The argument can be turned around in a straightforward way to show
that the existence of such a walk implies Equ. (2.5), thus
completing the proof.
Example
• Find a Deterministic Finite Acceptor that recognizes the set of all strings on
Σ={a,b} starting with the string ab.
• The only issue here is the first two symbols in the string; after they have
been read no further decisions need to be made.
• We can therefore solve the problem with an automaton that has four states:
an initial state, two states for recognizing ‘ab’ (the second of which is a final trap state),
and one non-final trap state.
• If the first symbol is an ‘a’ and the second is a ‘b’, the automaton goes to the
final trap state, where it will stay since the rest of the input does not matter.
• On the other hand, if the first symbol is not an ‘a’ or the second one is not a
‘b’, the automaton enters the non final trap state.
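A minimal sketch of the four-state automaton described above follows. The state names q0 to q3 are assumptions (q2 is the final trap state, q3 the non-final trap state), and the exhaustive comparison with str.startswith is just one way to gain confidence in the construction.

```python
from itertools import product

DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q3",   # first symbol must be 'a'
    ("q1", "a"): "q3", ("q1", "b"): "q2",   # second symbol must be 'b'
    ("q2", "a"): "q2", ("q2", "b"): "q2",   # final trap state: prefix 'ab' seen
    ("q3", "a"): "q3", ("q3", "b"): "q3",   # non-final trap state
}
FINALS = {"q2"}


def accepts(word):
    state = "q0"
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in FINALS


def all_words(max_len):
    """Yield every string over {a, b} of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            yield "".join(letters)


# Compare the DFA with the intended language on all short strings.
mismatches = [w for w in all_words(6) if accepts(w) != w.startswith("ab")]
print(mismatches)  # []
```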



Regular Languages
• Every Finite automaton accepts some language.
• If we consider all possible finite automata, we get a
set of languages associated with them.
• We will call such a set of languages a family.
• The family of languages that is accepted by
deterministic finite automata is quite limited.
• A language L is called regular if and only if there
exists some deterministic finite acceptor M such that
L = L(M).
• In other words, to show that a language is regular it suffices to
construct a DFA that accepts it.
Regular Languages(cont..)
• For example: Show that the language
L={awa|w ∈ {a,b}*} is regular.
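One way to show this is to exhibit a DFA. The sketch below is one possible construction, not necessarily the one drawn on the original slide; the state names q0 to q3 are assumptions. It is cross-checked against the regular expression a[ab]*a, which describes the same set of strings.

```python
import re

# One possible DFA for L = {awa : w in {a, b}*}.
DELTA = {
    "q0": {"a": "q2", "b": "q1"},   # the first symbol must be 'a'
    "q1": {"a": "q1", "b": "q1"},   # non-final trap state (string started with 'b')
    "q2": {"a": "q3", "b": "q2"},   # leading 'a' read, still waiting for a closing 'a'
    "q3": {"a": "q3", "b": "q2"},   # last symbol read was a closing 'a': final state
}
FINALS = {"q3"}


def accepts(word):
    state = "q0"
    for symbol in word:
        state = DELTA[state][symbol]
    return state in FINALS


for w in ["aa", "aba", "abba", "a", "ab", "baa", ""]:
    # The DFA and the regular expression should agree on every string.
    assert accepts(w) == bool(re.fullmatch(r"a[ab]*a", w))
    print(repr(w), accepts(w))
```

Since a DFA exists for L, the language is regular by the definition above.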



Nondeterministic Finite Automata
• Non-determinism means a choice of moves for an automaton. Rather
than prescribing a unique move in each situation, we allow a set of
possible moves.
• Formally, we achieve this by defining the transition function so that its
range is a set of possible states.
• A nondeterministic finite automaton (NFA) is defined by the quintuple
M = (Q, Σ, δ, q0, F)
where
Q is a finite set of internal states,
Σ is a finite set of symbols called the input alphabet,
q0 ∈ Q is the initial state,
F ⊆ Q is a set of final states,
but
δ : Q × (Σ ∪ {ε}) → 2^Q
Differences between NFA and DFA
• Note that there are two major differences between the
above NFA definition and the definition of a DFA.
1. In an NFA, the range of δ is the power set 2^Q, so that
its value is not a single element of Q, but a subset of
it. This subset defines the set of all possible states
that can be reached by the transition. For instance,
δ(q1, a) = {q0, q2}.
2. We also allow ε as the second argument of δ. This
means that the NFA can make a transition without
consuming an input symbol.



Differences between NFA and DFA(cont...)
• Like DFAs, NFA can also be represented by transition graphs.
• The vertices are determined by Q, while an edge (qi, qj) with label ‘a’ is in the
graph if and only if δ(qi, a) contains qj.
• Note that since ‘a’ may be the empty string, there can be some edges labeled ε.
• A string is accepted by an NFA if there is any sequence of possible moves that
will put the machine in a final state at the end of the string.
• Non-determinism can therefore be viewed as involving “intuitive” insight by
which the best move can be chosen at every stage (assuming that the NFA wants
to accept every string).
• For example, consider the transition graph shown in the figure below. It describes
a nondeterministic automaton since there are two transitions labeled ‘a’ out of q0.



Differences between NFA and DFA(cont...)
• Another nondeterministic automaton is shown
in the figure below.

• It is nondeterministic not only because several edges with the same label originate
from one vertex, but also because it has an ε-transition.



Differences between NFA and DFA(cont...)
• Again, the transition function can be extended so its second
argument is a string.
• We require of the extended transition function δ* that if δ*(qi, w) =
Qj, then Qj is the set of all possible states the automaton may be in, having started
in state qi and having read ‘w’.
• For an NFA, the extended transition function is defined so that δ*(qi,
w) contains qj if and only if there is a walk in the transition graph
from qi to qj labeled w.
• This holds for all qi, qj ∈ Q and w ∈ Σ*
• The language L accepted by an NFA M = (Q, Σ, δ, q0, F) is defined
as the set of all strings on Σ accepted by M.
• Formally, L(M) = {w ∈ Σ* : δ*(q0, w) ∩ F ≠ ∅}.
• In words, the language contains all strings ‘w’ for which there is a
walk labeled w from the initial vertex of the transition graph to some
final vertex. Here ∅ denotes the empty set.
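The walk-based definition of acceptance can be turned into a small search. The sketch below is an illustration under stated assumptions: the function name nfa_accepts, the use of the empty string as the key for ε-moves, and the three-state NFA are all hypothetical, chosen only to exercise both nondeterministic branching and an ε-transition.

```python
def nfa_accepts(delta, start, finals, word):
    """Search for a walk labeled `word` from `start` to some final state."""
    seen = set()                 # (state, position) pairs already explored
    stack = [(start, 0)]
    while stack:
        state, pos = stack.pop()
        if (state, pos) in seen:
            continue             # also prevents looping on ε-cycles
        seen.add((state, pos))
        if pos == len(word) and state in finals:
            return True
        for nxt in delta.get((state, ""), set()):        # ε-moves consume nothing
            stack.append((nxt, pos))
        if pos < len(word):
            for nxt in delta.get((state, word[pos]), set()):
                stack.append((nxt, pos + 1))
    return False


# Hypothetical NFA: accepts exactly the strings over {a, b} containing an 'a'.
delta = {
    ("q0", "a"): {"q0", "q1"},   # two choices on 'a': nondeterminism
    ("q0", "b"): {"q0"},
    ("q1", ""): {"q2"},          # ε-transition
    ("q2", "b"): {"q2"},
}

print(nfa_accepts(delta, "q0", {"q2"}, "bab"))  # True
print(nfa_accepts(delta, "q0", {"q2"}, "bbb"))  # False
```

The search returns True as soon as one accepting walk is found, which matches the "there is any sequence of possible moves" reading of acceptance.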
Equivalence of Deterministic and Nondeterministic
Finite Automata
• We now turn to a fundamental question: in what sense are DFAs
and NFAs different? Is there any difference beyond the
definition?
• To explore this question, we introduce the notion of equivalence between
automata.
• We say that two automata are equivalent if they accept the same language.
• For any given language, we can usually find an unlimited number of
equivalent automata – deterministic or not.
• The DFA shown in Fig. (a) is equivalent to the NFA shown in Fig. (b) since
they both accept the language {(10)^n : n ≥ 0}.



Equivalence of Deterministic and Nondeterministic
Finite Automata
• When we compare different classes of automata, the question invariably
arises whether one class is more powerful than the other.
• By more powerful we mean that an automaton of one kind can achieve
something that cannot be done by any automaton of the other kind.
• Since a DFA is a restricted kind of NFA, it is clear that any language that is
accepted by a DFA is also accepted by some NFA.
• The converse is not so obvious, but it also holds: for every language accepted
by some NFA there is a DFA that accepts it, so the two classes are equally powerful.
• The argument is constructive: we can actually give a way of converting any NFA into an
equivalent DFA.
• The rationale for the construction is the following. After an NFA has read a
string ‘w’, we may not know exactly what state it will be in, but we can say
that it must be in one state of a set of possible states, say {qi, qj, ……, qk}.
• An equivalent DFA after reading the same string must be in some definite
state.



Equivalence of Deterministic and Nondeterministic
Finite Automata
• How can we make these two situations correspond?
• The solution is to label the states in such a way that, after
reading w, the equivalent DFA will be in a single state
labeled {qi, qj, ……, qk}.
• Since a set of |Q| states has exactly 2^|Q| subsets, the
corresponding DFA will have a finite number of states.
• We now present an algorithm for constructing from an
NFA a DFA that recognizes the same language.
• This algorithm, often called the “subset construction”, is
also useful for simulating an NFA by a computer program.



Algorithm: Subset Construction
Constructing a DFA from an NFA
• Input: An NFA N.
• Output: A DFA D accepting the same language.
• Method: Our algorithm constructs a transition table Dtran for D.
• Each DFA state is a set of NFA states and we construct Dtran so
that D will simulate “in parallel” all possible moves N can make
on a given input string.
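• We use the following three operations to keep track of sets of NFA
states (s represents an NFA state and T a set of NFA states):
– ε-closure(s): the set of NFA states reachable from state s on ε-transitions alone.
– ε-closure(T): the set of NFA states reachable from some state s in T on ε-transitions alone.
– move(T, a): the set of NFA states to which there is a transition on input symbol ‘a’ from some state s in T.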



Algorithm: Subset Construction
• Before it sees the first input symbol, N can be in
any of the states in the set ε-closure(S0), where S0
is the start state of N.
• Suppose that exactly the states in a set T are
reachable from S0 on a given sequence of input
symbols, and let ‘a’ be the next input symbol.
• On seeing ‘a’, N can move to any of the states in
the set move(T, a).
• When we allow for ε-transitions, N can be in any
of the states in ε-closure(move(T, a)), after seeing
the ‘a’.
Algorithm: Subset Construction
• Initially, ε-closure(S0) is the only state in Dstates and it is
unmarked;
While there is an unmarked state T in Dstates do
Begin
    Mark T;
    For each input symbol ‘a’ do
    Begin
        U := ε-closure(move(T, a));
        If U is not in Dstates then add U as an
            unmarked state to Dstates;
        Dtran[T, a] := U
    End
End



Algorithm: Subset Construction
• We construct Dstates, the set of states of D, and Dtran,
the transition table for D, in the following manner.
• Each state of D corresponds to a set of NFA states that
N could be in after reading some sequence of input
symbols including all possible ε-transitions before or
after symbols are read.
• The start state of D is ε-closure(S0).
• States and transitions are added to D using the
algorithm given above.
• A state of D is an accepting state if it is a set of NFA
states containing at least one accepting state of N.



Algorithm to compute ε-closure
Push all states in T onto stack;
Initialize ε-closure(T) to T;
While stack is not empty do
Begin
    Pop t, the top element, off of stack;
    For each state u with an edge from t to u labeled ε do
        If u is not in ε-closure(T) then
        Begin
            Add u to ε-closure(T);
            Push u onto stack
        End
End
Algorithm to compute ε-closure
• The computation of ε-closure(T) is a typical
process of searching a graph for nodes reachable
from a given set of nodes.
• In this case the states of T are the given set of
nodes, and the graph consists of just the ε-
labeled edges of the NFA.
• A simple algorithm to compute ε-closure(T) uses
a stack to hold states whose edges have not been
checked for ε-labeled transitions.
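The stack-based search can be written directly in Python. In this sketch the function name epsilon_closure and the encoding of ε-edges as a dictionary from a state to its ε-successors are assumptions, and the small graph is hypothetical (it deliberately contains an ε-cycle between states 1 and 3).

```python
def epsilon_closure(eps_edges, T):
    """Return the set of states reachable from T using ε-edges only."""
    closure = set(T)        # every state of T is in its own closure
    stack = list(T)         # states whose ε-edges have not been checked yet
    while stack:
        t = stack.pop()
        for u in eps_edges.get(t, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure


# Hypothetical ε-edges: 0 -> 1, 0 -> 2, 1 -> 3, 3 -> 1.
eps_edges = {0: {1, 2}, 1: {3}, 3: {1}}

print(sorted(epsilon_closure(eps_edges, {0})))  # [0, 1, 2, 3]
print(sorted(epsilon_closure(eps_edges, {2})))  # [2]
```

Because a state is pushed only when it is first added to the closure, each state is processed at most once and the loop terminates even when the ε-edges form a cycle.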



Example
• The figure given below shows an NFA N accepting the language
(a|b)*abb. Now let us apply the “subset construction” algorithm to
N.

• The start state of the equivalent DFA is ε-closure(0): all the states
reachable from state 0 via a path in which every edge is labeled ε.
Note that a path can have no edges, so 0 is reached from itself by such
a path.
Example(cont..)
• ε-closure(0) = {0, 1, 2, 4, 7}. Let this set be named A.
• The input alphabet here is {a, b}.
• The algorithm tells us to mark A and then to compute ε-closure(move(A, a)).
• We first compute move(A, a), the set of states of N having transitions on ‘a’
from members of A.
• Among the states 0, 1, 2, 4 and 7 (the members of A), only 2 and 7 have such
transitions, to 3 and 8 respectively, so move(A, a) = {3, 8}.
• Then compute ε-closure({3, 8}): all the states reachable from state 3 or state 8
via a path in which every edge is labeled ε.
• ε-closure(move(A, a)) = ε-closure({3, 8}) = {1, 2, 3, 4, 6, 7, 8}
• Let us call this set B. Thus Dtran[A, a] = B.
• Repeat the same process for the input symbol ‘b’ and compute
ε-closure(move(A, b)).
• Among the states in A, only 4 has a transition on ‘b’, to 5. So move(A, b) =
{5}.
Example(cont..)
• ε-closure(move(A, b)) = ε-closure({5})
= {1, 2, 4, 5, 6, 7}
• Let us call this set C; thus Dtran[A, b] = C.
• Apply the same procedure to the newly derived sets
B and C. As a result we obtain five different sets:
A = {0, 1, 2, 4, 7}
B = {1, 2, 3, 4, 6, 7, 8}
C = {1, 2, 4, 5, 6, 7}
D = {1, 2, 4, 5, 6, 7, 9}
E = {1, 2, 4, 5, 6, 7, 10}
• State A is the start state, and state E is the only
accepting state.
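The whole computation can be reproduced with a short program. Because the figure of N is not available in this text, the NFA below is an assumption: it is the standard eleven-state NFA for (a|b)*abb (states 0 to 10, accepting state 10), chosen because it reproduces the ε-closures computed above. The names NFA, EPS, eps_closure, move and subset_construction are illustrative.

```python
EPS = "eps"  # label used here for ε-edges (an arbitrary choice)

# Assumed NFA for (a|b)*abb: states 0..10, start state 0, accepting state 10.
NFA = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4},
    (2, "a"): {3},    (4, "b"): {5},
    (3, EPS): {6},    (5, EPS): {6},
    (6, EPS): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}
NFA_START, NFA_ACCEPT, ALPHABET = 0, {10}, "ab"


def eps_closure(states):
    closure, stack = set(states), list(states)
    while stack:
        t = stack.pop()
        for u in NFA.get((t, EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)


def move(states, symbol):
    return {u for t in states for u in NFA.get((t, symbol), ())}


def subset_construction():
    start = eps_closure({NFA_START})
    dstates, unmarked, dtran = [start], [start], {}
    while unmarked:
        T = unmarked.pop(0)            # "mark" T by taking it off the worklist
        for a in ALPHABET:
            U = eps_closure(move(T, a))
            if U not in dstates:       # a new DFA state was discovered
                dstates.append(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    return dstates, dtran


dstates, dtran = subset_construction()
names = {S: chr(ord("A") + i) for i, S in enumerate(dstates)}
for S in dstates:
    row = {a: names[dtran[(S, a)]] for a in ALPHABET}
    print(names[S], sorted(S), row, "accepting" if S & NFA_ACCEPT else "")
```

Under this assumed NFA the program finds the five sets A to E listed above, with transitions A: a→B, b→C; B: a→B, b→D; C: a→B, b→C; D: a→B, b→E; E: a→B, b→C.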



Reducing the Number of States in a DFA
• Any DFA defines a unique language, but the converse is not true.
For a given language, there are many DFAs that accept it.
• There may be considerable difference in the number of states of
such equivalent automata.
• An important theoretical result is that every regular set is recognized
by a minimum state DFA that is unique up to state names.
• We say that string ‘w’ distinguishes state ‘s’ from state ‘t’ if, by
starting with the DFA M in state ‘s’ and feeding it input ‘w’, we end
up in an accepting state, but starting in state ‘t’ and feeding it input
‘w’, we end up in a non-accepting state, or vice versa.
• For example, the states A and B in the DFA represented in the above
figure are distinguished by the input bb, since A goes to the non
accepting state C on input bb, while B goes to the accepting state E
on that same input.
Reducing the Number of States in a DFA
• Our algorithm for minimizing the number of states of a
DFA works by finding all groups of states that can be
distinguished by some input string.
• Each group of states that cannot be distinguished is then
merged into a single state.
• Initially, the partition consists of two groups: the accepting
states and the non accepting states.
• The algorithm works by maintaining and refining a
partition of the set of states.
• Each group of states within the partition consists of states
that have not yet been distinguished from one another, and
any pair of states chosen from different groups has been
found distinguishable by some input.
Algorithm: Minimizing the number of states of a DFA
• Input: A DFA M with set of states S, set of inputs Σ,
transitions defined for all states and inputs, start
state s0, and set of accepting states F
• Output: A DFA M’ accepting the same language
as M and having as few states as possible
• Method:
1. Construct an initial partition π of the set of
states with two groups: the accepting states
F and the non-accepting states S-F.



Algorithm: Minimizing the number of states of a DFA
2. Apply the procedure “Construct πnew” to the partition π to
construct a new partition πnew.
3. If πnew = π, let πfinal = π and continue with step (4). Otherwise,
repeat step (2) with π = πnew.
4. Choose one state in each group of the partition πfinal as the
representative for that group. The representatives will be the states
of the reduced DFA M’. Let s be a representative state, and suppose
on input ‘a’ there is a transition of M from s to t. Let r be the
representative of t’s group (r may be t). Then M’ has a transition
from s to r on ‘a’. Let the start state of M’ be the representative of
the group containing the start state s0 of M, and let the accepting
states of M’ be the representatives that are in F.
5. If M’ has a dead state, that is, a state d that is not accepting and that
has transitions to itself on all input symbols, then remove d from M’.
(Also remove any states not reachable from the start state.) Any
transitions to ‘d’ from other states become undefined.
Procedure: Construct πnew
• For each group G of π do
Begin
    Partition G into subgroups such that two states s and t of G
    are in the same subgroup if and only if, for all input symbols ‘a’,
    states s and t have transitions on ‘a’ to states in the same group
    of π;
    /* at worst, a state will be in a subgroup by itself */
    Replace G in πnew by the set of all subgroups formed
End
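A single pass of this procedure might look as follows in Python. This is a sketch under stated assumptions: the function name construct_pi_new, the list-of-sets encoding of a partition, and the three-state DFA over {0, 1} are all hypothetical and serve only to show one split.

```python
def construct_pi_new(partition, delta, alphabet):
    """One refinement pass: split each group by where its states go."""
    def group_index(state):
        # Index of the group of the *current* partition that contains `state`.
        for i, group in enumerate(partition):
            if state in group:
                return i
        raise ValueError(f"{state!r} is not in any group")

    new_partition = []
    for group in partition:
        subgroups = {}
        for s in sorted(group):
            # Two states stay together iff they have the same signature,
            # i.e. they reach the same group of the partition on every symbol.
            signature = tuple(group_index(delta[(s, a)]) for a in alphabet)
            subgroups.setdefault(signature, set()).add(s)
        new_partition.extend(subgroups.values())
    return new_partition


# Hypothetical DFA over {0, 1} with accepting state r.
delta = {
    ("p", "0"): "q", ("p", "1"): "p",
    ("q", "0"): "q", ("q", "1"): "r",
    ("r", "0"): "q", ("r", "1"): "p",
}
pi = [{"p", "q"}, {"r"}]
print(construct_pi_new(pi, delta, "01"))  # p and q are separated by input 1
```

On input 1, p stays inside the group {p, q} while q moves to the group {r}, so the pass splits {p, q} into {p} and {q}.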



Example
• Let us consider the DFA represented in Fig. 2.11. The initial partition π
consists of two groups: (E), the accepting state, and (A B C D), the
non-accepting states.
π = (A B C D) (E)
• To construct πnew, the procedure “Construct πnew” first considers
(E). Since this group consists of a single state, it cannot be split further, so (E) is
placed in πnew. The algorithm then considers the group (A B C D). On input
‘a’, each of these states has a transition to B, so they could all remain in
one group as far as input ‘a’ is concerned. On input ‘b’, however, A, B,
and C go to members of the group (A B C D) of π, while D goes to E, a
member of another group. Thus, in πnew the group (A B C D) must be
split into two new groups, (A B C) and (D).
πnew = (A B C) (D) (E)
π = πnew
• In the next pass through the procedure “Construct πnew”, the
groups (D) and (E) cannot be split further, as each contains only a single state.
Example
• The algorithm then considers the group (A B C). Again the group cannot be split as far as input
‘a’ is concerned, but on input ‘b’ the group needs to be split into (A C) and (B), since A and C
each have a transition to C, while B has a transition to D, a member of a group of the
partition different from that of C.
π = (A B C) (D) (E)
πnew = (A C) (B) (D) (E)
π = πnew
• In the next pass of the algorithm, we cannot split any of the groups consisting of a single
state. The only possibility is to try to split (A C). But A and C each go to the same state B
on input ‘a’ and to the same state C on input ‘b’. Hence, after this pass,
πfinal = π = (A C) (B) (D) (E).
• If we choose A as the representative for the group (A C), and choose B, D, and E for the
singleton groups, then we obtain the reduced automaton as shown in the transition table
below:
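The passes above can be replayed in code. The transition table used here is an assumption to the extent that the figure is not available in this text: the entries for A to D follow the moves described in the example, while the row for E (a→B, b→C) is taken from the standard DFA for (a|b)*abb. The names DELTA, refine and reduced are illustrative.

```python
# Assumed transition table of the five-state DFA (accepting state E).
DELTA = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
ACCEPTING = {"E"}
ALPHABET = "ab"


def refine(partition):
    """One pass of "Construct πnew" over a partition given as a list of lists."""
    group_of = {s: i for i, group in enumerate(partition) for s in group}
    result = []
    for group in partition:
        buckets = {}
        for s in group:
            key = tuple(group_of[DELTA[s][a]] for a in ALPHABET)
            buckets.setdefault(key, []).append(s)
        result.extend(buckets.values())
    return result


# Step 1: non-accepting states and accepting states.
partition = [sorted(set(DELTA) - ACCEPTING), sorted(ACCEPTING)]
# Steps 2 and 3: refine until no group is split any more.
while True:
    new_partition = refine(partition)
    if len(new_partition) == len(partition):
        break
    partition = new_partition

# Step 4: pick the first state of each group as its representative.
rep = {s: group[0] for group in partition for s in group}
reduced = {r: {a: rep[DELTA[r][a]] for a in ALPHABET}
           for r in sorted(set(rep.values()))}

print(partition)
print(reduced)
```

Under these assumptions the loop stops at (A C) (B) (D) (E), and the reduced automaton has the transitions A: a→B, b→A; B: a→B, b→D; D: a→B, b→E; E: a→B, b→A, with start state A and accepting state E. Since refinement only ever splits groups, comparing the number of groups is enough to detect πnew = π. No state of the reduced automaton is a dead state, so step 5 removes nothing here.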



Thank you!!
