
Module 2: Regular Languages, Regular Expressions (continued)

Lecture notes by Dr. K S Sudeep

Theorem: Regular expressions correspond to regular languages.

Proof:

Part 1: Given a regular expression R, how to construct an equivalent DFA A that recognizes L(R)? (Continued)

We covered two base cases in the previous set of notes. Now let us
consider the third case: expressions of the form R = a, where a is a symbol
in the input alphabet Σ. Here L(R) = {a}. For example, consider the
expression 0. Its language contains a single string: L(0) = {0}.

We can design a corresponding DFA as follows.

DFA A0

Verify that this is a DFA that does not accept any string except ‘0’. In other words,
the language of this DFA is {0}.
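
The verification can also be done mechanically. The following is a minimal sketch, assuming the alphabet is {0, 1} and that the figure has a start state q0, an accepting state q1 and a dead state q2; the transition table is a reconstruction under those assumptions, not a copy of the figure.

```python
# Hypothetical encoding of DFA A0 over the assumed alphabet {0, 1}.
# q0: start state, q1: accepting (exactly one 0 read), q2: dead state.
DELTA = {
    ("q0", "0"): "q1",
    ("q0", "1"): "q2",
    ("q1", "0"): "q2",
    ("q1", "1"): "q2",
    ("q2", "0"): "q2",
    ("q2", "1"): "q2",
}
START = "q0"
ACCEPTING = {"q1"}


def dfa_accepts(word):
    """Run the DFA on `word` and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]  # a DFA has exactly one move per symbol
    return state in ACCEPTING


for w in ["", "0", "1", "00", "01"]:
    print(repr(w), dfa_accepts(w))
```

Only the string '0' ends in q1; every other string either stays in q0 or falls into the dead state q2.
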
If we try to design an NFA instead, it would look simpler.

NFA for the language {0}

Note that we could do away with the ‘dead’ state q2. Such an NFA is enough for
our proof, as we have already proved that any NFA (with or without epsilon
moves) can be converted to an equivalent DFA. So, for the rest of this proof, we
will be constructing NFAs with epsilon moves.
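
To see why the dead state can be dropped, here is a small sketch of an NFA simulator with epsilon moves, applied to the two-state NFA for {0}. The state names q0 and q1, the use of the empty string as the epsilon label, and the function names are assumptions made for this illustration.

```python
EPS = ""  # label chosen here for epsilon moves (an assumption of this sketch)

# Hypothetical encoding of the two-state NFA for {0}: the only arrow is
# q0 --0--> q1. A missing entry means "no move", so no dead state is needed.
DELTA = {("q0", "0"): {"q1"}}
START = "q0"
ACCEPTING = {"q1"}


def epsilon_closure(states, delta):
    """Return all states reachable from `states` using only epsilon moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPS), set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def nfa_accepts(word, delta=DELTA, start=START, accepting=ACCEPTING):
    """Track the set of states the NFA could be in after each symbol."""
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = set()
        for state in current:
            moved |= delta.get((state, symbol), set())
        current = epsilon_closure(moved, delta)
    return bool(current & accepting)


for w in ["", "0", "1", "00"]:
    print(repr(w), nfa_accepts(w))
```

When no arrow exists for a symbol, the set of possible states becomes empty and the string is rejected, which is exactly the job the dead state did in the DFA.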

Now that we have covered the three small cases (base cases), we move to the
Induction steps:

1. Let us assume there exists an NFA for regular expression A. How can we
construct an NFA for the regular expression A*?
This can be done as follows (see figure).

NFA for A*

Here, the grey part inside corresponds to the NFA for regular expression
A. q0 is the initial state in that NFA and qa1, qa2 are its accepting states.
We add two new states: q0’ is the initial state in the new NFA we are
constructing for expression A*. Similarly, qa is the new accept state.
We add epsilon arrows from every (old) accepting state to the (old) initial
state, and we also add one epsilon arrow from the new initial state to the
newly added accepting state. To connect the new states to the old NFA, there
is also an epsilon arrow from q0’ to q0 and one from each (old) accepting
state to qa. This way, the empty string is accepted (as is required), and
for any strings w1, w2, w3, …, wk in L(A), the concatenation
‘w1 w2 w3 … wk’ is also accepted by this new NFA.

[Think: Why are these two new states needed? Why can we not add a
direct epsilon arrow from q0 to one of the old accepting states?
Hint: If there is a self-loop on some symbol on the old initial state or on
the old accepting states, it may lead to trouble. For example, if we had a
self-loop on 0 on q0, then when we add an epsilon arrow from q0 to an
accepting state, the new NFA will accept all strings like 0, 00, 000 etc.,
even if they do not belong to L(A*).]
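
The star construction and the pitfall from the Think box can be tried out in code. This is a sketch under stated assumptions: an NFA is represented as a tuple (transitions, start, accepting), the empty string labels epsilon moves, and the arrows follow the standard construction (q0’ to q0, q0’ to qa, and each old accepting state to both q0 and qa). The example NFA A for the language 0*1 and the helper names are hypothetical.

```python
EPS = ""  # assumed label for epsilon moves


def epsilon_closure(states, delta):
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPS), set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def accepts(nfa, word):
    delta, start, accepting = nfa
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = set()
        for state in current:
            moved |= delta.get((state, symbol), set())
        current = epsilon_closure(moved, delta)
    return bool(current & accepting)


def add_edge(delta, src, label, dst):
    delta.setdefault((src, label), set()).add(dst)


def star(nfa, new_start="q0'", new_accept="qa"):
    """Build an NFA for A* using two fresh states (names are assumptions)."""
    old_delta, old_start, old_accepting = nfa
    delta = {key: set(targets) for key, targets in old_delta.items()}
    add_edge(delta, new_start, EPS, old_start)   # enter the old NFA
    add_edge(delta, new_start, EPS, new_accept)  # accept the empty string
    for q in old_accepting:
        add_edge(delta, q, EPS, old_start)       # go round again
        add_edge(delta, q, EPS, new_accept)      # or stop and accept
    return delta, new_start, {new_accept}


def naive_star(nfa):
    """The shortcut from the Think box: no new states, only epsilon arrows."""
    old_delta, old_start, old_accepting = nfa
    delta = {key: set(targets) for key, targets in old_delta.items()}
    for q in old_accepting:
        add_edge(delta, old_start, EPS, q)  # direct arrow q0 -> accepting state
        add_edge(delta, q, EPS, old_start)  # loop back to repeat
    return delta, old_start, set(old_accepting)


# Hypothetical NFA A for 0*1: self-loop on 0 at q0, then 1 leads to qa1.
A = ({("q0", "0"): {"q0"}, ("q0", "1"): {"qa1"}}, "q0", {"qa1"})

good, bad = star(A), naive_star(A)
for w in ["", "1", "001", "0011", "0", "00"]:
    print(repr(w), accepts(good, w), accepts(bad, w))
```

With this A, the strings 0 and 00 are not in L(A*), yet the shortcut version accepts them because the self-loop on q0 can be followed by the new epsilon arrow straight into an accepting state. The version with the two fresh states rejects them.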

2. Assume there exist NFAs for regular expressions A and B. How can we
construct an NFA for the regular expression A + B? See figure.

NFA for the expression A + B


Here we introduced a new initial state and added epsilon arrows from this new
state to the initial states of A and B. It is easy to see that the language of this
newly constructed NFA is L(A) ∪ L(B).
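
The union construction described above can be sketched as follows. The tuple representation (transitions, start, accepting), the new state name "s", and the two small example NFAs are assumptions for illustration; the sketch also assumes that the two NFAs use disjoint state names and that the accepting states of both are kept.

```python
EPS = ""  # assumed label for epsilon moves


def epsilon_closure(states, delta):
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPS), set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def accepts(nfa, word):
    delta, start, accepting = nfa
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = set()
        for state in current:
            moved |= delta.get((state, symbol), set())
        current = epsilon_closure(moved, delta)
    return bool(current & accepting)


def union(nfa_a, nfa_b, new_start="s"):
    """NFA for A + B: a new start state with epsilon arrows to both old starts."""
    delta_a, start_a, accepting_a = nfa_a
    delta_b, start_b, accepting_b = nfa_b
    delta = {}
    for part in (delta_a, delta_b):  # copy both transition tables
        for key, targets in part.items():
            delta.setdefault(key, set()).update(targets)
    delta[(new_start, EPS)] = {start_a, start_b}
    return delta, new_start, set(accepting_a) | set(accepting_b)


# Hypothetical examples: A accepts {0}, B accepts {11}.
A = ({("a0", "0"): {"a1"}}, "a0", {"a1"})
B = ({("b0", "1"): {"b1"}, ("b1", "1"): {"b2"}}, "b0", {"b2"})

A_or_B = union(A, B)
for w in ["0", "11", "", "1", "011"]:
    print(repr(w), accepts(A_or_B, w))
```

The new NFA guesses, through its first epsilon move, which of the two machines to run, so it accepts a string exactly when A or B does.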

In a similar way, we can construct an NFA for the expression A ◦ B as follows.

NFA for A ◦ B
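
Since the figure is not reproduced here, the following sketch uses the standard concatenation construction, which may differ in detail from the figure: an epsilon arrow is added from every accepting state of A to the initial state of B, the initial state of A stays initial, and only the accepting states of B remain accepting. The representation and the example NFAs are assumptions, as in the earlier sketches.

```python
EPS = ""  # assumed label for epsilon moves


def epsilon_closure(states, delta):
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPS), set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def accepts(nfa, word):
    delta, start, accepting = nfa
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = set()
        for state in current:
            moved |= delta.get((state, symbol), set())
        current = epsilon_closure(moved, delta)
    return bool(current & accepting)


def concat(nfa_a, nfa_b):
    """NFA for A ◦ B, assuming the two NFAs use disjoint state names."""
    delta_a, start_a, accepting_a = nfa_a
    delta_b, start_b, accepting_b = nfa_b
    delta = {}
    for part in (delta_a, delta_b):  # copy both transition tables
        for key, targets in part.items():
            delta.setdefault(key, set()).update(targets)
    for q in accepting_a:
        # After finishing a string of A, jump to the start of B for free.
        delta.setdefault((q, EPS), set()).add(start_b)
    return delta, start_a, set(accepting_b)


# Hypothetical examples: A accepts {0}, B accepts {1}.
A = ({("a0", "0"): {"a1"}}, "a0", {"a1"})
B = ({("b0", "1"): {"b1"}}, "b0", {"b1"})

AB = concat(A, B)
for w in ["01", "0", "1", "", "10", "011"]:
    print(repr(w), accepts(AB, w))
```

The accepting states of A stop being accepting in the combined NFA, so a string of A alone (such as 0) is rejected unless B accepts the empty string.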
