
Theory of Computation (With Automata Theory)

TOPIC TITLE: Non-Regular Languages


Specific Objectives:
At the end of the topic session, the students are expected to:
Cognitive:
1. Discuss the difficulty of proving the non-regularity of languages.
2. Explain the special property of regular languages that can be
used to show if a language is not regular.
Affective:
1. Listen to others with respect.
2. Participate in class discussions actively.

MATERIALS/EQUIPMENT:
o Topic slides
o OHP

TOPIC PREPARATION:
o Have the students review related topics that were discussed in
  previous courses.
o Prepare the slides to be presented in class.
o It is imperative for the instructor to incorporate various kinds of
  teaching strategies while discussing the suggested topics.
o Prepare additional examples on the topic to be presented.

*Property of STI

Non-Regular Languages
Page 1 of 10

Non-Regular Languages
Non-Regular Languages Versus Regular Languages
The previous lessons have shown that a language can be proven to be regular by
showing that there is a finite automaton (deterministic or nondeterministic) that
recognizes it, or by showing that there is a regular expression that describes it.
However, not every language is regular.
It is tempting to conclude at this point that a language can be proven to be
non-regular by showing that there is no finite automaton that can recognize it, or by
showing that there is no regular expression that can represent it.
However, this is not easy to do. Failing to find a finite automaton or a regular
expression does not prove that none exists. This lesson will present the difficulty
in proving the non-regularity of languages.

Non-Regular Languages
Page 2 of 10

As an initial example, let us try to prove that the language
L1 = {0^x 1^x | x > 0} is regular.

Recall that R^k is a notation used to denote k concatenations of R. For example, if
R = 0, then R^1 = 0, R^2 = 00, R^3 = 000. Obviously, R^0 = ε (the empty string).
Language L1 is then the set of all strings that start with x consecutive 0s
followed by x consecutive 1s, where x > 0. In other words, this language is
composed of all strings in which a block of consecutive 0s is followed by a block
of consecutive 1s of the same length. Sample strings of this language are 01,
0011, 000111, and 00001111.
If L1 is regular, then there is a finite automaton that recognizes it. The finite
automaton for this language should be able to count and remember the number of
consecutive 0s it has received at any point in time. Then, it should be able to count
the consecutive 1s that follow. The final step is to compare the number of
consecutive 0s against the number of consecutive 1s.
However, that would seem to require an infinite number of states. Every string in
L1 is finite, but there is no upper limit on x, so the number of 0s that must be
remembered is unbounded. Moreover, it was emphasized before that a finite
automaton has a finite number of states only, so it cannot count an unlimited
number of things or events.
It can therefore be concluded, at least intuitively, that there is no finite
automaton that can recognize L1 since there are only a limited number of states.
Language L1 is therefore not regular.
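
The counting idea described above can be sketched in Python. This is only an illustration, not part of the original slides, and the function name in_l1 is made up for the example. Notice that the counter zeros has no upper bound, which is exactly the kind of memory that a finite automaton does not have.

```python
def in_l1(s: str) -> bool:
    """Return True if s is in L1 = {0^x 1^x | x > 0}."""
    i = 0
    zeros = 0
    # Count the leading block of 0s; this counter can grow without bound.
    while i < len(s) and s[i] == "0":
        zeros += 1
        i += 1
    ones = 0
    # Count the block of 1s that follows.
    while i < len(s) and s[i] == "1":
        ones += 1
        i += 1
    # The whole string must be consumed and the two counts must match.
    return i == len(s) and zeros > 0 and zeros == ones


for sample in ["01", "0011", "000111", "001", "0101", ""]:
    print(repr(sample), in_l1(sample))
```

The first three samples are members of L1 and the last three are not.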


Non-Regular Languages
Page 3 of 10

As another example, try to prove that the language L2 = {w | w has an equal
number of 0s and 1s} is regular.
Like language L1, the finite automaton required to recognize L2 has to count and
remember the number of 0s and 1s it has received, which can be arbitrarily large.
A DFA with only a limited number of states cannot accomplish this.
Hence, there is no finite automaton that can recognize language L2, and it is also
not regular.

Now consider the language L3 = {w | w has an equal number of occurrences of 01
and 10 as substrings}. From languages L1 and L2, it can be intuitively concluded
that there are also unlimited possibilities in language L3 since the finite automaton
required to recognize it also seems to have to count and remember the occurrences
of 10s and 01s. Since it seems that there is no finite automaton that can recognize
L3, it is tempting to conclude that it is not regular.
However, there is a DFA that can recognize L3 as shown below:
[Figure: state diagram of a DFA that recognizes L3]

Therefore, language L3 is regular.


Note that in the case of L3, a series of 01s cannot occur without intervening 10s,
and vice versa. For example, a second 01 can only appear after a 10 has been
formed, as in 01111101. Hence, at most one unmatched pattern (01 or 10) can
exist at any point in time, and a finite number of states is enough to keep track of it.
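
One way to check this claim is to write down a small DFA and compare it against direct counting. The sketch below is an assumption made for illustration: the five states and their names are made up and need not match the diagram in the slides. It relies on the observation that the counts of 01 and 10 are equal exactly when the string is empty or starts and ends with the same symbol.

```python
from itertools import product

# Hypothetical 5-state DFA for L3 (state names are made up for this sketch).
# "s"  : nothing read yet
# "a0" : started with 0, last symbol 0    "a1" : started with 0, last symbol 1
# "b1" : started with 1, last symbol 1    "b0" : started with 1, last symbol 0
DELTA = {
    ("s", "0"): "a0", ("s", "1"): "b1",
    ("a0", "0"): "a0", ("a0", "1"): "a1",
    ("a1", "0"): "a0", ("a1", "1"): "a1",
    ("b1", "0"): "b0", ("b1", "1"): "b1",
    ("b0", "0"): "b0", ("b0", "1"): "b1",
}
ACCEPTING = {"s", "a0", "b1"}


def dfa_accepts(w: str) -> bool:
    state = "s"
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


def count_pairs(w: str, pattern: str) -> int:
    """Count occurrences of a two-symbol pattern in w."""
    return sum(1 for i in range(len(w) - 1) if w[i:i + 2] == pattern)


def in_l3(w: str) -> bool:
    return count_pairs(w, "01") == count_pairs(w, "10")


# Compare the DFA with direct counting on every binary string up to length 10.
mismatches = [
    w
    for n in range(11)
    for w in ("".join(bits) for bits in product("01", repeat=n))
    if dfa_accepts(w) != in_l3(w)
]
print(mismatches)  # [] means the DFA and the counting definition agree
```

The exhaustive comparison is not a proof, but it supports the claim that a fixed number of states is enough for L3.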
Non-Regular Languages
Page 4 of 10

It is therefore not sufficient to rely on intuition in determining if a language is not
regular. It is not automatic that if a language seems to have infinite possibilities,
then it is a non-regular language.
Hence, it is more difficult to prove the non-regularity of languages than to prove
their regularity. It is necessary to have some formal way of proving that a language
is non-regular.
To prove the non-regularity of languages, it is appropriate to discuss first one
special property of regular languages that is based on the pigeonhole principle.
The pigeonhole principle states that if there are m pigeonholes and n pigeons where
n > m, there will be at least one pigeonhole with at least two pigeons.
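
For small numbers, the pigeonhole principle can be checked by brute force. The sketch below is illustrative only; the values m = 3 and n = 4 and the function name fullest_hole are chosen for the example. It tries every possible way of placing the pigeons and confirms that some pigeonhole always ends up with at least two.

```python
from collections import Counter
from itertools import product


def fullest_hole(assignment) -> int:
    """Given the hole chosen by each pigeon, return the largest hole occupancy."""
    return max(Counter(assignment).values())


m, n = 3, 4  # 3 pigeonholes, 4 pigeons (n > m)

# Each assignment is a tuple of length n; entry i is the hole used by pigeon i.
always_shared = all(
    fullest_hole(assignment) >= 2
    for assignment in product(range(m), repeat=n)
)
print(always_shared)  # True
```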


Non-Regular Languages
Page 5 of 10

To show how the pigeonhole principle is applied to regular languages, consider the
4-state DFA shown below:
[Figure: state diagram of the 4-state DFA with states q0, q1, q2, and q3]
Now assume that there is the input string 011 whose length is 3. If the DFA is
currently at the start state q0, the sequence of states that will be traversed during
the computation is

q0 --0--> q1 --1--> q2 --1--> q3


The DFA goes from state q0 to q1 to q2 to q3. Hence, for the given input string
whose length is three, four states were traversed, and no state was repeated
during the computation.
Non-Regular Languages
Page 6 of 10

Now consider the input string 0011 whose length is 4. If the DFA is currently at the
start state q0, the sequence of states that will be traversed during the computation
is

q0 --0--> q1 --0--> q1 --1--> q2 --1--> q3


The DFA goes from state q0 to q1, then back to q1 again, then to q2, and finally to
q3. For the given input string whose length is four, five states were traversed
during the computation.
Take note that although there are only four states in the DFA, five states were
traversed. The pigeonhole principle can be applied here where the DFA states are
the pigeonholes and the states that were traversed during the computation are the
pigeons. Since there are more pigeons than pigeonholes, there is at least one
DFA state that will be traversed at least twice (a repeated state). In this example,
state q1 was traversed twice, or was repeated.
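
The two computations above can be replayed with a short Python sketch. The transition table below contains only the transitions that these two traces actually use; the remaining arrows of the diagram are not reproduced here, so any other input would raise a KeyError. The helper names trace and repeated_states are made up for the example.

```python
from collections import Counter

# Partial transition table: only the moves used by the inputs 011 and 0011.
DELTA = {
    ("q0", "0"): "q1",
    ("q1", "0"): "q1",
    ("q1", "1"): "q2",
    ("q2", "1"): "q3",
}


def trace(w: str, start: str = "q0") -> list:
    """Return the list of states traversed, beginning with the start state."""
    states = [start]
    for symbol in w:
        states.append(DELTA[(states[-1], symbol)])
    return states


def repeated_states(states: list) -> list:
    """Return the states that appear more than once in the trace."""
    counts = Counter(states)
    return sorted(s for s, c in counts.items() if c > 1)


for w in ["011", "0011"]:
    visited = trace(w)
    print(w, visited, repeated_states(visited))
# 011 ['q0', 'q1', 'q2', 'q3'] []
# 0011 ['q0', 'q1', 'q1', 'q2', 'q3'] ['q1']
```

With four pigeonholes (states) and five pigeons (entries in the second trace), a repeat is unavoidable.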


Non-Regular Languages
Page 7 of 10

Next, consider the input string 01011 whose length is 5. If the DFA is currently at
the start state q0, the sequence of states that will be traversed during the
computation is

q0 --0--> q1 --1--> q2 --0--> q3 --1--> q2 --1--> q3


The DFA goes from state q0 to q1 to q2 to q3 to q2 to q3. For the given input string
whose length is five, six states were traversed.
Although there are only four states in the DFA, six states were traversed.
Following the pigeonhole principle again, at least one DFA state will be traversed at
least twice. In this example, states q2 and q3 were traversed twice or were
repeated.
Notice that the number of states traversed will always be equal to the length of the
input string plus one. This is because even before the input string arrives, the DFA
is already at the start state q0. Hence, if the length of the input string is greater
than or equal to the number of DFA states, the number of states traversed will
always be greater than the number of states of the DFA. By the pigeonhole
principle, there will then be at least one state that will be traversed at least twice.
Non-Regular Languages
Page 8 of 10

In general, if the length of the input string is greater than or equal to the number of
states of the DFA, there will be at least one state that will be repeated. It is
important to mention that short strings (strings whose length is less than the
number of states in the DFA) do not guarantee that there will be no repeated
states. There may be repeated states even though the string is short, depending
on the input string. However, strings whose length is greater than or
equal to the number of states in the DFA do guarantee that there will be repeated
states.
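
The general statement can be tested exhaustively on a small machine. The sketch below uses a hypothetical 3-state DFA that is not from the slides: state i means that the number of 1s read so far leaves remainder i when divided by 3. It checks every binary string of length 3 to 8 and also shows one short string that repeats a state and one that does not.

```python
from itertools import product

N_STATES = 3  # hypothetical DFA with states 0, 1, 2


def step(state: int, symbol: str) -> int:
    """Reading 1 advances the remainder; reading 0 stays in place."""
    return (state + 1) % N_STATES if symbol == "1" else state


def trace(w: str, start: int = 0) -> list:
    states = [start]
    for symbol in w:
        states.append(step(states[-1], symbol))
    return states


def has_repeat(states: list) -> bool:
    return len(set(states)) < len(states)


# Every string whose length is at least N_STATES must repeat a state.
long_strings_repeat = all(
    has_repeat(trace("".join(bits)))
    for n in range(N_STATES, 9)
    for bits in product("01", repeat=n)
)
print(long_strings_repeat)      # True
print(has_repeat(trace("00")))  # True: a short string may still repeat a state
print(has_repeat(trace("11")))  # False: a short string need not repeat a state
```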

Observe also that the repeated state or states will always form a loop.
In the third example, when the input string was 01011, the loop started with the
DFA at state q2. From there, it moved to state q3 when it received a 0. And
from state q3, it went back to state q2 when it received a 1.
Looking at the given input string 01011, the loop was caused by the second 01
substring. Specifically,

0 1 [0 1] 1     (the bracketed part caused the loop)



Non-Regular Languages
Page 9 of 10

Notice now that if the substring 01 that formed the loop is repeated any number of
times, the input string is still accepted (it is still a member of the language).
Hence, the input strings 0101011 and 010101011 are still accepted by the DFA.
All regular languages have this property: every string in the language that is of a
certain minimum length (greater than or equal to the number of states in the DFA
that recognizes the language) has a substring that can be repeated an arbitrary
number of times, and the resulting strings will still be in the language.
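
Repeating the loop substring can be tried out in Python. The sketch below makes two assumptions that should be checked against the slide diagram: the transition table lists only the moves that appear in the state sequences given in the text (q0, q1, q2, q3, q2, q3 for the input 01011), and q3 is taken to be the accepting state because every accepted example ends there. The helper names are made up for the example.

```python
# Partial transition table read off the state sequences in the text.
DELTA = {
    ("q0", "0"): "q1",
    ("q1", "0"): "q1",
    ("q1", "1"): "q2",
    ("q2", "0"): "q3",
    ("q2", "1"): "q3",
    ("q3", "1"): "q2",
}
ACCEPTING = {"q3"}  # assumption: the accepted examples all end in q3


def trace(w: str, start: str = "q0") -> list:
    states = [start]
    for symbol in w:
        states.append(DELTA[(states[-1], symbol)])
    return states


def accepts(w: str) -> bool:
    return trace(w)[-1] in ACCEPTING


def split_at_first_loop(w: str):
    """Split w into (x, y, z) where y is read between two visits to the same state."""
    seen = {}
    for i, state in enumerate(trace(w)):
        if state in seen:
            j = seen[state]
            return w[:j], w[j:i], w[i:]
        seen[state] = i
    return None  # no state was repeated


x, y, z = split_at_first_loop("01011")
print(x, y, z)  # 01 01 1

for i in range(4):
    pumped = x + y * i + z
    print(i, pumped, accepts(pumped))
# 0 011 True
# 1 01011 True
# 2 0101011 True
# 3 010101011 True
```

Removing the loop substring (i = 0) also leaves an accepted string, since the DFA simply skips the loop.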

Non-Regular Languages
Page 10 of 10

Repeating a part of a string any number of times is called pumping the string.
This property of regular languages can now be used to determine if a language is
non-regular. All that has to be done is to show that the language does not have
this property.
This property is formalized by a lemma called the pumping lemma and it will be
discussed in the next lesson.

[Non-Regular Languages, Pages 1-10 of 10]
