REGULAR LANGUAGES
NAWAL DANDEKAR
REGULAR EXPRESSION IN TOC
A regular expression in the theory of computation (TOC) is a mathematical notation for defining regular languages, the
languages recognized by finite automata. Regular expressions resemble arithmetic, logic, and Boolean expressions
in how they are written, but differ in their operations and purpose.
Each regular expression denotes exactly one regular language: a set of strings that may be finite or infinite.
Every such language can be recognized by a finite automaton.
The main uses of regular expressions in TOC are:
• Pattern Recognition: Represent patterns that can be recognized by finite automata (DFA and NFA).
• Compiler Design: Used in lexical analysis to tokenize input strings during compilation.
• Automata Conversion: Enable conversion between regular expressions and finite automata for language recognition.
• Theoretical Analysis: Help analyze and prove properties of languages within formal language theory.
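To make the lexical-analysis use concrete, here is a minimal sketch of a tokenizer built from regular expressions. The token classes (NUMBER, IDENT, OP, SKIP) and the function name tokenize are assumptions chosen for illustration and do not come from any particular compiler. Note also that Python's re module supports features beyond formal regular expressions; only the regular subset is used here.

```python
import re

# Hypothetical token classes for a toy language; each pattern is a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
]
# One combined pattern: the union (alternation) of all token patterns.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, tokenize("x = 42 + y1") produces the pairs IDENT x, OP =, NUMBER 42, OP +, IDENT y1.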
OPERATIONS ON REGULAR LANGUAGES IN AUTOMATA
If the regular expression R is epsilon (ε), then its language is the set containing only the empty string: L(ε) = {ε}.
If the regular expression R is Φ, then its language is the empty set: L(Φ) = { }.
If the regular expression R is an input symbol “a” that belongs to the alphabet sigma (Σ), then its language is the set
containing only the string a: L(a) = {a}.
The union of two regular expressions always denotes a regular language. Suppose R1 and R2 are two regular
expressions. If R1 = a and R2 = b, then R1 U R2 = a + b, so L(R1 U R2) = {a, b}, which is a regular language.
The concatenation of two regular expressions always denotes a regular language.
If R1 = a and R2 = b, then R1.R2 = a.b, so L(R1.R2) = {ab}, which is a regular language.
The Kleene closure of a regular expression also denotes a regular language: if R1 = x, then (R1)* = x* is still a
regular expression.
In a regular expression, x* means zero or more occurrences of x. It generates {ε, x, xx, xxx, xxxx, …}.
In a regular expression, x+ means one or more occurrences of x. It generates {x, xx, xxx, xxxx, …}.
If R is a regular expression, then the parenthesized expression (R) is also a regular expression and denotes the same language.
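The operations above can be tried out on small languages represented as Python sets of strings. This is only a sketch: the function names are illustrative, the empty string "" stands in for ε, and because L* and L+ are infinite the star and plus helpers take an assumed max_len bound and return only the strings up to that length.

```python
def union(l1, l2):
    """L1 U L2: every string that is in either language."""
    return l1 | l2


def concat(l1, l2):
    """L1.L2: every string u followed by every string v."""
    return {u + v for u in l1 for v in l2}


def star(lang, max_len):
    """Strings of L* with length <= max_len (L* itself is infinite)."""
    result = {""}      # zero occurrences: the empty string (epsilon)
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string of the language.
        frontier = {u + v for u in frontier for v in lang
                    if len(u + v) <= max_len} - result
        result |= frontier
    return result


def plus(lang, max_len):
    """Strings of L+ = L.L* with length <= max_len (one or more occurrences)."""
    return {w for w in concat(lang, star(lang, max_len)) if len(w) <= max_len}
```

With these helpers, union({"a"}, {"b"}) gives {"a", "b"}, concat({"a"}, {"b"}) gives {"ab"}, and star({"x"}, 3) gives the empty string together with x, xx, and xxx.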
CONVERSION BETWEEN RE AND FA
Expression: a + b (a union b)
For an expression of the form a + b, you need two states. Call them state A (the start state) and state B (the
accepting state). State A transitions to state B on input 'a', and state A also transitions to state B on input 'b'.
Alternatively, you can draw a single transition arrow from A to B labelled with both inputs. This means that the
automaton moves to the next state on receiving either input 'a' or input 'b'.
Expression: ab (a followed by b)
For an expression of the form ab, you need three states: A (the start state), B, and C (the accepting state).
Unlike the union case, where either input triggers the same transition, the transitions here are sequential,
so each input needs its own transition:
• State A transitions to state B only on input 'a'.
• State B transitions to state C only on input 'b'.
Expression: a* (closure of a)
For the closure of a, written a*, the construction is straightforward. You create a single state A, which is both
the start state and the accepting state, that loops back to itself on input 'a'. This means the state accepts any
number of 'a's, including zero.
These are three essential rules to remember when designing finite automata from given regular expressions.
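The three constructions can be written down as transition tables and simulated. The sketch below assumes a dictionary encoding {(state, symbol): next_state} and the state names A, B, C from the text; the helper name accepts and the convention that a missing transition rejects the input are choices made here for illustration.

```python
def accepts(transitions, start, finals, word):
    """Simulate a deterministic automaton; a missing transition rejects the word."""
    state = start
    for symbol in word:
        if (state, symbol) not in transitions:
            return False
        state = transitions[(state, symbol)]
    return state in finals


# a + b : two states, A goes to B on either 'a' or 'b'.
UNION = ({("A", "a"): "B", ("A", "b"): "B"}, "A", {"B"})

# ab : three states, A goes to B on 'a', then B goes to C on 'b'.
CONCAT = ({("A", "a"): "B", ("B", "b"): "C"}, "A", {"C"})

# a* : one state that loops on 'a' and is both start and accepting.
STAR = ({("A", "a"): "A"}, "A", {"A"})
```

Calling accepts(*CONCAT, "ab") returns True, while accepts(*CONCAT, "a") returns False because the automaton stops in state B, which is not accepting.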
CLOSURE PROPERTIES OF REGULAR LANGUAGES
A closure property of regular languages concerns an operation performed on languages such that the resulting
language is of the same “type” as the original languages.
Regular languages are closed under an operation if applying that operation to regular languages always yields a
regular language.
In other words, if the language that results from the operation is still regular, the closure property holds for
that operation.
Kleene Closure
Let R be a regular expression whose language is L. Applying the Kleene closure gives R*, a regular expression
whose language is L*.
Example
If R = (a), its language is L = {a}. Applying the Kleene closure gives R* = (a)*, whose language is
L* = {ε, a, aa, aaa, aaaa, …}.
So, L* is still a regular language. Thus, regular languages are closed under Kleene closure.
Positive Closure
Let R be a regular expression whose language is L. Then R+ is a regular expression whose language is L+.
Example
If R = (a), its language is L = {a}. Applying the positive closure gives R+ = (a)+, whose language is
L+ = {a, aa, aaa, aaaa, …}.
So, L+ is still a regular language. Thus, regular languages are closed under positive closure.
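CLOSURE PROPERTIES OF REGULAR LANGUAGES-COMPLEMENT
The complement of a language L is (Σ* – L), where sigma (Σ) is the set of input symbols over which the language is
defined. The complement of a regular language is always regular: in a DFA that recognizes L and has a transition for
every state and symbol, swapping the accepting and non-accepting states gives a DFA that recognizes Σ* – L.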
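A standard way to see closure under complement is to swap accepting and non-accepting states in a complete DFA. The sketch below assumes the DFA is given as a tuple (transitions, start, finals) with a transition for every state and symbol; the example automaton for "strings over {a, b} that end in a" and all names are hypothetical.

```python
from itertools import product


def run(dfa, word):
    """Return True if the complete DFA (transitions, start, finals) accepts word."""
    transitions, start, finals = dfa
    state = start
    for symbol in word:
        state = transitions[(state, symbol)]
    return state in finals


def complement(dfa, states):
    """Swap accepting and non-accepting states; only valid for a complete DFA."""
    transitions, start, finals = dfa
    return transitions, start, set(states) - set(finals)


def words_up_to(alphabet, max_len):
    """Generate every string over alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# Hypothetical complete DFA: accepts strings over {a, b} that end in 'a'.
STATES = {"q0", "q1"}
ENDS_IN_A = (
    {("q0", "a"): "q1", ("q0", "b"): "q0", ("q1", "a"): "q1", ("q1", "b"): "q0"},
    "q0",
    {"q1"},
)
NOT_ENDS_IN_A = complement(ENDS_IN_A, STATES)
```

Every string is accepted by exactly one of the two automata, which can be checked on all short strings with words_up_to("ab", 5).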
CLOSURE PROPERTIES OF REGULAR LANGUAGES-UNION
Let L1 and L2 be the languages of regular expressions R1 and R2, respectively. Then R1 + R2 (R1 U R2) is also a
regular expression, and its language is L3 = L1 U L2. So L3 is also a regular language.
CLOSURE PROPERTIES OF REGULAR LANGUAGES-CONCATENATION
Let L1 and L2 be the languages of regular expressions R1 and R2, respectively. Then R1.R2 is also a regular
expression, and its language is L3 = L1.L2. So L3 is also a regular language.
CLOSURE PROPERTIES OF REGULAR LANGUAGES-INTERSECTION
Let L1 and L2 be the languages of regular expressions R1 and R2, respectively. Then there is also a regular
expression whose language is L1 ∩ L2, so the intersection of two regular languages is regular.
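One common argument for closure under intersection is the product construction: run two DFAs in parallel and accept only when both accept. The sketch below assumes complete DFAs encoded as (transitions, start, finals); the two example automata (an even number of a's, and strings ending in b) and all names are hypothetical.

```python
from itertools import product


def run(dfa, word):
    """Return True if the complete DFA (transitions, start, finals) accepts word."""
    transitions, start, finals = dfa
    state = start
    for symbol in word:
        state = transitions[(state, symbol)]
    return state in finals


def product_dfa(dfa1, dfa2, alphabet):
    """Build a DFA whose states are pairs; it accepts when both components accept."""
    t1, s1, f1 = dfa1
    t2, s2, f2 = dfa2
    start = (s1, s2)
    transitions, finals = {}, set()
    todo, seen = [start], {start}
    while todo:                      # explore only the reachable pairs
        p, q = todo.pop()
        if p in f1 and q in f2:
            finals.add((p, q))
        for symbol in alphabet:
            nxt = (t1[(p, symbol)], t2[(q, symbol)])
            transitions[((p, q), symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return transitions, start, finals


# Hypothetical example DFAs over {a, b}.
EVEN_A = ({("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}, "e", {"e"})
ENDS_IN_B = ({("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"}, "n", {"y"})
BOTH = product_dfa(EVEN_A, ENDS_IN_B, "ab")


def agrees_up_to(max_len):
    """Check that BOTH accepts exactly the strings accepted by both DFAs."""
    return all(
        run(BOTH, "".join(w)) == (run(EVEN_A, "".join(w)) and run(ENDS_IN_B, "".join(w)))
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
    )
```

Here run(BOTH, "aab") is True (two a's, ends in b), while run(BOTH, "ab") is False because the string has an odd number of a's.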
DFA TO REGULAR EXPRESSION | ARDEN’S THEOREM
Arden’s Theorem is commonly used to convert a given DFA to its regular expression.
It states: let P and Q be two regular expressions over ∑. If the language of P does not contain the null string ε, then
the equation R = Q + RP has a unique solution, namely R = QP*.
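As a quick check that R = QP* really satisfies the equation, substitute it into the right-hand side. The derivation below is only a sketch of the "is a solution" direction and uses the identity ε + P*P = P*; uniqueness, which is where the condition on P is needed, is not shown.

```latex
\begin{aligned}
Q + RP &= Q + (QP^{*})P \\
       &= Q\,(\varepsilon + P^{*}P) \\
       &= QP^{*} \\
       &= R
\end{aligned}
```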
Conditions:
To use Arden’s Theorem, the following conditions must be satisfied:
• The transition diagram must not have any ε-transitions.
• There must be only a single initial state.
Steps: to convert a given DFA to its regular expression using Arden’s Theorem, follow these steps:
Step-01: Form an equation for each state from the transitions that come into that state. Add ‘ε’ to the
equation of the initial state.
Step-02: Solve the equations, bringing the equation of the final state into the form R = Q + RP, to get the required regular expression.
Note-01: Arden’s Theorem can be used to find a regular expression for both DFA and NFA.
Note-02: If there are multiple final states:
• Write a regular expression for each final state separately.
• Add (take the union of) all these regular expressions to get the final regular expression.
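The steps can be followed on a small example. The automaton below is hypothetical (it is not taken from these slides): q1 is the initial state, q2 is the final state, and the transitions are q1 to q1 on a, q1 to q2 on b, and q2 to q2 on b. The comments carry out Step-01 and Step-02 by hand, and the code only checks the derived expression against a simulation of the automaton on all short strings, so it is a sanity check rather than a proof.

```python
import re
from itertools import product

# Step-01: one equation per state, built from incoming transitions.
#   q1 = ε + q1 a          (ε because q1 is the initial state)
#   q2 = q1 b + q2 b
# Step-02: apply Arden's Theorem (R = Q + RP  =>  R = QP*).
#   q1 = ε a* = a*                         (Q = ε,   P = a)
#   q2 = a* b + q2 b  =>  q2 = a* b b*     (Q = a*b, P = b)
TRANSITIONS = {("q1", "a"): "q1", ("q1", "b"): "q2", ("q2", "b"): "q2"}
START, FINALS = "q1", {"q2"}
DERIVED = re.compile(r"a*bb*")


def automaton_accepts(word):
    """Simulate the hypothetical automaton; a missing transition rejects."""
    state = START
    for symbol in word:
        if (state, symbol) not in TRANSITIONS:
            return False
        state = TRANSITIONS[(state, symbol)]
    return state in FINALS


def regex_accepts(word):
    """Match the whole word against the derived regular expression."""
    return DERIVED.fullmatch(word) is not None


def agree_up_to(max_len):
    """Compare automaton and regex on every string over {a, b} up to max_len."""
    return all(
        automaton_accepts("".join(w)) == regex_accepts("".join(w))
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
    )
```

If the derivation is right, agree_up_to(8) returns True: both the automaton and the expression a*bb* accept exactly the strings made of any number of a's followed by at least one b.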
PUMPING LEMMA FOR REGULAR LANGUAGES
The Pumping Lemma is applied to infinite languages to show that they are not regular. It should never be used to
prove that a language is regular, which is why the pumping lemma is also called a negative test.
A simple description of the pumping lemma is given in the following diagram.
IMPORTANT NOTE
Keep in mind that all finite languages are regular, so there is no need to check whether they are regular or
not, whereas an infinite language may or may not be regular.
For example, { aⁿbⁿ | n ≥ 0 } is an infinite language but not a regular language. So, the Pumping Lemma test is
important for showing that particular infinite languages are not regular.
The pumping lemma yields a conclusion only when its conditions fail. That’s why it is also called a negative test.
We can say,
• If an infinite language L does not satisfy the Pumping Lemma (for every candidate pumping length p there is a
string in L of length at least p that no valid split can pump), then L is not a regular language.
• If an infinite language satisfies all three conditions of the Pumping Lemma, then it may or may not
be a regular language.
PUMPING LEMMA CONDITIONS FOR REGULAR LANGUAGES
If L is a regular language, then there exists a constant p (called the pumping length) such that every string w in L
with length at least p can be divided into three parts, w = xyz, such that:
1. |y| ≥ 1
(y is not empty, so it contributes something to the string)
2. |xy| ≤ p
(the repeated part occurs within the first p characters)
3. For all i ≥ 0, the string xyⁱz ∈ L
(i.e., you can “pump” y, repeating it any number of times, including zero, and the string stays in the language)
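The three conditions can be explored mechanically on small inputs. The sketch below enumerates every split w = xyz with |y| ≥ 1 and |xy| ≤ p and keeps those for which pumping stays inside the language. The function names, the max_i bound, and the example language (ab)* are assumptions made for illustration; since only i = 0..max_i is tested, this is a finite check and not a proof of condition 3.

```python
def pumpable_splits(word, p, in_language, max_i=4):
    """Yield splits (x, y, z) with |y| >= 1 and |xy| <= p such that
    x + y*i + z stays in the language for every i in 0..max_i."""
    for xy_len in range(1, min(p, len(word)) + 1):
        for x_len in range(xy_len):          # y = word[x_len:xy_len] is non-empty
            x, y, z = word[:x_len], word[x_len:xy_len], word[xy_len:]
            if all(in_language(x + y * i + z) for i in range(max_i + 1)):
                yield x, y, z


def is_ab_star(word):
    """Membership test for the regular language (ab)*."""
    return len(word) % 2 == 0 and word == "ab" * (len(word) // 2)
```

For the regular language (ab)* with p = 2, list(pumpable_splits("abab", 2, is_ab_star)) finds the single split x = "", y = "ab", z = "ab", which matches the intuition that the loop in the automaton reads one copy of ab.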
PUMPING LEMMA ALGORITHM FOR REGULAR LANGUAGE
EXAMPLE 1
Using the Pumping Lemma, prove that the language A = { aⁿb²ⁿ | n ≥ 0 } is not regular.
EXAMPLE 1 - EXPLANATION
A short proof that the language A = { aⁿb²ⁿ | n ≥ 0 } is not regular is given below, illustrated with specific
values of “p” and “i”.
• We want to prove that A = { aⁿb²ⁿ | n ≥ 0 } is not regular.
• Assume A is regular with pumping length p. For illustration, take p = 3; the same argument works for any p with the string aᵖb²ᵖ.
• Choose a string S = aaabbbbbb from A (n = 3 → 3 a’s and 6 b’s).
• Because |xy| ≤ p and |y| > 0, y consists only of a’s in every valid split. One such split is x = aa, y = a, z = bbbbbb.
• Pump y with i = 2: xy²z = aaaabbbbbb, which has 4 a’s and 6 b’s → not in A (4 a’s would require 8 b’s, i.e. a⁴b⁸).
• Every other valid split also adds only a’s when pumped, so no split keeps the pumped string in A. This contradicts the Pumping Lemma → A is not regular.
EXAMPLE 2
Using the Pumping Lemma, prove that the language A = { aⁿbⁿ | n ≥ 0 } is not regular.
EXAMPLE 2 EXPLANATION
A short proof that the language A = { aⁿbⁿ | n ≥ 0 } is not regular is given below, illustrated with specific
values of “p” and “i”.
• We want to prove that A = { aⁿbⁿ | n ≥ 0 } is not regular.
• Assume A is regular with pumping length p. For illustration, take p = 3; the same argument works for any p with the string aᵖbᵖ.
• Choose a string S = aaabbb from A (n = 3).
• Because |xy| ≤ p and |y| > 0, y consists only of a’s in every valid split. One such split is x = aa, y = a, z = bbb.
• Pump y with i = 2: xy²z = aaaabbb, which has 4 a’s and 3 b’s → not in A.
• Every other valid split also adds only a’s when pumped, so no split keeps the pumped string in A. This contradicts the Pumping Lemma → A is not regular.
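Both examples rely on the fact that no valid split survives pumping, not just the one split shown. The sketch below checks this exhaustively for several candidate pumping lengths, using the strings aᵖbᵖ and aᵖb²ᵖ. All function names are illustrative, and testing only i = 0 and i = 2 over a finite range of p is an assumption of this sketch; it illustrates the argument but does not replace the general proof.

```python
def in_anbn(word):
    """Membership test for { a^n b^n | n >= 0 }."""
    n = len(word) // 2
    return word == "a" * n + "b" * n


def in_anb2n(word):
    """Membership test for { a^n b^(2n) | n >= 0 }."""
    n = len(word) // 3
    return word == "a" * n + "b" * (2 * n)


def surviving_splits(word, p, in_language):
    """Splits with |y| >= 1 and |xy| <= p whose pumped strings xz (i = 0)
    and xyyz (i = 2) both remain in the language."""
    survivors = []
    for xy_len in range(1, min(p, len(word)) + 1):
        for x_len in range(xy_len):
            x, y, z = word[:x_len], word[x_len:xy_len], word[xy_len:]
            if in_language(x + z) and in_language(x + y * 2 + z):
                survivors.append((x, y, z))
    return survivors


def no_split_survives(max_p, make_word, in_language):
    """True if, for every p in 1..max_p, the chosen string has no surviving split."""
    return all(
        not surviving_splits(make_word(p), p, in_language)
        for p in range(1, max_p + 1)
    )
```

For instance, no_split_survives(7, lambda p: "a" * p + "b" * p, in_anbn) returns True, and the same holds for in_anb2n with the string "a" * p + "b" * (2 * p).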