
Regular Expression:

The languages accepted by finite automata can be described by simple expressions
known as regular expressions.
A regular expression is a pattern that describes a set of strings. String-searching
algorithms use such patterns to find, or operate on, the matching parts of a string.
Regular expressions combine input symbols with language operators such as union,
concatenation and closure, and are used to describe the tokens of a language.
A grammar that generates such a language is known as a regular grammar, and the
language itself is known as a regular language.

The repetition and alternation in any string are expressed using *, +, and |.

● In any regular expression, a* means a can occur zero or more times. It can
generate {ε, a, aa, aaa, …}.
● In any regular expression, a+ means a can occur one or more times. It can
generate {a, aa, aaa, aaaa, …}.
● x? means at most one occurrence of x, i.e., it can generate either x or ε.
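
The three repetition operators can be tried out with Python's standard re module, which uses the same *, + and ? symbols. This is only a small sketch: the helper name matches is made up for illustration, and re.fullmatch is used so that the whole string must match the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# a* : zero or more occurrences of a, so the empty string is accepted
print(matches(r"a*", ""), matches(r"a*", "aaa"))

# a+ : one or more occurrences of a, so the empty string is rejected
print(matches(r"a+", ""), matches(r"a+", "aa"))

# x? : at most one occurrence of x
print(matches(r"x?", ""), matches(r"x?", "x"), matches(r"x?", "xx"))
```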

A RE is a string that can be formed according to the following rules:

● ∅ is a RE denoting the empty set.
● ε is a RE denoting the set {ε}, where ε is the empty string.
● Every element of ∑ is a RE, where ∑ is the alphabet (a finite set of symbols).
● Given two REs α and β that denote the sets A and B respectively, αβ is a RE
that denotes the set AB.
● Given two REs α and β, α + β (also written α | β) is a RE that denotes the set A U B.
● Given a RE α, α* is a RE that denotes the set A*.
● Given a RE α, α+ is a RE that denotes one or more concatenations of strings from A.
● Given a RE α, (α) is a RE that denotes the same set A.

Operations on Regular Languages:

● Union: If X and Y are regular languages, their union X U Y is also a regular
language.

X U Y = {a | a is in X or a is in Y}
● Concatenation: If X and Y are regular languages, their concatenation X · Y is also a
regular language.

X · Y = {ap | a is in X and p is in Y}

● Kleene closure: If X is a regular language, its Kleene closure X* is also a
regular language.

X* = the set of strings formed by concatenating zero or more strings from X.
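
The three operations can be illustrated on small finite languages stored as Python sets of strings. This is a sketch under assumptions: the function names are invented for this example, and because X* is infinite whenever X contains a non-empty string, kleene_upto only builds the strings made from at most n pieces.

```python
from itertools import product

def union(X: set, Y: set) -> set:
    """X U Y: strings that are in X or in Y."""
    return X | Y

def concat(X: set, Y: set) -> set:
    """X · Y: every string a from X followed by every string p from Y."""
    return {a + p for a, p in product(X, Y)}

def kleene_upto(X: set, n: int) -> set:
    """Finite part of X*: concatenations of at most n strings from X."""
    result = {""}          # zero pieces gives the empty string
    layer = {""}
    for _ in range(n):
        layer = concat(layer, X)   # strings made from one more piece
        result |= layer
    return result

X = {"a", "ab"}
Y = {"b", "c"}
print(sorted(union(X, Y)))
print(sorted(concat(X, Y)))
print(sorted(kleene_upto({"a"}, 3)))
```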

What are the different notations under regular expressions?


If r and s are regular expressions denoting the languages L(r) and L(s), then

● Union : (r)|(s) is a regular expression denoting L(r) U L(s)


● Concatenation : (r)(s) is a regular expression denoting L(r)L(s)
● Kleene closure : (r)* is a regular expression denoting (L(r))*
● (r) is a regular expression denoting L(r)

Precedence and Associativity:


● The unary operator * is left-associative and has the highest precedence.

● Concatenation is left-associative and has the second-highest precedence.

● | (pipe sign) is also left-associative and has the lowest precedence of all of them.
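
These precedence rules mean that a|bc* is read as a|(b(c*)). Python's re module happens to follow the same conventions, so the grouping can be checked by comparing the unparenthesised pattern with explicitly parenthesised alternatives. The helper full and the variable names below are assumptions made for this sketch.

```python
import re

def full(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

implicit = r"a|bc*"          # no parentheses: precedence decides the grouping
explicit = r"a|(b(c*))"      # grouping implied by the rules above
union_first = r"(a|b)c*"     # what it would mean if | bound tighter
concat_first = r"a|(bc)*"    # what it would mean if concatenation bound tighter than *

for text in ["a", "b", "bccc", "ac", "bcbc"]:
    print(text,
          full(implicit, text),
          full(explicit, text),
          full(union_first, text),
          full(concat_first, text))
```

The first two columns agree on every sample, while "ac" separates the pattern from (a|b)c* and "bcbc" separates it from a|(bc)*.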

Representing occurrence of symbols using regular expressions


Letter = [a-z] | [A-Z] or [a-zA-Z]

Digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]

Sign = + | - or [+-]

Representing language tokens using regular expressions

Decimal = (Sign)?(Digit)+

Identifier = (Letter)(Letter | Digit)*
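
The token definitions above translate almost directly into Python regular expressions. The following is a minimal sketch; the constant and function names are chosen for this example, and (?:...) is Python's non-capturing group, used here only for grouping.

```python
import re

# Character classes for the basic symbols
LETTER = r"[a-zA-Z]"
DIGIT = r"[0-9]"
SIGN = r"[+-]"

# Decimal = (Sign)?(Digit)+
DECIMAL = rf"(?:{SIGN})?(?:{DIGIT})+"
# Identifier = (Letter)(Letter | Digit)*
IDENTIFIER = rf"(?:{LETTER})(?:{LETTER}|{DIGIT})*"

def is_decimal(text: str) -> bool:
    return re.fullmatch(DECIMAL, text) is not None

def is_identifier(text: str) -> bool:
    return re.fullmatch(IDENTIFIER, text) is not None

print(is_decimal("-42"), is_decimal("42"), is_decimal("+"))
print(is_identifier("x1"), is_identifier("1x"))
```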


Transition Diagram:

A transition diagram (or state transition diagram) for a finite automaton with state
set Q and transition function δ is a directed graph constructed as follows:

● There is a node for each state in Q, represented by a circle.

● There is a directed edge from node q to node p labeled a if δ(q, a) = p.

● The start state is marked by an incoming arrow with no source.

● Accepting states or final states are indicated by a double circle.
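
A transition diagram can be stored as a table of edges and followed one input symbol at a time. The sketch below encodes one possible diagram for the Identifier token: the state names q0 and q1, the edge labels "letter" and "digit", and the helper names are all assumptions made for illustration.

```python
from typing import Optional

def classify(ch: str) -> Optional[str]:
    """Map a character to the edge label it belongs to, or None if it has none."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return None

# Each entry is one directed edge: DELTA[(q, a)] = p means δ(q, a) = p
DELTA = {
    ("q0", "letter"): "q1",
    ("q1", "letter"): "q1",
    ("q1", "digit"): "q1",
}
START = "q0"          # the node with the source-less incoming arrow
ACCEPTING = {"q1"}    # the double-circled nodes

def accepts(text: str) -> bool:
    """Follow the edges for each character; accept if we end in a final state."""
    state = START
    for ch in text:
        state = DELTA.get((state, classify(ch)))
        if state is None:      # no edge for this symbol: reject
            return False
    return state in ACCEPTING

print(accepts("x1"), accepts("1x"), accepts(""))
```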

Some Notations that are used in the transition diagram:
