The languages accepted by finite automata can be described concisely by expressions
known as regular expressions.
A regular expression is a pattern that describes a set of strings, from a single
character to strings of any length. String-searching algorithms use such a pattern to
find the text that matches it within a string.
Regular expressions are a combination of input symbols and language operators such
as union, concatenation and closure and are used to describe tokens of a language.
A grammar defined by a regular expression is known as a regular grammar, and the
language it describes is known as a regular language.
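The link between finite automata and regular expressions can be illustrated with a small sketch. It assumes a hypothetical hand-written automaton for the expression ab* (the state names, the transition table, and the sample strings are illustrative and not taken from these notes) and checks that it agrees with Python's `re` module on a few inputs.

```python
import re

# Hypothetical automaton for the regular expression ab*.
# State names and the table layout are assumptions made for this sketch.
TRANSITIONS = {
    ("start", "a"): "accept",
    ("accept", "b"): "accept",
}
ACCEPTING = {"accept"}


def dfa_accepts(text):
    """Run the automaton over text; a missing transition acts as a dead state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING


samples = ["", "a", "ab", "abbb", "ba", "aab"]
agree = all(
    dfa_accepts(s) == (re.fullmatch(r"ab*", s) is not None) for s in samples
)
print(agree)  # True
```

Agreement on a handful of samples is only a spot check, not a proof that the automaton and the expression define the same language.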
Repetition and alternation in a regular expression are expressed using *, +, and |.
● In any regular expression, a* means a can occur zero or more times. It can
generate (ε, a, aa, aaa, aaaa …), where ε is the empty string.
● In any regular expression, a+ means a can occur one or more times. It can
generate (a, aa, aaa, aaaa …).
● x? means at most one occurrence of x, i.e., it can generate either {x} or {ε},
where ε is the empty string.
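The three repetition operators above can be tried out directly. This is a minimal sketch using Python's standard `re` module; the helper name `matches` and the sample strings are assumptions made for illustration.

```python
import re


def matches(pattern, text):
    """True only when the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


candidates = ["", "a", "aa", "aaa", "b"]

# a* : zero or more occurrences, so the empty string is accepted.
star = [s for s in candidates if matches(r"a*", s)]

# a+ : one or more occurrences, so the empty string is rejected.
plus = [s for s in candidates if matches(r"a+", s)]

# x? : at most one occurrence of x.
optional = [s for s in ["", "x", "xx"] if matches(r"x?", s)]

print(star)      # ['', 'a', 'aa', 'aaa']
print(plus)      # ['a', 'aa', 'aaa']
print(optional)  # ['', 'x']
```

`re.fullmatch` is used rather than `re.search` so that a pattern is tested against the entire string, which corresponds to asking whether the string belongs to the language.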
● Union: If X and Y are regular languages, their union is also a regular
language.
X ∪ Y = {a | a is in X or a is in Y}
● Concatenation: If X and Y are regular languages, their concatenation is also a
regular language.
X · Y = {ap | a is in X and p is in Y}
● Kleene closure: If X is a regular language, its Kleene closure X* will also be a
regular language.
● | (pipe sign) is also left-associative and has the lowest precedence of all the
operators.
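The operations and the precedence rule above can be sketched on small finite languages. This is an illustrative sketch only. The function names, the sample sets, and the `max_repeats` bound are assumptions, and the closure is cut off after a fixed number of repetitions because the true X* is infinite whenever X contains a non-empty string.

```python
import re
from itertools import product


def union(x, y):
    """X ∪ Y: every string that is in X or in Y."""
    return x | y


def concat(x, y):
    """X · Y: every string ap with a in X and p in Y."""
    return {a + p for a, p in product(x, y)}


def kleene_closure(x, max_repeats):
    """Bounded approximation of X*: zero up to max_repeats concatenations of X."""
    result = {""}
    current = {""}
    for _ in range(max_repeats):
        current = concat(current, x)
        result |= current
    return result


X = {"a", "b"}
Y = {"c"}

print(sorted(union(X, Y)))               # ['a', 'b', 'c']
print(sorted(concat(X, Y)))              # ['ac', 'bc']
print(sorted(kleene_closure({"a"}, 3)))  # ['', 'a', 'aa', 'aaa']

# Precedence: | binds loosest, so ab|cd* groups as (ab)|(c(d*)).
pattern = r"ab|cd*"
print(re.fullmatch(pattern, "ab") is not None)    # True
print(re.fullmatch(pattern, "cddd") is not None)  # True
print(re.fullmatch(pattern, "abd") is not None)   # False
```

The last check is the informative one: "abd" would be accepted if the expression were grouped as (ab|c)d*, so its rejection shows that alternation is applied last.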
Digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]
Sign = + | - or [+-]
Representing language tokens using regular expressions
Decimal = (Sign)?(Digit)+
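The Decimal token can be assembled from the Digit and Sign definitions. The following is a minimal sketch in Python's `re` syntax; the constant names and the helper `is_decimal` are assumptions made for illustration.

```python
import re

# Building blocks, mirroring the Digit and Sign definitions.
DIGIT = r"[0-9]"
SIGN = r"[+-]"

# Decimal = (Sign)?(Digit)+ : an optional sign followed by one or more digits.
DECIMAL = re.compile(rf"({SIGN})?({DIGIT})+")


def is_decimal(text):
    """True when the entire text is a Decimal token."""
    return DECIMAL.fullmatch(text) is not None


print(is_decimal("42"))    # True
print(is_decimal("-7"))    # True
print(is_decimal("+123"))  # True
print(is_decimal("+"))     # False: at least one digit is required
print(is_decimal("3.14"))  # False: the definition has no fractional part
```

Composing the pattern from named pieces keeps the code close to the token definition, which makes it easier to check the two against each other.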