Professional Documents
Culture Documents
Regular Expressions
• Offer something that automata do not : a declarative way to express the strings
we want to accept.
Regular Expressions
Close related to FA
Regular Expressions
Implementations
• Recall that a lexical analyzer is the component of a compiler that breaks the
source program into logical units called tokens, of one or more characters
that have a shared significance. Examples of tokens include keywords (e.g
while), identifiers (e.g any letter followed by zero or more letters and/or
digits) and signs such as + or <=.
Regular Expressions
Implementations
Operators? Value?
Operators? Value?
The regular expression 01* + 10* denotes the language consisting of all strings
that are either a single 0 followed by any number of 1’s or a single 1 followed by
any number of 0’s.
1. Union
2. Concatenation
3. Closure
Regular Expressions
Precedence
• In arithmetic, we say that × has precedence over + to mean that when there is
a choice, we do the × operation first. Thus in 2+3×4, the 3×4 is done before
the addition. To have the addition done first, we must add parentheses to
obtain (2 + 3) × 4.
•
Regular Expressions
Formal Definitions
Building Regular Expressions
• The familiar arithmetic algebra starts with constants such as integers and real
numbers, plus variables and builds more complex expressions with arithmetic
operators such as + and x
• The algebra of regular expressions follows this pattern, using constants and
variables that denote languages and operators for the three operations (union,
dot, and star).
Building Regular Expressions
Basis
1. The constants ∈ and ∅ are regular expressions, denoting the languages {∈}
and ∅ respectively. That is, L(∈ ) = {∈}, and L(∅) = ∅.
2. L(EF) = L(E)L(F)
3. L(E*) = (L(E))*
4. L((E)) = L(E)
Example
• The basis rule for regular expressions tells us that 0 and 1 are expressions
denoting the languages {0} and {1}, respectively.
1. (0(1*)) + 1 ?
2. 0(1*+1) ?
3. (01)* +1 ?
More examples
• Do worksheets!
EQUIVALENCE WITH FINITE AUTOMATA
• Any regular expression can be converted into a
finite automaton that recognizes the language it
describes, and vice versa.
• R = a for some a ∈ Σ. Then L(R) = {a}, and the following NFA recognizes L(R).
• Note : if L = L(A) for some DFA A, then there is a regular expression R such that L =
L(R).
• Diketahi DFA
• Let us use Rij(k) as the name of a regular expression whose language is the set
of strings w such that w is the label of a path from state i to state j in A, and
that path has no intermediate node whose number is greater than k.
Note
if i=j, then the legal paths are the path of length 0 and all loops from i to itself.
• case (b) —> one symbol a —> the expression becomes ∈+a
• case (c) —> multiple symbols —> the expression becomes ∈+a1+a2+ … +ak.
From DFA’s to RE
From DFA’s to RE
From DFA’s to RE
• The final regular expression equivalent to the automaton is constructed by
taking the union of all the expressions where the first state is the start state,
and the second state is accepting.
• Consists of all strings that begin with zero or more 1’s, then have a 0, and
then any string of 0’s and 1’s.
From DFA’s to RE
By eliminating states
An NFA accepting strings that have a 1 either two or three positions from the
end
2. Eliminate state B.
4. Sum the two expressions to get the expression for the entire automaton
Example
• Tulislah notasi ekspresi reguler se-simple mungkin untuk Rij(0) dan Rij(1) pada
transisi tabel DFA berikut ini.
Example
1. R+S
2. RS
3. R*
From RE to FA
Example
3. concatenation is another
automaton for (0+1) —> create a
copy of the automaton (0+1) of
poin 1.
Exercise
1. 01*
2. (0+1)01
3. 00(0+1)*
Next Week