Professional Documents
Culture Documents
Basic concepts
Alphabet Words
{a, b, . . . , z} man, abc, . . .
{0, 1} 000, 010101, . . .
{#, $, a, b, c} #cb$, $$$, . . .
1
Operations on words
w = a1 · · · an ⇒ w R = an · · · a1
– Inductive definition:
(1) eR = e
(2) (au)R = uR a, where a ∈ Σ, u ∈ Σ∗
• Power concatenates n copies of w to form a new
word n
n z }| {
w = ww · · · w
– Inductive definition:
(1) w0 = e
(2) wi+1 = wi w, for any i ≥ 0.
2
Theorem
(uw)R = wR uR , where u, w ∈ Σ∗
Proof: Prove by induction on |u|.
• Basis step: |u| = 0, i.e. u = e.
(ew)R = wR = wR e = wR eR
(uw)R = ((av)w)R
= (a(vw))R Associative law
= (vw)R a Rule 2 of ind. definition
= wR v R a Induction hypothesis
= wR (av)R Rule 2 of ind. definition
= w R uR .
3
Languages
Examples:
1. Set of all English words — a language over {a, b, . . . , z}.
2. {01, 0101, 010101, . . . } — a language over {0, 1}.
3. {e} — a language over any alphabet.
4
Operations on languages
• Concatenation:
L1L2 = {xy | x ∈ L1, y ∈ L2}
• Reversal:
LR = {wR | w ∈ L}
• Power:
Ln = {w1w2 · · ·wn | w1 , w2, · · · , wn ∈ L}
Inductive definition:
1. L0 = {e}.
2. Ln+1 = LnL, n ≥ 0.
• Kleene star:
L∗ = {w ∈ Σ∗ | w = w1w2 · · · wk for some k ≥ 0
and some w1, . . . , wk ∈ L}
It is also called the reflexive transitive closure of L
under concatenation.
• Plus:
L+ = LL∗
It is also called the transitive closure of L under
concatenation.
5
Operations on languages
Examples:
Let Σ = {a, b}, L1 = {a, ab}, and L2 = {e, ba}.
Then
LR
1 = {a, ba}.
L1L2 = {a, ab, aba, abba}.
L21 = L1L1 = {aa, aab, aba, abab}.
L22 = L2L2 = {e, ba, baba}.
Σ∗ = {e, a, b, aa, ab, ba, bb, aaa, · · ·}.
{e}1000 =?
L∅ =?
∅∗ =?
e 6∈ L+?
Ln ⊆ Ln+1?
Ln ⊆ L∗ ?
6
Regular expressions
Regular expressions are a finite representation of lan-
guages.
Examples:
Let Σ = {a, b, c, d}.
• Regular expressions:
a, ((a∪b)∗d), (c∗(a∪(bc∗)))∗, ∅∗
c∪∗, (∗)
7
Language represented by regular expressions
Let α denote a regular expression.
Let L(α) denote the language represented by a regular
expression α.
8
Examples:
1. Write a regular expression for each of the following
languages defined over Σ = {0, 1}:
(a) L = {w | w contains at least two zeros}
(b) L = {w | w is of even length}
(c) L = {w | w has even number of 1’s}
2. Simplify ∅∗ ∪ a∗ ∪ b∗ ∪ (a ∪ b)∗.
3. Simplify (a ∪ b)∗a(a ∪ b)∗.
4. Prove that
L[c∗(a∪(bc∗))∗] = {w ∈ {a, b, c}∗ | w does not con-
tain substring ac}
Proof:
Suppose w ∈ L[c∗(a∪(bc∗))∗]
⇒ each occurrence of a in w is either at the end of
the string, or is followed by another occurrence of a,
or is followed by an occurrence of b.
⇒ w does not have the substring ac.
9
c.
⇒ v is a sequence of a’s, b’s and c’s with any blocks
of c’s appearing only immediately after b’s, not after
a’s and not at the beginning of the string. Thus
v ∈ L((a ∪ bc∗)∗).
⇒ w ∈ L(c∗(a ∪ bc∗ )∗).
Notational simplifications:
1. A regular expression α also denotes the language
L(α) represented by α. E.g., we may write ab ∈ a∗b∗.
2. Omit extra parentheses. E.g.,
(a ∪ b) ∪ c = a ∪ (b ∪ c) = a ∪ b ∪ c
(ab)c = a(bc) = abc
a ∪ (bc) = a ∪ bc 6= (a ∪ b)c
10
Regular languages
Closure Properties:
If A and B are two regular languages. Then
the following languages are also regular
AB, A∪B, A∗, AR .
Proof:
Since A and B are regular languages, by definition, they
can be represented by some regular expressions. Let A =
L(α), B = L(β), where α, β are regular expressions.
Then we have
• AB = L(α)L(β) = L(αβ)
That is, αβ is a regular expression representing the lan-
guage AB. Thus AB is a regular language.
11
Language generators vs language recognizers
A Language generator (e.g. a regular expression) repre-
sents a language by generating the words in the language
(c∗(a∪(bc∗))∗) ⇒
{w∈{a, b, c}∗ : w does not contain substring ac}.
12
A language recognizer (e.g., an algorithm) represents a
language by recognizing its words.
Algorithm: recognizer(w)
• Input: w — a string.
• Output: YES or No.
13