What Is A Language?: Alphabet
Examples:
Alphabet: A-Z Language: English
Alphabet: ASCII Language: C++
Suppose Σ = {a,b,c}. Some languages
over Σ could be:
{aa,ab,ac,bb,bc,cc}
{ab,abc,abcc,abccc,. . .}
{ε}
{}
{a,b,c,
…
What is a language? (cont’d)
Alphabet    Languages
{0,1}       {0,10,100,1000,10000,…}
            {0,1,00,11,000,111,…}
{a,b,c}     {abc,aabbcc,aaab,bbccc}
Regular Languages
Regular Expressions
Regular Expressions (cont’d)
Rules
Fix an alphabet Σ.
ε is a regular expression (denotes the language {ε}).
If a is in Σ, then a is a regular expression (denotes the
language {a}).
If r and s are regular expressions denoting L(r) and L(s)
respectively, then so are:
(r) | (s), which denotes the language L(r) ∪ L(s)
(r)(s), which denotes the language L(r)L(s)
(r)*, which denotes the language (L(r))*
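The rules above are a recursive definition, so they can be sketched as a recursive function. The following is a minimal illustration, not part of the slides: the node names Eps, Sym, Alt, Cat and Star and the function language are hypothetical, and because L(r)* is usually infinite the function only returns strings up to an assumed length bound max_len.

```python
from dataclasses import dataclass

# Hypothetical AST node names, chosen for this sketch only.
@dataclass(frozen=True)
class Eps:            # the empty-string expression
    pass

@dataclass(frozen=True)
class Sym:            # a single symbol a from the alphabet
    ch: str

@dataclass(frozen=True)
class Alt:            # (r) | (s)
    r: object
    s: object

@dataclass(frozen=True)
class Cat:            # (r)(s)
    r: object
    s: object

@dataclass(frozen=True)
class Star:           # (r)*
    r: object

def language(e, max_len):
    """Return the strings of L(e) whose length is at most max_len."""
    if isinstance(e, Eps):
        return {""}
    if isinstance(e, Sym):
        return {e.ch} if max_len >= 1 else set()
    if isinstance(e, Alt):
        # Union of the two languages.
        return language(e.r, max_len) | language(e.s, max_len)
    if isinstance(e, Cat):
        # Every string of L(r) followed by every string of L(s).
        left, right = language(e.r, max_len), language(e.s, max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if isinstance(e, Star):
        # Zero or more concatenations, cut off at max_len.
        base = language(e.r, max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"unknown node: {e!r}")

# (a|b)*b, restricted to strings of length <= 2 (all of them end in b).
expr = Cat(Star(Alt(Sym("a"), Sym("b"))), Sym("b"))
print(sorted(language(expr, 2)))
```

Each branch of the function corresponds to one rule, which is why the definition of regular expressions is often described as inductive.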
Example
L = {A, B, C, D}    D = {1, 2, 3}
A | B | C | D = L
(A | B | C | D)(A | B | C | D) = L²
(A | B | C | D)* = L*
(A | B | C | D)((A | B | C | D) | (1 | 2 | 3)) = L(L ∪ D)
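Treating L and D as finite sets of one-character strings, the operations in this example can be checked directly. This is a small sketch; the helper names concat and power are assumptions made for illustration, and L* is left out because it is infinite.

```python
L = {"A", "B", "C", "D"}
D = {"1", "2", "3"}

def concat(X, Y):
    """Concatenation of two languages: every x in X followed by every y in Y."""
    return {x + y for x in X for y in Y}

def power(X, n):
    """X concatenated with itself n times; power(X, 0) is {""}."""
    result = {""}
    for _ in range(n):
        result = concat(result, X)
    return result

L2 = power(L, 2)          # (A|B|C|D)(A|B|C|D)
L_LD = concat(L, L | D)   # (A|B|C|D)((A|B|C|D)|(1|2|3))

print(len(L2))    # 16 two-letter strings
print(len(L_LD))  # 28 strings: a letter followed by a letter or a digit
```

Here "A1" is in L(L ∪ D) but "1A" is not, since the first symbol must come from L.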
Regular Expression Operation
Examples
If Σ = {a,b}
(a | b)*b
b(a | b)*
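One way to get a feel for these two expressions is to test a few strings against them. The sketch below uses Python's re module and reads the second expression as b(a|b)*; the helper name in_language is an assumption for illustration. fullmatch is used so that the whole string, not just a prefix, has to match.

```python
import re

ENDS_WITH_B = re.compile(r"(a|b)*b")    # any string over {a,b} that ends in b
STARTS_WITH_B = re.compile(r"b(a|b)*")  # any string over {a,b} that starts with b

def in_language(pattern, s):
    """True if the entire string s belongs to the language of pattern."""
    return pattern.fullmatch(s) is not None

for s in ["b", "ab", "ba", "abab", ""]:
    print(repr(s), in_language(ENDS_WITH_B, s), in_language(STARTS_WITH_B, s))
```

The string "b" is in both languages, while the empty string is in neither, because each expression requires at least one b.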
Regular Expression Overview
Expression   Meaning
ε            Empty pattern
a            Any pattern represented by ‘a’
ab           Strings with pattern ‘a’ followed by ‘b’
a|b          Strings consisting of pattern ‘a’ or ‘b’
a*           Zero or more occurrences of patterns in ‘a’
a+           One or more occurrences of patterns in ‘a’
a³           Patterns in ‘a’ repeated exactly 3 times
A regular expression R describes a set of
strings of characters denoted L(R)
L(R) = the language defined by R
– L(abc) = { abc }
– L(hello|goodbye) = { hello, goodbye }
– L(1(0|1)*) = all binary numbers that start with a 1
Each token can be defined using a regular
expression
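The claim about 1(0|1)* can be spot-checked against the binary representations that Python itself produces. This is only a sanity check over a finite range, not a proof, and the helper name matches is an assumption for illustration.

```python
import re

BINARY = re.compile(r"1(0|1)*")
GREETING = re.compile(r"hello|goodbye")

def matches(pattern, s):
    """True if the whole string s is in the language of pattern."""
    return pattern.fullmatch(s) is not None

# format(n, "b") writes n in binary without a prefix or leading zeros.
all_match = all(matches(BINARY, format(n, "b")) for n in range(1, 1000))
print(all_match)

print(matches(BINARY, "011"))         # leading zero, so not in the language
print(matches(GREETING, "goodbye"))
print(matches(GREETING, "hellogoodbye"))
```

Note that "0" itself is not in L(1(0|1)*), so a token definition for binary numbers would need an extra alternative if zero should be accepted.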
RE Notational Shorthand
Regular Expression, R     Strings in L(R)
a                         “a”
ab                        “ab”
a|b                       “a”, “b”
(ab)*                     “”, “ab”, “abab”, ...
(a|ε)b                    “ab”, “b”
digit = [0-9]             “0”, “1”, “2”, ...
posint = digit+           “8”, “412”, ...
                          “23”, “34”, ...
[0-9]+ (ε|(.[0-9]+))      “12”, “1.056”, ...
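The named shorthands in this table map fairly directly onto Python's re syntax. The sketch below assumes that the "." in the last expression is a literal decimal point (so it is escaped) and that the empty alternative stands for ε; the constant names DIGIT, POSINT and NUMBER are chosen for illustration.

```python
import re

DIGIT = r"[0-9]"
POSINT = DIGIT + r"+"                 # posint = digit+
# [0-9]+ (ε | (.[0-9]+)): an integer part with an optional fractional part.
# The empty alternative before "|" plays the role of ε.
NUMBER = r"[0-9]+(|(\.[0-9]+))"

def in_language(regex, s):
    """True if the whole string s is in the language of regex."""
    return re.fullmatch(regex, s) is not None

for s in ["8", "412", "12", "1.056", "1.", ".5", ""]:
    print(repr(s), in_language(POSINT, s), in_language(NUMBER, s))
```

In everyday Python the same number pattern is more often written as [0-9]+(\.[0-9]+)?, where ? is itself shorthand for "this part or ε".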
More Examples
Defining Our Language
digit = “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”
integer = {digit}+
Note that we can abbreviate ranges using the dash (“-”). Thus,
digit = 0-9
Relation = ‘<’ | ‘<=’ | ‘>’ | ‘>=’ | ‘<>’ | ‘=’
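Definitions like integer and Relation are typically what a scanner is built from. The following is a minimal sketch of that idea using Python's re module, not the slides' own implementation: the names TOKEN_SPEC, MASTER and tokenize, the token labels, and the whitespace-skipping rule are all assumptions added for illustration.

```python
import re

# One (name, pattern) pair per token class.
TOKEN_SPEC = [
    ("INTEGER", r"[0-9]+"),             # integer = {digit}+
    ("RELATION", r"<=|>=|<>|<|>|="),    # two-character operators listed first
    ("SKIP", r"\s+"),                   # assumed: whitespace separates tokens
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Split text into (token_class, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("12 <= 345 <> 6"))
```

Python's alternation picks the first alternative that matches rather than the longest one, which is why '<=' and '<>' are listed before '<' in the RELATION pattern.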
Floating point numbers are not much more complicated:
Real-world example
What is the regular expression that defines all phone
numbers in the US?
Σ = { 0-9 }
Area = {digit}³
Exchange = {digit}³
Local = {digit}⁴
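The three parts can be written with Python's counted repetition, [0-9]{3} and [0-9]{4}. How the parts are joined is not given here, so the sketch below assumes a hyphen-separated form such as 212-555-0100; that separator, the constant names and the helper parse_phone are assumptions for illustration only.

```python
import re

DIGIT = r"[0-9]"
AREA = DIGIT + "{3}"       # Area = digit repeated 3 times
EXCHANGE = DIGIT + "{3}"   # Exchange = digit repeated 3 times
LOCAL = DIGIT + "{4}"      # Local = digit repeated 4 times

# Assumed layout: area-exchange-local, separated by hyphens.
PHONE = re.compile(f"({AREA})-({EXCHANGE})-({LOCAL})")

def parse_phone(s):
    """Return (area, exchange, local) if s matches the assumed layout, else None."""
    m = PHONE.fullmatch(s)
    return m.groups() if m else None

print(parse_phone("212-555-0100"))
print(parse_phone("2125550100"))   # no hyphens, so it does not match this layout
```

Other layouts, for example with parentheses around the area code, would need extra alternatives in the combined expression.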