
Introduction to Automata and Formal Languages
Dr. Osama Fathy
Lecture 1

Introduction
Overview
• Important information
• Language
• Examples
• Terminology
• Other things we will cover in the course.
Text Books
Prescribed Text:
Daniel I.A. Cohen, “Introduction to Computer
Theory”, Second Edition, John Wiley & Sons, 1997

Recommended Reading:
– John E. Hopcroft, Rajeev Motwani, and Jeffrey
Ullman, “Introduction to Automata Theory,
Languages, and Computation”, 2nd Ed., Addison
Wesley, 2001
Theory of Computation – ToC (Computation Theory)

Concerned with what is central to the types of tasks (algorithms or
programs) that can be performed, not the mechanical
nature of the physical computer itself.

Algorithm

A set of easy-to-follow instructions that construct
proofs and solve a problem in finite time.
Uses of ToC
• Circuits and switching theory (Computer logic)
• Instruction sets and register arrangements
(Computer Architecture)
• Data structures and algorithms
• Operating systems
• Compiler design
• Artificial Intelligence
Theory of formal languages

Formal refers to the fact that all rules for a language are
explicitly stated in terms of
what strings of symbols can occur.
No liberties are tolerated, and
no reference to any deep understanding is required.

e.g. in a formal language, car does not mean a vehicle; it is just
three symbols in sequence: c, then a, then r.
Languages

• Natural Languages
– English, Chinese, French, etc.
• Programming Languages
– C, MIPS, BASIC, Fortran, Pascal, etc.
• Mathematics
• State Diagrams
• Music
Components

• Alphabet
– Basic elements
• Rules
– Grammar
– Tell you what words belong to the language
– Syntax
• Meaning
– semantics
Definitions
Alphabet (∑)
A finite set of fundamental units out of which we
build a structure (word or string).

Language (L)
Certain specified set of strings of characters from an
alphabet.

Word (w)
A string that is permissible in a language.
Empty String or null string (Λ)
Is a string with no letters.

Empty language (Φ)
A language with no strings (words).

Two words are the same if:
All their letters are the same and in the same order.
Machine:
A formal description of a “computer”.
– Based on states & transitions between states.
– Computes some output from input.

Grammar:
Rules for deriving & parsing strings in some language.
Notes:
A language has three entities: letters, words and sentences.

Fact: a group of letters → a word
a group of words → a sentence
But:
Not every collection of letters forms a valid word.
Not every collection of words forms a valid sentence.
Not every collection of sentences forms a valid, coherent
paragraph …
Notes:

• Λ is not a word of the language Φ.

• If a language L does not contain Λ and we wish to add it
to L, we use the union (+) operation to form:
L + {Λ}

• L ≠ L + {Λ} when Λ is not in L
• L + Φ = L
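
The notes above can be checked with Python sets. This is only a minimal sketch: the empty string "" stands in for Λ, the empty set for Φ, and set union `|` for the + operation. The sample language L is a hypothetical one chosen for illustration.

```python
# Model words as Python strings and languages as sets of strings.
LAMBDA = ""          # the empty word Λ (assumed encoding)
PHI = set()          # the empty language Φ: no words at all

L = {"a", "ab"}      # hypothetical sample language that does not contain Λ

L_with_lambda = L | {LAMBDA}   # L + {Λ}

print(LAMBDA in PHI)           # is Λ a word of Φ?
print(L_with_lambda != L)      # does adding Λ change L?
print((L | PHI) == L)          # does adding Φ leave L unchanged?
```

Note that {Λ} has one word while Φ has none, which is why the two must not be confused.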
English-Words
• Alphabet
a b c .. z A B .. Z ’ -
• Words
– all the words in a standard dictionary
– e.g.
don’t context-free regular …
English-Sentences

• Alphabet
– English words
– punctuation marks
?!,‘’;:
– blank space
• Words
– sentences
What is on the exam?
The quick brown fox jumped over the lazy dog.
C-Language

• Alphabet
– ASCII characters
• Words
– programs

#include <stdio.h>

int main(void)
{
    /* print a greeting followed by a newline */
    printf("Hello\n");
    return 0;
}
Languages
Two examples

            English-Words             English-Sentences
alphabet    Σ = {a, b, c, d, …}       Σ = words in dictionary + space + punctuation marks
letter      letter                    word
word        word                      sentence
language    all the words in          all English sentences
            the dictionary
Conventions

• Alphabet
a b
• Words
a b ab abbb bab …
• Notation
a^2 means aa
b^3 means bbb
• Empty Word Λ
• Empty Language Φ
Languages
Example: Σ = {x}
• L1 = {x, xx, xxx, xxxx, …} or L1 = {x^n | n = 1, 2, 3, …}
Σ* = {Λ, x, xx, xxx, xxxx, …} = {x^n | n = 0, 1, 2, 3, …}
We denote x^0 = Λ

L2 = {w in Σ*: w has an odd number of characters}
= {x, xxx, xxxxx, …}
= {x^n | n = 1, 3, 5, …}
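
As a rough sketch, the languages over Σ = {x} can be listed in Python up to a cut-off. The helper name `power` and the bound `MAX_N` are assumptions made for illustration; the real languages are infinite.

```python
def power(letter: str, n: int) -> str:
    """Return letter^n: the letter repeated n times, so power(x, 0) is Λ."""
    return letter * n

MAX_N = 7  # arbitrary cut-off for listing the first few words

# Σ* starts at n = 0, L1 starts at n = 1.
sigma_star = [power("x", n) for n in range(0, MAX_N + 1)]
L1 = [power("x", n) for n in range(1, MAX_N + 1)]

# L2 keeps the words of Σ* with an odd number of characters.
L2 = [w for w in sigma_star if len(w) % 2 == 1]

print(sigma_star)
print(L1)
print(L2)
```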
Languages
• Operations on Words:

Length – number of letters in a word


length(xxxxx) = 5
length(1025)=4
length(Λ)=0

Σ= {0, 1}
L1 = set of all words in Σ∗ starting with 1 and with length
at most three
= {1, 10, 11, 101, 100, 110, 111}
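
The language L1 above can be reproduced by brute force, assuming words are Python strings. The helper `words_up_to` is a hypothetical name; it lists every word over an alphabet up to a given length, and a filter keeps those that start with 1.

```python
from itertools import product

SIGMA = ["0", "1"]

def words_up_to(alphabet, max_len):
    """Yield every word over alphabet with length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# Words starting with 1 and with length at most three.
L1 = [w for w in words_up_to(SIGMA, 3) if w.startswith("1")]
print(L1)
```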
Languages

Reverse: the word spelled backwards


reverse(xxx)=xxx
reverse(157)=751
reverse(acb)=bca
Languages
Concatenation of two words – two words written down side
by side. A new word is formed.
u = xx, v = xxx, uv = xxxxx
u = abb, v = aa, uv = abbaa

u = u_1 u_2 … u_m, v = v_1 v_2 … v_n
uv = u_1 u_2 … u_m v_1 v_2 … v_n

factor – one of the words in a concatenation


Property: length(uv) = length(u) + length(v)
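
The three word operations (length, reverse, concatenation) map directly onto Python string operations. The function names below simply mirror the slides and are not taken from any library.

```python
def length(word: str) -> int:
    """Number of letters in the word; length(Λ) is 0."""
    return len(word)

def reverse(word: str) -> str:
    """The word spelled backwards."""
    return word[::-1]

def concat(u: str, v: str) -> str:
    """The two words written side by side."""
    return u + v

u, v = "abb", "aa"
uv = concat(u, v)
print(uv)
print(length(uv) == length(u) + length(v))   # the length property
print(reverse("acb"))
```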
Languages

• Concatenation of two languages:


The concatenation of two languages L1 and L2, L1L2, is the set of all
words which are a concatenation of a word in L1 with a word in L2.
L1L2 = {uv: u is in L1 and v is in L2}

Example: Σ={0, 1}
L1 = {u in Σ∗: the number of zeros in u is even}
L2 = {u in Σ∗: u starts with a 0 and all the remaining characters are 1’s}
L1L2 = {u in Σ∗: the number of zeros in u is odd}
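
The claim in this example can be sanity-checked by brute force on short words. This is a bounded check under an assumed cut-off `MAX_LEN`, not a proof; all helper names are hypothetical.

```python
from itertools import product

def words_up_to(alphabet, max_len):
    """Yield every word over alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def in_L1(u):
    """The number of zeros in u is even."""
    return u.count("0") % 2 == 0

def in_L2(u):
    """u starts with a 0 and all the remaining characters are 1's."""
    return u.startswith("0") and set(u[1:]) <= {"1"}

def concat_languages(A, B):
    """All words uv with u in A and v in B."""
    return {u + v for u in A for v in B}

MAX_LEN = 6  # arbitrary bound that keeps the check finite
universe = list(words_up_to("01", MAX_LEN))
L1 = {u for u in universe if in_L1(u)}
L2 = {u for u in universe if in_L2(u)}

# Keep only concatenations that stay within the bound, then compare.
product_words = {w for w in concat_languages(L1, L2) if len(w) <= MAX_LEN}
odd_zeros = {w for w in universe if w.count("0") % 2 == 1}
print(product_words == odd_zeros)
```

The intuition: every word of L2 contains exactly one 0, so appending it to a word with an even number of zeros gives an odd number.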
Languages

• closure of an alphabet Σ, Kleene star *


Given an alphabet Σ, the closure of Σ (or Kleene star), denoted
Σ*, is the language containing all words made up of finite
sequences of letters from Σ, including the empty string Λ.
• Examples:
Σ = {x} Σ* = {Λ, x, xx, xxx, …}
Σ = {0, 1} Σ* = {Λ, 0, 1, 00, 01, 10, 11, 000, 001, …}
Σ = {a, b, c} Σ* = ?
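
Since Σ* is infinite, one way to explore it is a generator that yields words in order of length. The name `kleene_star` is an assumption for illustration; `islice` takes only the first few words.

```python
from itertools import count, islice, product

def kleene_star(alphabet):
    """Yield the words of Σ* shortest first; this generator never ends."""
    for n in count(0):                       # n = 0 yields the empty word Λ
        for letters in product(sorted(alphabet), repeat=n):
            yield "".join(letters)

# First 13 words over Σ = {a, b, c}: Λ, 3 words of length 1, 9 of length 2.
first_abc = list(islice(kleene_star({"a", "b", "c"}), 13))
print(first_abc)

# The same generator reproduces the listing for Σ = {0, 1}.
first_01 = list(islice(kleene_star({"0", "1"}), 9))
print(first_01)
```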
Languages

• closure or Kleene star of a set of words (a language)


It is a generalization of Σ∗ to a more general set of words.

Let Σ be an alphabet and let L be a set of words on Σ.


L* is the language formed by concatenating words from L,
including the empty string Λ.
L* = {u in Σ*: u = u_1 u_2 … u_m, where m ≥ 0 and u_1, u_2, …, u_m are in L}
Languages
• Examples:
L = {a, ab}
L* = {Λ, a, aa, ab, aaa, aab, aba, aaaa, …}
abaaababa ∈ L* (ab|a|a|ab|ab|a factors)

L* = {Λ plus all sequences of a’s and b’s except those that


start with b and those that contain a double b}
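
Membership in L* can be tested by checking which prefixes of a word split into factors from L. The sketch below uses a simple dynamic-programming table; `in_star` and `matches_description` are hypothetical names, and the comparison with the verbal description is only checked for words up to length 8.

```python
from itertools import product

def in_star(word, language):
    """True if word is a concatenation of zero or more words of language."""
    # reachable[i] is True when word[:i] splits into factors from language.
    reachable = [True] + [False] * len(word)
    for i in range(len(word)):
        if reachable[i]:
            for factor in language:
                if factor and word.startswith(factor, i):
                    reachable[i + len(factor)] = True
    return reachable[len(word)]

L = {"a", "ab"}

def matches_description(word):
    """Does not start with b and does not contain a double b."""
    return not word.startswith("b") and "bb" not in word

# Compare both tests on every word over {a, b} of length 0..8.
agree = all(
    in_star("".join(letters), L) == matches_description("".join(letters))
    for n in range(9)
    for letters in product("ab", repeat=n)
)
print(in_star("abaaababa", L), agree)
```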
Languages
• L = {xx, xxx}
xxxxxxx ∈ L*
xx|xx|xxx xx|xxx|xx xxx|xx|xx

L* = {Λ and all sequences of more than one x}
= {x^n : n ≠ 1}
If L = Φ then L* = {Λ}

• The Kleene closure L* of a language L always produces an
infinite language unless L is empty or L = {Λ}.
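
For L = {xx, xxx}, a word x^n is in L* exactly when n can be built from 2's and 3's. A small recursive count, sketched here under the hypothetical name `count_factorizations`, gives the number of ordered ways to do that.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_factorizations(n: int) -> int:
    """Count the ordered ways to write x^n as a sequence of xx and xxx."""
    if n == 0:
        return 1      # the empty sequence of factors gives Λ
    if n < 0:
        return 0      # overshot: no factorization along this branch
    # The first factor is either xx (length 2) or xxx (length 3).
    return count_factorizations(n - 2) + count_factorizations(n - 3)

# Lengths n below 12 for which x^n belongs to L*.
lengths = [n for n in range(12) if count_factorizations(n) > 0]
print(lengths)
print(count_factorizations(7))   # ordered factorizations of xxxxxxx
```

Only n = 1 is missing from the list, and x^7 has three factorizations, matching the three shown on the slide.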
Languages
• Definition: L+, Σ+
L+ = the language of all concatenations that contain at least
1 word from L
Σ+ = the language of all words that contain at least
1 letter from Σ
(L+ = L* without Λ, provided Λ is not in L)

If Λ is a member of L, L* = L+. Otherwise L* = L+ + {Λ}.

• Examples:
Σ = {x} Σ+ = {x, xx, xxx, …}
S = {aa, bbb, Λ} S+ = {aa, bbb, Λ, aaaa, aabbb, …}
( aΛ = a)
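
A bounded version of L+ can be computed by repeatedly appending words of L. This is a sketch with hypothetical helper names; the length bound is needed only because L+ is usually infinite.

```python
def plus_up_to(language, max_len):
    """Return the words of L+ whose length is at most max_len."""
    result = set()
    frontier = {w for w in language if len(w) <= max_len}
    while frontier:
        result |= frontier
        # Extend each newly found word by one more word of the language.
        frontier = {u + v for u in frontier for v in language
                    if len(u + v) <= max_len} - result
    return result

def star_up_to(language, max_len):
    """L* always contains Λ, whether or not L does."""
    return plus_up_to(language, max_len) | {""}

S = {"aa", "bbb", ""}           # Λ is a member of S
print(sorted(plus_up_to(S, 5), key=len))
print(plus_up_to(S, 5) == star_up_to(S, 5))         # S+ and S* agree
print("" in plus_up_to({"x"}, 3))                   # Λ is not in Σ+ for Σ = {x}
```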
Languages
• Example:
S = {a, b, ab} T = {a, b, bb}

S* = T* although S ≠ T
ab|a|a|ab|ab|a
a|b|a|a|a|b|a|b|a
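
The two factorizations above are of the same word, abaaababa, once using factors from S and once using factors from T. A small backtracking sketch, with the hypothetical name `factorizations`, lists every way to split a word into factors from a given set.

```python
def factorizations(word, language):
    """Return every way to split word into factors drawn from language."""
    if word == "":
        return [[]]                     # Λ has exactly one, empty, factorization
    results = []
    for factor in sorted(language):
        if factor and word.startswith(factor):
            # Fix this first factor, then split the remainder recursively.
            for rest in factorizations(word[len(factor):], language):
                results.append([factor] + rest)
    return results

S = {"a", "b", "ab"}
T = {"a", "b", "bb"}
word = "abaaababa"

for split in factorizations(word, S):
    print("|".join(split))
print(len(factorizations(word, S)), len(factorizations(word, T)))
```

Both sets contain a and b, so every string of a's and b's can be split letter by letter in either one; this is why S* and T* coincide even though S ≠ T.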
EVEN-EVEN-Language

• All the strings that contain an even number of a’s and an


even number of b’s.

• e.g.
Λ aa bb aaaa aabb abab abba baab
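
One way to test membership in EVEN-EVEN is to read the word once and track only the parity of each letter count. This is a minimal sketch; the function name is an assumption.

```python
def is_even_even(word: str) -> bool:
    """True if word over {a, b} has an even number of a's and of b's."""
    odd_a = odd_b = False          # parities so far; Λ has zero of each
    for letter in word:
        if letter == "a":
            odd_a = not odd_a
        elif letter == "b":
            odd_b = not odd_b
        else:
            return False           # letter outside the alphabet {a, b}
    return not odd_a and not odd_b

examples = ["", "aa", "bb", "aaaa", "aabb", "abab", "abba", "baab"]
print(all(is_even_even(w) for w in examples))
print(is_even_even("ab"))
```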
DOUBLEWORD-Language
• All the strings that are formed by two copies of a string joined
together.
• {ss : s is a string of a's and b's}

• e.g.
Λ aa bb aaaa abab baba bbbb
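
A membership test for DOUBLEWORD only needs to compare the two halves of the word. A minimal sketch, with an assumed function name:

```python
def is_doubleword(word: str) -> bool:
    """True if word has the form ss for some string s."""
    half, remainder = divmod(len(word), 2)
    # An odd-length word cannot be two copies of the same string.
    return remainder == 0 and word[:half] == word[half:]

examples = ["", "aa", "bb", "aaaa", "abab", "baba", "bbbb"]
print(all(is_doubleword(w) for w in examples))
print(is_doubleword("abba"))
```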
PALINDROME Language
• All the strings that read the same when spelt backwards

Example: PALINDROME
Σ = {a, b}
PALINDROME = {Λ and all w in Σ* | reverse(w) = w}
= {Λ, a, b, aa, bb, aaa, aba, bbb, bab, … }
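
The definition reverse(w) = w translates directly into code. The sketch below lists the palindromes over {a, b} of length at most 3; the bound is an arbitrary choice for illustration.

```python
from itertools import product

def is_palindrome(word: str) -> bool:
    """True if the word equals its own reverse."""
    return word == word[::-1]

# Enumerate words over {a, b} by length and keep the palindromes.
palindromes = [
    "".join(letters)
    for n in range(4)
    for letters in product("ab", repeat=n)
    if is_palindrome("".join(letters))
]
print(palindromes)
```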
