
Informal languages

 Natural languages are generally defined informally


 The human brain is capable of understanding incoherent,
even invalid, sentences.
 You mangoes like
 We school daily go to
 Rectify grammatical errors etc.
 Resolve ambiguity
 Interpret according to context
 Supporting aids such as facial expressions and body language
How to Communicate with Machines?

 Need a language, but of what sort?
 Machines do not have a human mind, though they may
partially imitate one
 They would fail on incorrect or ambiguous input
 Some recovery or input correction may be offered,
but it is very limited
 Thus we need a precise, explicit and universal definition
of the communication language
Summary of Languages
 Three aspects/specifications
 Lexical

 Defines valid words/units of a language

 Syntactic

 Defines rules for combining the units to form valid
sentences (computer programs in the context of machines)
 Semantic

 Concerned with the interpretation or meaning of a
sentence (what output to produce in the context of machines)
 Affected by ambiguity the most.
Formal Languages
 The word “formal” refers to the fact that all the rules
of the language are explicitly stated in terms of
which strings of symbols can occur
 No ambiguities
 Universally uniform understanding
 This lets the machine
 Interpret an input uniformly every time, i.e., always
produce the same output for a particular input
 Avoid crashes caused by ambiguity
 Explicitly reject invalid input

Languages
Symbols
 A symbol is an entity or individual
object; it can be any letter, digit
or picture.
 The symbol is the basic building block of
the theory of computation (TOC).
 Example: a symbol can be any of
a, b, c, A, B, Z, 0, 1, …

Formal Languages
 Need precise uniformly understandable notation
 Representations
 An alphabet is a finite set of symbols. It is denoted by
∑ (sigma)
 Examples:
 Binary: ∑ = {0,1}

 All lower case letters: ∑ = {a,b,c,…,z}

 Alphanumeric: ∑ = {a-z, A-Z, 0-9}

 DNA molecule letters: ∑ = {a,c,g,t}

 A certain specified set of strings of characters from
the alphabet is called a language (a set of words)
Formal Languages
 List of words
 Set of all valid words of a given language, e.g., a

language English_Words that contains all valid words of


English would have a  = {all entries of the dictionary
+ punctuation marks and blank space}
 Denoted by 

 Is  Finite or Infinite set.

 Strings:
 A concatenation of finitely many symbols from the alphabet is
called a string.
 A string is a finite sequence of symbols chosen from an
alphabet.
 Example: if Σ = {a,b} then a, abab, aaab,
ababababa, … are strings over Σ.
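
The definition of a string over an alphabet can be checked mechanically. The following is a minimal sketch in Python, assuming symbols are single characters; the function name `is_string_over` is hypothetical and chosen only for illustration.

```python
def is_string_over(s, alphabet):
    """Return True if every symbol of s belongs to the given alphabet."""
    return all(symbol in alphabet for symbol in s)


sigma = {"a", "b"}
print(is_string_over("abab", sigma))  # True
print(is_string_over("abca", sigma))  # False: 'c' is not in the alphabet
```

Under this check the string with no symbols also counts as a string over any alphabet, since it contains no symbol outside it.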
Formal Languages
 Empty String or Null String
 The empty string is the string that does not contain any
symbol; its length is 0.
 It is not the same as the empty set Ø: the set {ε} contains
exactly one string, namely ε.
 It is denoted by ε or by the capital Greek letter lambda, Λ.
 Words
 In spoken languages not all strings are words.
 Example: in English, the string abcd does not
form any word.
 Words are strings belonging to some language.
 Example: if Σ={x} then a language L can be defined as
L={x^n : n=1,2,3,…}, that is, L={x, xx, xxx, xxxx, …}. Here x, xx,
xxx, … are the words of L.
 Note: Not all strings are words but all words are strings
Formal Languages
 String Variable:
 A letter used for denoting a string.
 We use letters such as w and x as string variables. For example
 w = 0111100
 String Length:
 The number of positions for symbols in the
string. For simplicity we can say that it is the
number of symbols in the string. For example, with w = 0111100,
 |w| = 7

Formal Languages
 Reverse of a string
 The reverse of a string s, denoted by rev(s), is
obtained by writing the letters of s in reverse
order.
 Example: if s=abc is a string defined over
Σ={a,b,c} then rev(s)=cba
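
String length and reversal translate directly into code. The sketch below assumes strings are ordinary Python `str` values with one character per symbol; `length` and `rev` mirror the names used on the slides.

```python
def length(w):
    """Number of symbols in the string w, written |w| on the slides."""
    return len(w)


def rev(s):
    """The string s with its symbols written in reverse order."""
    return s[::-1]


w = "0111100"
print(length(w))   # 7
print(rev("abc"))  # cba
```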

Defining Languages
 Define alphabet set
 Define rules for forming valid words and sequences
of words from 
 Called grammar
 Can be descriptive
 Limitations of informalism
 Can be mathematical
 Can also define supporting functions, e.g., length(x), reverse(x)

Finite vs. Infinite Languages
 Finite Languages
 Finite set of words
 Can be defined by rigorously listing the words
in 
 E.g. English_Words
 Infinite Languages
 Infinite set of valid words
 Cannot be listed completely
 E.g. English_Sentences

Σ* Kleene Star
 The set of strings created from any
number (0 or 1 or …) of symbols in an
alphabet Σ is denoted by Σ*.
 That is, Σ* = ∪_{i=0}^{∞} Σ^i, where Σ^i is the set of
strings of length i over Σ.
 Let Σ = {0, 1}.
 Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010,
011, … }.

Kleene Closure
 Examples
 If Σ = {x} then
 Σ* = {ε, x, xx, xxx …}
 If Σ = {0, 1} then
 Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, …}
 If Σ = {a, b, c} then
 Σ* = {ε, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc,
…}
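
Because Σ* is infinite, a program can only list a finite prefix of it. The following sketch enumerates all strings up to a chosen length, shortest first; the function name `kleene_star` and the `max_len` cut-off are assumptions made for illustration.

```python
from itertools import product


def kleene_star(alphabet, max_len):
    """List all strings over `alphabet` of length 0..max_len, shortest first."""
    symbols = sorted(alphabet)
    words = []
    for n in range(max_len + 1):
        # product(..., repeat=0) yields one empty tuple, i.e. the empty string
        for combo in product(symbols, repeat=n):
            words.append("".join(combo))
    return words


print(kleene_star({"0", "1"}, 2))
# ['', '0', '1', '00', '01', '10', '11']
```

The empty Python string `''` plays the role of ε here.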

Σ+ Positive Closure
 The set of strings created from at least one symbol
(1 or 2 or …) in an alphabet Σ is denoted by Σ+.
 That is, Σ+ = ∪_{i=1}^{∞} Σ^i
= ∪_{i=0}^{∞} Σ^i − Σ^0
= ∪_{i=0}^{∞} Σ^i − {ε}
 Let Σ = {0, 1}. Σ+ = {0, 1, 00, 01, 10, 11, 000, 001, 010,
011, … }.
Σ* and Σ+ are infinite sets (for any non-empty alphabet Σ).

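
The relation between the positive closure and the Kleene star (Σ+ is Σ* without ε) can be checked on a finite prefix of both sets. This is a sketch under the assumption that strings are truncated at a maximum length; the helper names `strings_of_length`, `star_up_to` and `plus_up_to` are hypothetical.

```python
from itertools import product


def strings_of_length(alphabet, n):
    """All strings of exactly n symbols over the alphabet (the set Σ^n)."""
    return {"".join(c) for c in product(sorted(alphabet), repeat=n)}


def star_up_to(alphabet, max_len):
    """Finite prefix of Σ*: strings of length 0..max_len."""
    result = set()
    for n in range(0, max_len + 1):
        result |= strings_of_length(alphabet, n)
    return result


def plus_up_to(alphabet, max_len):
    """Finite prefix of Σ+: strings of length 1..max_len."""
    result = set()
    for n in range(1, max_len + 1):
        result |= strings_of_length(alphabet, n)
    return result


sigma = {"0", "1"}
assert plus_up_to(sigma, 3) == star_up_to(sigma, 3) - {""}
print(sorted(plus_up_to(sigma, 2), key=lambda w: (len(w), w)))
# ['0', '1', '00', '01', '10', '11']
```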
Defining Languages
 We can define the operation of
concatenation
 x^n concatenated with x^m is the new word x^(n+m)
 We can define a language that contains ε
 L = {ε, x, xx, xxx, xxxx, …}
= {x^n for n = 0, 1, 2, 3, …}
 Here x^0 = ε, and not x^0 = 1
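
The rule for concatenating powers of x can be illustrated with Python string repetition. This is a small sketch; the helper name `power` is hypothetical, and the empty Python string stands in for ε.

```python
def power(x, n):
    """Return x^n: x concatenated with itself n times (x^0 is the empty string)."""
    return x * n


n, m = 2, 3
# x^n concatenated with x^m gives x^(n+m)
assert power("x", n) + power("x", m) == power("x", n + m)
# x^0 is the empty string, not the number 1
assert power("x", 0) == ""
print(power("x", n) + power("x", m))  # xxxxx
```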

Languages
L is said to be a language over alphabet ∑ if and only if L ⊆ ∑*
 This is because ∑* is the set of all strings (of all possible
lengths, including 0) over the given alphabet ∑
Examples:
1. Let L be the language of all strings consisting of n 0’s
followed by n 1’s:
L = {ε, 01, 0011, 000111, …}
2. Let L be the language of all strings with an equal number of
0’s and 1’s:
L = {ε, 01, 10, 0011, 1100, 0101, 1010, 1001, …}
Canonical ordering of strings in the language

Definition: Ø denotes the Empty language


 Let L = {}; Is L=Ø?
NO
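
A language over {0, 1} can be represented in code by a membership test on strings. The sketch below covers the two example languages; the function names are hypothetical, and the last lines illustrate that a language holding only the empty string differs from the empty language.

```python
def in_zeros_then_ones(w):
    """Example 1: n 0's followed by n 1's, for some n >= 0."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def in_equal_zeros_ones(w):
    """Example 2: strings over {0, 1} with as many 0's as 1's."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")


print(in_zeros_then_ones("000111"))  # True
print(in_zeros_then_ones("0101"))    # False
print(in_equal_zeros_ones("0101"))   # True

# A language containing only the empty string has one word;
# the empty language has none.
print(len({""}), len(set()))  # 1 0
```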
Defining Languages
 The language can be defined in different ways, such
as
 Descriptive definition
 Recursive definition
 Using Regular expressions (RE) and
 Using Finite automaton (FA) etc.

Defining Languages
 Descriptive definition
 The language is defined by describing the conditions imposed
on its words.
 Example 1: the language L of strings of odd length, defined
over Σ={a}, can be written as L={a,aaa,aaaaa, …}
 Example 2: the language L of strings that do not start with
a, defined over Σ={a,b,c}, can be written as
L={b,c,ba,bb,bc,ca,cb,cc, …}

Defining Languages
 Example 3: the language L of strings of length 2, defined
over Σ={0,1,2} can be written as
L={00,01,02,10,11,12,20,21,22}
 Example 4: the language L of strings ending in 0, defined
over Σ={0,1} can be written as L={0,00,10,000,010,100,110,
…}
 Example 5: the language EQUAL, of strings with the number of
a’s equal to the number of b’s, defined over Σ={a,b}, can be
written as L={ε,ab,aabb,abab,baba,abba, …}
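
A descriptive definition can be read as a condition that filters the strings over Σ. The following sketch lists, shortest first, the strings up to a chosen length that satisfy a condition; the helper name `words` and the length cut-off are assumptions made for illustration.

```python
from itertools import product


def words(alphabet, max_len, condition):
    """Strings over `alphabet` of length 0..max_len that satisfy `condition`."""
    out = []
    for n in range(max_len + 1):
        for combo in product(sorted(alphabet), repeat=n):
            w = "".join(combo)
            if condition(w):
                out.append(w)
    return out


# Example 3: strings of length 2 over {0, 1, 2}
print(words({"0", "1", "2"}, 2, lambda w: len(w) == 2))
# ['00', '01', '02', '10', '11', '12', '20', '21', '22']

# Example 4: strings ending in 0 over {0, 1}, up to length 3
print(words({"0", "1"}, 3, lambda w: w.endswith("0")))
# ['0', '00', '10', '000', '010', '100', '110']
```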

Defining Languages
 Example 6: the language EVEN-EVEN, of strings with an even
number of a’s and an even number of b’s, defined over Σ={a,b},
can be written as
L={ε, aa,bb,aaaa,aabb,abab,abba,baab,baba,bbaa,bbbb,
…}
 Example 7: the language {a^n b^n}, of strings defined over
Σ={a,b}, as {a^n b^n : n=1,2,3,…}, can be written as {ab, aabb,
aaabbb, …}
 Example 8: the language {a^n b^n a^n}, of strings defined over
Σ={a,b}, as {a^n b^n a^n : n=1,2,3,…}, can be written as {aba, aabbaa,
aaabbbaaa, …}
 Example 9: the language PRIME, of strings defined over Σ={a},
as {a^p : p is prime}, can be written as {aa, aaa, aaaaa, …}
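
Examples 6 to 9 can each be expressed as a membership test. The sketch below is one possible rendering in Python; all function names are hypothetical, and the primality check uses simple trial division.

```python
def is_even_even(w):
    """EVEN-EVEN: an even number of a's and an even number of b's."""
    return (set(w) <= {"a", "b"}
            and w.count("a") % 2 == 0
            and w.count("b") % 2 == 0)


def is_anbn(w):
    """{a^n b^n : n = 1, 2, 3, ...}"""
    n = len(w) // 2
    return n >= 1 and w == "a" * n + "b" * n


def is_anbnan(w):
    """{a^n b^n a^n : n = 1, 2, 3, ...}"""
    n = len(w) // 3
    return n >= 1 and w == "a" * n + "b" * n + "a" * n


def is_prime(p):
    """Trial-division primality test for small integers."""
    if p < 2:
        return False
    return all(p % d != 0 for d in range(2, int(p ** 0.5) + 1))


def in_prime_language(w):
    """PRIME: strings of a's whose length is a prime number."""
    return set(w) <= {"a"} and is_prime(len(w))


print(is_even_even("abba"))          # True
print(is_anbn("aabb"))               # True
print(is_anbnan("aabbaa"))           # True
print(in_prime_language("aaaa"))     # False: 4 is not prime
```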
