
25/10/15

Language and models


Based on Chapter 8 of
Logic and Language Models for Computer Science
by Hamburger and Richards

Programming languages and CS


Programming languages:
How do they take the forms they do?
How does a computer process your programs?

To gain insight:
Model language
Model mechanisms for processing it

Language models
Language is about:
Structure
Meaning

Formal models of language:

Model: strips away detail of a topic

Throw out language meaning


Ambiguity and language design


Natural language: loose design
Vagueness
E.g.: boundary of the meaning of a chair
Contrast to integers and reals in a programming language

Ambiguity
Ambiguous words: figure out meaning from context
E.g.: bank
Contrast to keywords in certain programming languages

Ambiguity at the level of phrases and sentences

Ambiguity at the level of sentences


E.g.:
I used a robot with one hand
old men and children

Ambiguity at the level of sentences


E.g. in a programming language or math:
log x + y

Ignore precedence rules and parentheses

log = log_2

x = y = 2^5

(log x) + y = 37
log (x + y) = 6
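The two readings of log x + y can be checked numerically. This is a minimal sketch, assuming the slide's values are log = log_2 and x = y = 2^5 = 32 (an assumption that matches the 37 shown above); the variable names are illustrative only.

```python
import math

# Assumed values: x = y = 2^5, logarithm taken base 2.
x = y = 2 ** 5

# Reading 1: the logarithm applies to x only.
log_then_add = math.log2(x) + y

# Reading 2: the logarithm applies to the whole sum.
add_then_log = math.log2(x + y)

print(log_then_add)  # 37.0
print(add_then_log)  # 6.0
```

The same symbol sequence gives two different values, which is why a programming language needs precedence rules or parentheses to pick one reading.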


Language models
Language models provide insight:
Ambiguity
Efficiency of language processing
Computational complexity
Important: Efficiency of processing a programming language

Formal languages
Formal models of languages focus on:
Sequences of symbols (strings)

Definition: alphabet Σ
Finite set of symbols

Definition: formal language
Set of strings

From now on: simply "language"

Alphabet and language


English:
Alphabet: huge (all English words)

Programming language:
Alphabet: smaller but still too big to write down

Convenience:
We use smaller symbol sets


Alphabet and language


E.g.:

Length of strings
Length of string:
The number of symbols in it
E.g.: |abc| = 3

Empty string Λ
|Λ| = 0
Λ is a string, not a symbol!
Not listed in Σ

Length of strings

E.g.:
Finite set
Infinite set

Variable names in: Fortran, Pascal


Σ*:
Language consisting of all possible strings over the symbol set Σ

Any language over Σ is a subset of Σ*

E.g.:
Any language over {a, b} is a subset of {a, b}*
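Since the set of all strings over an alphabet is infinite, a program can only list a finite part of it. The sketch below enumerates every string over {a, b} up to a chosen length and checks that a small example language is a subset. The function name `strings_up_to` and the sample language are hypothetical.

```python
from itertools import product

def strings_up_to(sigma, max_len):
    """Return all strings over sigma with length <= max_len, shortest first."""
    result = []
    for n in range(max_len + 1):
        # product(..., repeat=0) yields one empty tuple: the empty string.
        for symbols in product(sorted(sigma), repeat=n):
            result.append("".join(symbols))
    return result

sigma = {"a", "b"}
finite_part = strings_up_to(sigma, 2)
print(finite_part)  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']

# A hypothetical language over {a, b}: it is a subset of all strings over {a, b}.
language = {"ab", "ba"}
print(language <= set(finite_part))  # True
```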

Set formats
Extensional format: (i)
Explicit list of elements

Intensional format: (ii), (iii)

E.g.:

Exercise
Book p154: 8.8 (a)
Book p154: 8.9 (a)


Operations on languages
L1, L2: languages

Languages are sets:
Union: L1 ∪ L2
Intersection: L1 ∩ L2
Difference: L1 \ L2
Symmetric difference: L1 △ L2
Complement

Each language L has a complement Σ* \ L
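Because languages are sets, finite languages map directly onto Python sets. The sketch below uses two hypothetical languages over {a, b}. The complement is taken relative to the strings of length at most 2 only, since the full set of strings over the alphabet is infinite.

```python
from itertools import product

# Two hypothetical finite languages over {a, b}.
L1 = {"a", "ab"}
L2 = {"ab", "b"}

print(sorted(L1 | L2))  # union: ['a', 'ab', 'b']
print(sorted(L1 & L2))  # intersection: ['ab']
print(sorted(L1 - L2))  # difference: ['a']
print(sorted(L1 ^ L2))  # symmetric difference: ['a', 'b']

# Bounded stand-in for the set of all strings: length <= 2 only.
universe = {"".join(p) for n in range(3) for p in product("ab", repeat=n)}
complement_L1 = universe - L1
print(sorted(complement_L1))  # ['', 'aa', 'b', 'ba', 'bb']
```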

Empty language
Has no strings
Empty language: ∅
Language with the empty string: {Λ}
Has 1 string, of length 0

Other example: {Λ, a, aa}
Has 3 strings, of length 0, 1, 2

Concatenation of languages
Definition: Concatenation of strings
x and y are strings: xy is the concatenation of x and y
x is a string: xΛ = Λx = x
x is a string: x x^k = x^(k+1)
x is a string: x^0 = Λ

Not commutative
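The string identities above can be tried out with Python strings, where `+` is concatenation, `""` plays the role of the empty string, and `x * k` repeats x k times. The sample strings are arbitrary.

```python
x, y = "ab", "ba"
empty = ""  # the empty string

print(x + y)                              # abba
print(x + empty == x and empty + x == x)  # True: the empty string is the identity
print(x * 3)                              # ababab, i.e. x to the power 3
print(x + x * 2 == x * 3)                 # True: x x^k = x^(k+1) for k = 2
print(x * 0 == empty)                     # True: x to the power 0 is the empty string
print(x + y == y + x)                     # False: concatenation is not commutative
```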


Exercise
Book p154: 8.7 (b)

Concatenation of languages
Definition: Concatenation of languages
For languages L1 and L2: written as L1L2
L1L2 = {xy | x ∈ L1 ∧ y ∈ L2}

E.g.:

Not commutative

Possible concatenations

L1L2:
2 times 3 unique concatenations

L2L2:
3 times 3 concatenations
Only 5 unique concatenations

Exercise: give the set L2L2
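The counting argument can be reproduced with a small set comprehension. The slide's own example languages are not recoverable here, so L1 and L2 below are hypothetical stand-ins, chosen to have 2 and 3 strings and to give the same counts.

```python
def concat(A, B):
    """Concatenation of languages: every x in A followed by every y in B."""
    return {x + y for x in A for y in B}

# Hypothetical languages (not the slide's originals).
L1 = {"b", "bb"}
L2 = {"", "a", "aa"}

print(len(concat(L1, L2)))  # 6: all 2 * 3 concatenations are distinct
print(len(concat(L2, L2)))  # 5: the 3 * 3 concatenations contain duplicates
print(concat(L1, L2) == concat(L2, L1))  # False: not commutative
```

Duplicates arise in L2L2 because different pairs can produce the same string, for example the empty string followed by aa and a followed by a.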


Exercise
Book p154: 8.11
Book p154: 8.12 (a)

Exercise
Book p155: 8.18 (a)
L=

Superscript for languages


Definition: Superscript for languages
For each language L:
L^0 = {Λ}
L^(k+1) = L L^k, for any integer k ≥ 0
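The recursive definition translates into a short loop. This is a sketch with hypothetical helper names `concat` and `power`, and an arbitrary example language.

```python
def concat(A, B):
    """Concatenation of two languages given as sets of strings."""
    return {x + y for x in A for y in B}

def power(L, k):
    """L to the power k: start from the language holding only the empty string
    and prepend L a total of k times."""
    result = {""}
    for _ in range(k):
        result = concat(L, result)
    return result

L = {"a", "bb"}
print(power(L, 0))          # {''}
print(sorted(power(L, 1)))  # ['a', 'bb']
print(sorted(power(L, 2)))  # ['aa', 'abb', 'bba', 'bbbb']
```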


Closure for languages


Definition: Closure for languages
L* = ⋃_{i=0}^{∞} L^i
Star (Kleene) operator

E.g.:

Consistent with Σ*

Closure for languages


For any language L (except ∅ or {Λ}),
L* has infinitely many strings
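The closure is usually infinite, so a program can only list its strings up to some length bound. The sketch below does that by repeatedly appending strings of L; the function name `star_up_to` and the example language are hypothetical.

```python
def star_up_to(L, max_len):
    """Strings of the closure of L whose length is at most max_len."""
    result = {""}      # the zeroth power always contributes the empty string
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string of L, keep only new ones.
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

L = {"a", "bb"}
print(sorted(star_up_to(L, 3), key=lambda s: (len(s), s)))
# ['', 'a', 'aa', 'bb', 'aaa', 'abb', 'bba']

# The two exceptional cases: the closure is just the empty string.
print(star_up_to(set(), 5))  # {''}
print(star_up_to({""}, 5))   # {''}
```

Raising `max_len` keeps producing new strings for any language that contains a non-empty string, which is the finite shadow of the claim that the closure is then infinite.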

Exercise
Book p154: 8.6
Book p155: 8.16 (a)


2 levels and 2 language classes


Human language:
Symbols: individual words
Σ is called a vocabulary
Strings of the language: sentences

On the other hand
Each word:
String composed of letters
Σ is called the alphabet
Lower level of analysis

2 levels and 2 language classes


Also for programming languages:
Strings of characters: tokens
Tokens can serve as the symbols in Σ
E.g. in Java: for, <=, try

Other examples:
Positive integer tokens:
English:

2 levels and 2 language classes


One is interested in one particular symbol set at any particular time
Even if that symbol set has multi-char symbols, these can still be represented by simple symbols
E.g.: {a, b, c} or {1, 0}


2 levels and 2 language classes


Combining tokens to form statements is more complex than combining characters to make tokens

Language classes:

Class of regular languages
Conceptual basis for lexical scanning
Impossible to use them for analysing how programming statements are constructed from tokens

Class of context-free languages
Has notion of recursive structure
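Lexical scanning, the lower level, can be sketched with a regular expression that groups characters into tokens. The token pattern below is a hypothetical, much simplified one (identifiers and keywords, integers, and a few operators); it is not Java's actual lexical grammar.

```python
import re

# Hypothetical token pattern: names, integers, "<=", then single-char symbols.
# "<=" is listed before "<" and "=" so it is matched as one token.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|[0-9]+|<=|[<=+;(){}]")

def scan(source):
    """Split source text into tokens; unmatched characters such as spaces are skipped."""
    return TOKEN_PATTERN.findall(source)

tokens = scan("for (i = 0; i <= 10; i = i + 1)")
print(tokens)
# ['for', '(', 'i', '=', '0', ';', 'i', '<=', '10', ';', 'i', '=', 'i', '+', '1', ')']
```

The scanner only chops the text into tokens. Deciding whether those tokens form a well-built statement is the higher-level task that needs the context-free class.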

Notion of recursive structure

Java statement

Inner statement bears no restrictions
Can be a for statement again

Statement within a statement

Recursive construct
Definition: Recursive construct
A construct is recursive if it can contain within
itself a structure of the same type
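A recursive construct can be modelled as a data type that refers to itself. The sketch below uses two hypothetical statement classes, `Assign` and `For`, where the body of a `For` is again any statement; it is an illustration, not a model of real Java syntax.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Assign:
    target: str          # a simple, non-recursive statement

@dataclass
class For:
    variable: str
    body: "Statement"    # the body is again a statement: the recursion

Statement = Union[Assign, For]

def nesting_depth(stmt):
    """Count how many for statements are nested inside one another."""
    if isinstance(stmt, For):
        return 1 + nesting_depth(stmt.body)
    return 0

nested = For("i", For("j", Assign("x")))
print(nesting_depth(nested))  # 2
```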


Questions of formal language theory

How is a language specified?
Extensional format:
Explicitly enumerate the elements

Intensional format:
Uses a specification of the properties of the strings it contains
Another example:

N_a(x) denotes the number of occurrences of the symbol a in the string x
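An intensional specification is a membership property, which in code is a predicate. The slide's own example is not recoverable, so the sketch below assumes a hypothetical language over {a, b}: all strings with as many a's as b's. The helper `N` mirrors the occurrence-count notation.

```python
from itertools import product

def N(symbol, x):
    """Number of occurrences of symbol in the string x."""
    return x.count(symbol)

def in_language(x):
    """Hypothetical property: x has equally many a's and b's."""
    return N("a", x) == N("b", x)

# Extensional view of the part of the language with length <= 2.
candidates = ["".join(p) for n in range(3) for p in product("ab", repeat=n)]
members = [x for x in candidates if in_language(x)]
print(members)  # ['', 'ab', 'ba']
```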

Computational questions

Questions of formal language theory

How can we generate the sentences of a language?
Use a representation to:
Generate an unlimited number of strings of the infinite language

Such that:
Any particular string would ultimately occur

Grammar
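The requirement that any particular string ultimately occurs can be met by expanding grammar rules breadth-first. The sketch below assumes a hypothetical one-rule grammar, S -> aSb | (empty string), encoded as a Python dict; both the grammar and the encoding are illustrative assumptions.

```python
from collections import deque
from itertools import islice

# Hypothetical grammar: S -> aSb | empty string. Uppercase S is the only nonterminal.
RULES = {"S": ["aSb", ""]}

def generate(start="S"):
    """Yield the strings of the language, expanding derivations breadth-first."""
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, c in enumerate(form) if c in RULES), None)
        if idx is None:
            yield form  # only terminals left: a string of the language
            continue
        for rhs in RULES[form[idx]]:
            queue.append(form[:idx] + rhs + form[idx + 1:])

first_four = list(islice(generate(), 4))
print(first_four)  # ['', 'ab', 'aabb', 'aaabbb']
```

The generator never finishes on its own, so `islice` is used to take a finite prefix. The breadth-first order is what guarantees that every derivable string is reached after finitely many steps.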

Computational questions

Questions of formal language theory

How can we recognize the sentences of a language?
Given language L and any string x over the alphabet of the language, how can we determine whether x ∈ L?
Automaton built for L
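Recognition can be sketched as a small state machine that reads the string one symbol at a time. The example assumes a hypothetical language over {a, b}, the strings with an even number of a's; the state names and transition table are made up for illustration.

```python
# Hypothetical automaton: the state records whether the number of a's seen is even or odd.
TRANSITIONS = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

def accepts(x, start="even", accepting=("even",)):
    """Return True if x (a string over {a, b}) is in the language."""
    state = start
    for symbol in x:
        state = TRANSITIONS[(state, symbol)]
    return state in accepting

print(accepts("abba"))  # True: two a's
print(accepts("ab"))    # False: one a
print(accepts(""))      # True: zero a's
```

The function assumes x is a string over the alphabet, as in the question above; a symbol outside {a, b} raises a `KeyError`.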


Computational questions

Questions of formal language theory


What is the best way to specify a language?
3 ways to represent:
Intensional format
Grammar
Automaton

Correspondence among them


E.g.:
Use grammar on L to produce an automaton that recognizes L
Translate automaton to a regular expression
Mechanical translations

Language classes
There are an infinite number of different languages over Σ
Some are more complex than others

E.g.:

Automaton is more challenging
Grammar is more challenging

Language classes
Definition: Language class
Set of languages that can be recognized by some automaton that abides by certain restrictions

Parallel to this:
restrictions on grammars can be specified

The same language classes can be defined in terms of automata and grammars


Language classes

Sets

Every regular language is a context-free language
