
R162217 PRINCIPLES OF
PROGRAMMING LANGUAGES
OBJECTIVES:

• To understand and describe syntax and semantics of programming languages

• To understand data, data types and basic statements

• To understand call-return architecture and ways of implementing them

• To understand object-orientation, concurrency and event handling in programming languages

• To develop programs in non-procedural programming paradigms

UNIT I
SYNTAX AND SEMANTICS

Evolution of programming languages - describing syntax - context free grammars - attribute grammars - describing semantics - lexical analysis - parsing - recursive descent parsing - bottom-up parsing

1.1 Evolution of programming languages

A programming language is an artificial language that can be used to control the behavior of a computer. Programming languages, like human languages, are defined through syntactic and semantic rules that determine structure and meaning respectively. Programming languages are used to facilitate communication about the task of organizing and manipulating information and to express algorithms precisely. Some authors restrict the term "programming language" to those languages that can express all possible algorithms; the term "computer language" is sometimes used for more limited artificial languages.

In the earliest computers, programming was such a laborious task that the vacuum-tube ON-OFF switches had to be set by hand. Developments in technology have since made programming far friendlier to developers.

Machine Language

The computer's own binary-based language, or machine language, is difficult for human beings to use. The programmer is required to input every command and all data in binary form. Machine-language programming is such a tedious, time-consuming task that the time saved in running the program rarely justifies the days or weeks needed to write the program. Machine languages are the most primitive type of computer language.

High Level Languages

The high level languages use English words such as OPEN, LIST and PRINT, which might each stand for a whole sequence of instructions. These commands are entered via the keyboard or from a program in a storage device.

Historical Landmarks

• Programming has its origin in the 19th century, when the first
“programmable” looms and player piano scrolls were developed.

• This was followed in the 20th century by punch cards that encoded data used to direct mechanical processing. In the 1930s and early 1940s, lambda calculus was influential in language design.

• The decade of the 1940s has many landmarks to its credit in the initial development of modern computers and programming languages.

• In the beginning of this decade, the first electrically powered digital computers were created. The first high-level programming language to be designed for a computer was Plankalkul, developed for the German Z3 by Konrad Zuse between 1943 and 1945.

• Programmers of early 1950s computers, notably the UNIVAC I and IBM 701, used machine language programs, i.e., the first generation language (1GL).

• Grace Hopper is credited with implementing the first commercially oriented computer language. After programming an experimental computer at Harvard University, she worked on the UNIVAC I and II computers and developed a commercially usable high-level programming language called FLOW-MATIC.

• 1GL programming was quickly superseded by similarly machine-specific but mnemonic second generation languages (2GL) known as assembly languages or "assembler".

• Later in the 1950s, assembly language programming, which had evolved to include the use of macro instructions, was followed by the development of "third generation" programming languages (3GL) such as FORTRAN, LISP, and COBOL.

• In 1957, IBM developed a language known as FORTRAN (FORmula TRANslator) that would simplify work involving complicated mathematical formulas.

• FORTRAN was the first comprehensive high-level programming language that was widely used. In 1957, the Association for Computing Machinery in the United States started the development of a universal language that would correct some of FORTRAN's shortcomings.

• The next year they released ALGOL (Algorithmic Language), another scientifically oriented language. This was followed by LISP. Originally specified in 1958, LISP is the second-oldest high-level programming language in widespread use today; only FORTRAN is older.

• LISP is a family of computer programming languages with a long history and a distinctive fully-parenthesized syntax. COBOL (Common Business-Oriented Language), a commercial and business programming language, concentrated on data organization and file-handling and is widely used today in business.

• 3GLs are more abstract and "portable", or at least implemented similarly on computers that do not support the same native machine code. Updated versions of all of these 3GLs are still in general use, and each has strongly influenced the development of later languages.

• At the end of the 1950s, the language formalized as ALGOL 60 was introduced, and most later programming languages are, in many respects, descendants of ALGOL. The format and use of the early programming languages were heavily influenced by the constraints of the interface.

• BASIC (Beginner's All-purpose Symbolic Instruction Code) was developed in the early 1960s for use by non-professional computer users.

• LOGO was developed to introduce children to computers. C, a language designed at Bell Laboratories in the 1970s, is widely used in developing systems programs, as is its successor, C++.

• Other languages have been developed to permit programming for Internet applications. The most popular is Java, an object-oriented programming language introduced in 1995 by Sun Microsystems. Java enables the distribution of both data and small applications called applets.

• These applets can be transmitted over the Internet. The specialty of Java is that it is machine independent and can run on any kind of computer.


1.2 Describing Syntax

A language, whether natural (such as English) or artificial (such as Java), is a set of strings of characters from some alphabet. The strings of a language are called sentences or statements.

(Or)

The syntax of a language is the set of rules that defines the form of the language. These rules define how expressions, sentences, statements and program units are formed from the fundamental units known as words or lexemes.

The general form of describing syntax:

• The syntax rules of a language specify which strings of characters from the language's alphabet are in the language.

• The lexemes of a programming language include its numeric literals, operators and special words, among others. One can think of programs as strings of lexemes.

• A token is a category of lexemes.

Example: index = 2 * count + 17;

The lexemes and tokens of this statement are:

Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon
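
As an illustration of this lexeme-to-token mapping, a minimal tokenizer for such a statement can be sketched as follows (a sketch only: the token names follow the table above, and the tokenize function is an assumption for illustration, not part of any particular compiler):

import re

# Token categories paired with the regular expressions that match their lexemes.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier",  r"[A-Za-z_]\w*"),
    ("equal_sign",  r"="),
    ("mult_op",     r"\*"),
    ("plus_op",     r"\+"),
    ("semicolon",   r";"),
    ("skip",        r"\s+"),          # whitespace separates lexemes but is not a token
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))

def tokenize(source):
    """Split a source string into (lexeme, token) pairs."""
    for match in PATTERN.finditer(source):
        if match.lastgroup != "skip":
            yield match.group(), match.lastgroup

print(list(tokenize("index = 2 * count + 17;")))
# [('index', 'identifier'), ('=', 'equal_sign'), ('2', 'int_literal'), ('*', 'mult_op'), ...]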

Formal definition of languages:

In general, languages can be formally defined in two distinct ways: by recognition and by generation.

Language Recognizers

• A recognition device reads input strings of the language and decides whether the input strings belong to the language.

• It only determines whether given programs are in the language.

• Example: the syntax analyzer, also known as the parser, is the part of a compiler that determines whether given programs are syntactically correct.

Language Generators

• A device that generates sentences of a language.

• One can determine whether the syntax of a particular sentence is correct by comparing it to the structure of the generator.

1.2.1 Context free grammars

Inherently recursive structures of a programming language are defined by a context free grammar. A context free grammar is a four-tuple G(V, T, P, S), where V is a finite set of non-terminals, T is a finite set of terminals, and P is a finite set of production rules of the form

A → α

where A is a non-terminal and α is a string of terminals and non-terminals; S is the start symbol. L(G), the language of G (the language generated by G), is the set of sentences of G. A sentence of L(G) is a string of terminal symbols of G: a string ω of terminals of G is a sentence of L(G) if ω can be derived from S. If G is a context free grammar, L(G) is a context free language. Two grammars G1 and G2 are equivalent if they generate the same language. Consider a string α derived from the start symbol S: if α contains non-terminals, it is called a sentential form of G; if α does not contain non-terminals, it is a sentence of G.

A context free grammar consists of the following components:

A set of non-terminals (V):

Non-terminals are syntactic variables that denote sets of strings. The non-terminals define the sets of strings that help to define the language generated by the grammar.

A set of terminal symbols, known as tokens (Σ):

Terminals are the basic symbols from which strings are formed.

A set of productions (P):

The productions of a grammar specify the manner in which the terminals and non-terminals can be combined to form strings. Each production consists of a non-terminal called the left side of the production, an arrow, and a sequence of tokens and/or non-terminals called the right side of the production. One of the non-terminals is designated as the start symbol (S), from which the production of strings begins. The strings are derived from the start symbol by repeatedly replacing a non-terminal with the right side of one of its productions.
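
As a small sketch of these components in code (the grammar chosen and the variable names are illustrative assumptions, not a fixed representation):

# The four components of a toy grammar for expressions over id, + and -.
non_terminals = {"E"}                                   # V
terminals     = {"id", "+", "-"}                        # Σ (the tokens)
productions   = {                                       # P: each non-terminal maps to its right sides
    "E": [["E", "+", "E"], ["E", "-", "E"], ["id"]],
}
start_symbol  = "E"                                     # S

Strings of the language are obtained by starting from start_symbol and repeatedly replacing a non-terminal with one of its right sides, exactly as described above.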

Describing Lists

• Syntactic lists are described using recursion.

<ident_list> → ident
             | ident, <ident_list>

• A rule is recursive if its LHS appears in its RHS; such a rule maps naturally onto a recursive parsing function, as sketched below.
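
A minimal sketch of a recursive recognizer for this rule (the function name, the token spellings and the error handling are assumptions made purely for illustration):

def parse_ident_list(tokens, pos=0):
    """Recognize <ident_list> → ident | ident , <ident_list> over a list of tokens.
    Returns the position just past the list, or raises SyntaxError."""
    if pos >= len(tokens) or tokens[pos] != "ident":
        raise SyntaxError("identifier expected")
    pos += 1                                        # consume 'ident'
    if pos < len(tokens) and tokens[pos] == ",":    # recursive alternative: ident , <ident_list>
        return parse_ident_list(tokens, pos + 1)
    return pos                                      # base alternative: a single ident

# Token sequence for "a, b, c" after lexical analysis:
print(parse_ident_list(["ident", ",", "ident", ",", "ident"]))   # -> 5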

Grammars and derivations

• The sentences of the language are generated through a sequence of applications of the rules, beginning with a special non-terminal of the grammar called the start symbol.

• A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence.

An example grammar:

<program> → <stmts>

<stmts> → <stmt> | <stmt> ; <stmts>

<stmt> → <var> = <expr>

<var> → a | b | c | d

<expr> → <term> + <term> | <term> - <term>

<term> → <var> | const

An example derivation for a simple statement

a = b + const

<program> ⇒ <stmts> ⇒ <stmt>

⇒ <var> = <expr>

⇒ a = <expr>

⇒ a = <term> + <term>

⇒ a = <var> + <term>

⇒ a = b + <term>

⇒ a = b + const

Every string of symbols in the derivation, including <program>, is a sentential form. A sentence is a sentential form that has only terminal symbols. A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded. The derivation continues until the sentential form contains no nonterminals. A derivation may be leftmost or rightmost.
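
To make the connection with recursive descent parsing (covered later in this unit) concrete, here is a minimal recognizer for the single-statement case of the grammar above; <program> and <stmts> are omitted and there is no end-of-input handling, so this is only an illustrative sketch, not a complete parser:

def parse_stmt(tokens):                              # <stmt> → <var> = <expr>
    pos = parse_var(tokens, 0)
    if tokens[pos] != "=":
        raise SyntaxError("'=' expected")
    return parse_expr(tokens, pos + 1)

def parse_var(tokens, pos):                          # <var> → a | b | c | d
    if tokens[pos] in ("a", "b", "c", "d"):
        return pos + 1
    raise SyntaxError("variable expected")

def parse_term(tokens, pos):                         # <term> → <var> | const
    if tokens[pos] == "const":
        return pos + 1
    return parse_var(tokens, pos)

def parse_expr(tokens, pos):                         # <expr> → <term> + <term> | <term> - <term>
    pos = parse_term(tokens, pos)
    if tokens[pos] not in ("+", "-"):
        raise SyntaxError("'+' or '-' expected")
    return parse_term(tokens, pos + 1)

print(parse_stmt(["a", "=", "b", "+", "const"]))     # -> 5, i.e. all five tokens were consumed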

Parse Trees

Hierarchical structures of the language are called parse trees. The parse tree for the simple statement a = b + const is sketched below.
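
A rough text rendering of that parse tree, reconstructed from the grammar and the derivation above (indentation shows the parent-child structure):

<program>
  <stmts>
    <stmt>
      <var>
        a
      =
      <expr>
        <term>
          <var>
            b
        +
        <term>
          const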

Ambiguity

A grammar G is said to be ambiguous if it has more than one parse tree (i.e., more than one leftmost or rightmost derivation) for at least one string.

Example:

E → E + E

E → E - E

E → id

For the string id + id - id, the above grammar generates two parse trees:
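
Sketched in the same indented form as before (the original figure is not reproduced here):

Parse tree 1, corresponding to (id + id) - id:

E
  E
    E
      id
    +
    E
      id
  -
  E
    id

Parse tree 2, corresponding to id + (id - id):

E
  E
    id
  +
  E
    E
      id
    -
    E
      id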

A language is said to be inherently ambiguous if every grammar that generates it is ambiguous. Ambiguity in a grammar is not good for compiler construction. No method can detect and remove ambiguity automatically, but it can be removed either by rewriting the whole grammar without ambiguity or by setting and following associativity and precedence constraints.

Associativity

If operators are present on both sides of an operand, the side on which the operator takes that operand is decided by the associativity of those operators. If the operation is left-associative, the operand is taken by the left operator; if the operation is right-associative, the operand is taken by the right operator.

Example

Operations like addition, multiplication, subtraction and division are left-associative. If an expression contains:

id op id op id

It would be evaluated as:

(id op id) op id

For example, (id + id) + id

Operations like exponentiation are right-associative, i.e., the order of evaluation in the same expression will be:

id op (id op id)

For example, id ^ (id ^ id)

Precedence

If two different operators share a common operand, the precedence of the operators decides which one will take the operand. For example, 2+3*4 can have two different parse trees, one corresponding to (2+3)*4 and another corresponding to 2+(3*4). By setting a precedence among operators, this problem can be easily removed. As in the previous example, * (multiplication) mathematically has precedence over + (addition), so the expression 2+3*4 will always be interpreted as:

2 + (3 * 4)

These methods decrease the chances of ambiguity in a language or its grammar.
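
As a minimal sketch of how precedence (* and / over + and -) and left associativity can be enforced by layering the grammar of an evaluator (the expr/term structure and the function names are illustrative assumptions, not a prescribed design):

import re

def evaluate(expression):
    """Evaluate +, -, *, / over integers with the usual precedence and left associativity."""
    tokens = re.findall(r"\d+|[+\-*/]", expression)
    pos = 0

    def parse_term():                    # term → factor { (*|/) factor }
        nonlocal pos
        value = int(tokens[pos]); pos += 1
        while pos < len(tokens) and tokens[pos] in "*/":
            op = tokens[pos]; pos += 1
            rhs = int(tokens[pos]); pos += 1
            value = value * rhs if op == "*" else value // rhs   # applied left to right
        return value

    def parse_expr():                    # expr → term { (+|-) term }
        nonlocal pos
        value = parse_term()
        while pos < len(tokens) and tokens[pos] in "+-":
            op = tokens[pos]; pos += 1
            rhs = parse_term()
            value = value + rhs if op == "+" else value - rhs    # applied left to right
        return value

    return parse_expr()

print(evaluate("2+3*4"))    # 14, i.e. 2 + (3 * 4)

Placing * and / in the inner layer gives them higher precedence, and the while loops consume operators from left to right, which makes both operator groups left-associative.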
