You are on page 1of 46

LECTURE

03
DESCRIBING
SYNTAX
AND SEMANTICS
CSE 325/CSE 425:
CONCEPTS OF
PROGRAMMING
LANGUAGE

INSTRUCTOR: DR. F. A. FAISAL


CONTENT

• Intro
• The General Problem of Describing Syntax
• Formal Method of Describing Syntax
• Attribute Grammars
• Describing the Meanings of Dynamic
Programs: Semantics
INTRODUCTION
• Syntax:
• The form or the expressions, statements and
structure of program
units
• Semantics:
• The meaning of the expressions, statements, and program units.
• E.G.-
• The syntax of a Java while statement is-
while (boolean_expr) statements
• The semantics of this statement form is-
• If (boolean_expr == true) then {statements will be executed}
• If (boolean_expr == false) then {statements will be skipped}
• Syntax and semantics provide a language’s definition and they
are closely related.
T H E G E N E R A LP R O B L E MO
F DESCRIBING SYNTAX:
TERMINOLOGY
• A sentence is a string of characters over some alphabet
• A language is a set of meaningful sentences
• A lexeme is the lowest level syntactic unit of a language (e.g.
*, sum, begin)
• An token is an object describing the lexeme.
• A token has a type (e.g. Keyword, identifier, Number, or
Operator) and a value (the actual characters of the described
lexeme).
• Lexemes are into groups (e.g. the names
partitioned
variables, classes and so forth) of is
methods, called as
identifiers.
EXAMPLE
Lexemes Tokens
index identifier
= equal_sign
2 Int_literal
• index = 2 * count + 17; * Mult_op
count Identifier
+ Plus_op
17 Int_literal
; semicolon
TERMINOLOGY

• Recognizers-
• A recognition device reads input strings over the alphabet
of the language and decides whether the input strings
belong to language.
• Ex- Syntax analysis part of a computer
• Generators
• A device that generates sentences of a language
• One can determine if the syntax of a particular sentence is
syntactically correct by comparing it to the structure of the
generator.
BNF AND
CFG
• Noam Chomsky and John Backus, developed the same
syntax description formalism, which became the most widely
used method for programming language syntax.
• Context-Free Grammars
• Developed by Noam Chomsky in the mid-1950s
• Language generators, meant to describe the syntax of natural
languages
• Define a class of languages called context-free languages
• Backus-Naur Form (1959)
• Invented by John Backus to describe the syntax of ALGOL 58
• BNF is equivalent to context-free grammers.
BNF
FUNDAMENTALS
• A metalanguage is a language that is used to
describe another language.
• BNF is a metalanguage for programming languages.
• BNF uses abstractions for syntactic structures.
• <assign> <var> = <expression>
• In BNF, abstractions are used to represent classes of
syntactic structures – they act like syntactic variables (also
called nonterminal symbols, or just nonterminals).
• Terminals are lexemes or tokens
• A rule has a left-hand side (LHS), which is a nonterminal, a
right-hand side (RHS), which is a string of terminals and/or
nonterminals.
BNF
FUNDAMENTALS
• Nonterminals are often enclosed in angle brackets.
• Example-
• <ident_list> identifier | identifier, <ident_list>
• <if_stmt> if <logic_expr> then <stmt>
• Grammar: a finite non-empty set of rules
• A start symbol is a special element of the nonterminals of a
grammar
• BNF Rules-
• An abstraction (or nonterminal symbol) can have more
than one RHS
• <stmt> <single_stmt>
| begin <stmt_list> end
DESCRIBING
LISTS

• Syntactic lists are described using recursion


<ident_list> ident
| ident, <ident_list>
• A derivation is a repeated application of rules, starting
with the start symbol and ending with a sentence (all
terminal symbols)
AN
EXAMPLE
GRAMMAR
AN EXAMPLE
DERIVATION
DERIVATION
S

• Every string of symbols in a derivation is a sentential form


• A sentence is a sentential form that has only
terminal symbols
• A leftmost derivation is one in which the leftmost
nonterminal in each sentential form is the one that is
expanded
• A derivation may be neither leftmost nor rightmost.
ANOTHER
EXAMPLE
PARSE
• TREE
A hierarchical representation
of a derivation
AMBIGUITY
IN
GRAMMARS
• A grammar is ambiguous if and only if it generates a
sentential form that has two or more distinct parse trees.
• Like to find the parse tree for A = B + C * A
PARSE
TREE
Rightmost
Derivation RMD

R
P

E
C
O

E
D
E Leftmost
Derivation LMD
O
PARSE TREE
(UNIQUE) If we use the parse tree to indicate
precedence levels of the operators, we
cannot have ambiguity
ASSOCIATIVITY
OF OPERATORS
• When an expression includes
two operators that have the
same precedence ( * & /) – a
semantic rule is required to
specify which should have
precedence.
• This rule is n am e d as
associativity.
• Example-
• A=B+C+A
ASSOCIATIVITY
• Operators associativity can also be indicated by a grammar.
EXTENDED
BNF
• Incorporated for few minor inconvenience in BNF.
• Most extended versions are called as Extended BNF, or
simply EBNF.
• The extensions do not enhance the descriptive power of BNF;
they only increase its readability and writability.
• Optional parts are placed in brackets [ ].
<proc_call> ident [ (<expr_list>) ]
• Alternative parts of RHSs are placed inside parentheses and
separated via vertical bars.
<term> <term> (+|-) const
• Repetitions (0 or more) are placed inside braces { }
<ident> letter { letter | digit }
EXAMPLE
VARIATIONS ON
BNF AND EBNF
• In place of the arrow, a colon is used and the RHS is placed on the next
line.
• Instead of a vertical bar to separate alternative RHS, they are simply
placed on separate lines.
• In place of square brackets to indicate something being optional, the
subscript opt is used.
• Ex- Constructor Declarator SimpleName
(FormalParameterListopt)
• Rather than using the | symbol in a parenthesized list of elements to
indicate a choice, the words ‘one of’ are used.
• Ex- Assignment Operator one of = *= /=
ATTRIBUTE
GRAMMARS
(AGS)
• Is a device used to describe more of the structure of a
programming language than can be described with a
context-free grammar.
• An extension to a context-free grammar.
• AGS have additions to CFGs to carry some semantic info
on parse tree nodes.
• Primary value of AGs:
• Static semantics specification
• Compiler design (static semantics checking)
STATIC
SEMANTICS
• Nothing to do with meaning
• CFGs cannot describe all of the syntax of programming
languages
• The problems exemplify the categories of language rules
(called as static semantic rules) :
• Context-free, but cumbersome (e.g., types of operands in
expressions)
• Non-context-free (e.g., variables must be declared before
they are used)
ATTRIBUTE
GRAMMARS:
DEFINITION
• Def: An attribute grammer is a context-free grammar G =
(S, N, T, P) with the following additions:
• For each grammar symbol x there is a set A(x) of attribute
values
• Each rule has a set of functions that define certain
attributes of the nonterminals in the rule
• Each rule has a (Possibly empty) set of predicates to
check for attribute consistency.
ATTRIBUTE
GRAMMARS:
(CONT.)
• Let Xo X1 …… Xn be a rule (X is a set of attributes)
• Functions of the form S(X0) f(A(X1), …., A(Xn)) define as
synthesized attributes
• Functions of the form I(XJ) f(A(X1), …., A(Xn)),
for i <= j <= n, define as inherited attributes
• Initially, there are intrinsic attributes on the leaves.
• Intrinsic attributes are synthesized attributes of
leaf nodes whose values are determined outside the
parse tree.
• Ex- the type of an instance of a variable in a program could
come from the symbol table.
ATTRIBUTE
GRAMMAR EXAMPLE

(var = actual type is intrinsic, expr = actual type of child node)

(expr = actual type determined by the assignment statement)


ATTRIBUTE GRAMMAR (FOR SIMPLE
ASSIGNMENT STATEMENTS)
A PARSE TREE (A = A +
B)
COMPUTING
ATTRIBUTE
VALUES
• How are attribute values computed?
• If all attributes were inherited, the tree could be decorated
in top-down order,
• If all attributes were synthesized, the tree could be
decorated in bottom-up order.
• In many cases, both kinds of attributes are used, and it is
some combination of top-down and bottom-up that must be
used.
COMPUTING
ATTRIBUTE
VALUES
FLOW OF ATTRIBUTE
VALUES
FINAL
ATTRIBUTE
VALUES
SEMANTICS
• There is no single widely acceptable notation or formalism
for describing semantics
• Several needs for a methodology and notation for
semantics:
• Programmers need to know what statements
mean
• Compiler writers must know exactly what language
constructs do
• Correctness proofs would be possible
• Compiler generators would be possible
• Designers could detect ambiguities and
inconsistencies.
OPERATIONA
L SEMANTICS
• Describe the meaning of a program by executing
statements on a machine, either simulated or actual. its
change in the state of the machine (memory, registers etc.)
The
defines the meaning of the statement.
• To use operational semantics for a high-level language, a
virtual machine is needed.
• A hardware pure interpreter would be too expensive
• A software pure interpreter also has problems
• The detailed characteristics of the particular
computer would make actions difficult to understand.
• Such a semantic definition would be machine – dependent.
OPERATIONAL
SEMANTICS
(CONT.)
• A better alternative: A complete computer simulation
• The process:
• Build a translator
• Build a simulator for the idealized computer
• Evaluation of operational semantics:
• Good if used informally ( language manuals, etc)
• Extremely complex if used formally (e.g., VHDL), it was
used for describing semantics of PL/I
OPERATIONA
L SEMANTICS

• Uses of operational semantics:


• Language manuals and textbooks
• Teaching programming languages
• Two different levels of uses of operational semantics:
• Natural operational semantics
• Structural operational semantics
• Evaluation
• Good if uesd informally (language manuals etc.)
• Extremely complex if used formally (e.g. VHDL)
DENOTATIONA
L SEMANTICS
• Based on recursive function theory
• The most abstract semantics description method
• Originally developed by Scott and Strachey (1970).
• The process of building a denotational specification for a
language:
• Define a mathematical object for each language entity
• Define a function that maps instances of the language
entities onto instances of the corresponding mathematical
objects.
• The meaning of language constructs are defined by only
the values of the program’s variables.
DENOTATIONAL SEMANTICS:
PROGRAM STATE

• The State of a program is the values (v is the values for


corresponding variables) of all its current variables ( i is
the name of the variables)
• S = { <i1 , v1> , <i2 , v2>, … , <in, vn>}

• Let VARMAP be a function that, when given a variable


name and a state, returns the current value of the variable
• VARMAP (ij, s) = vJ
EXAMPLE
(DECIMAL
NUMBERS)

The Denotational mapping for <dec_num> -


EXPRESSIONS
• Consider the following BNF Expression-

• The only error we consider in expressions is a variable having


an undefined value.
• Let Z is set of integers and let error be the error value.
• The Z U {error} is the semantic domain for the
denotational specification for our expressions.
• Assumptions, one arithmetic operator with two operands.
EXPRESSIONS
ASSIGNMENT

STATEMENTS
• Is an expression evaluation plus the setting of the target
variable to the expression’s value.
• Maps state sets to state sets U {error}

Checking of names
THANKS

You might also like