Lecture 3-4&5

Lecture 3
Chapter 3
Describing Syntax & Semantics
Principles of Programming languages

Textbook
Robert W. Sebesta, Concepts of Programming Languages (10th edition),
Addison-Wesley Publishing Company

Topics
• Introduction
• The General Problem of Describing Syntax
• Formal Methods of Describing Syntax
• Attribute Grammars
• Describing the Meanings of Programs: Dynamic Semantics

Introduction
• Syntax: the form or structure of the expressions,
statements, and program units
• Semantics: the meaning of the expressions,
statements, and program units
• Syntax and semantics provide a language’s definition
– Users of a language definition
• Other language designers
• Implementers
• Programmers (the users of the language)

The General Problem of Describing Syntax: Terminology
• A sentence is a string of characters over some alphabet
• A language is a set of sentences
• A lexeme is the lowest-level syntactic unit of a language (e.g., *, sum, begin)
• A token is a category of lexemes (e.g., identifier)
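
The lexeme/token distinction can be sketched with a small scanner. This is only an illustration: the token names (keyword, mult_op, and so on) and the `tokenize` function are hypothetical choices, not part of the lecture.

```python
import re

# Hypothetical token categories for illustration; real languages define many more.
TOKEN_SPEC = [
    ("keyword", r"\b(?:begin|end)\b"),
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("int_literal", r"\d+"),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("assign_op", r"="),
    ("semicolon", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return (lexeme, token) pairs; whitespace is skipped, anything else is an error."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        # match.group() is the lexeme; match.lastgroup is the token (category) name.
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs
```

For instance, `tokenize("sum * 2")` pairs the lexeme `sum` with the token `identifier` and the lexeme `*` with the token `mult_op`.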
Formal Definition of Languages
• Recognizers
– A recognition device reads input strings over the alphabet
of the language and decides whether the input strings
belong to the language
– Example: syntax analysis part of a compiler
• Generators
– A device that generates sentences of a language
– One can determine if the syntax of a particular sentence is
syntactically correct by comparing it to the structure of the
generator

BNF and Context-Free Grammars
• Context-Free Grammars
– Developed by Noam Chomsky in the mid-1950s
– Language generators, meant to describe the syntax of
natural languages
– Define a class of languages called context-free languages
• Backus-Naur Form (1959)
– Invented by John Backus to describe Algol 58
– BNF is equivalent to context-free grammars

BNF Fundamentals
• In BNF, abstractions are used to represent classes of syntactic structures;
they act like syntactic variables (also called nonterminal symbols, or just
nonterminals)
• Terminals are lexemes or tokens
• A rule has a left-hand side (LHS), which is a non-terminal, and a right-hand
side (RHS), which is a string of terminals and/or non-terminals
• Non-terminals are often enclosed in angle brackets
– Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>

<if_stmt> → if <logic_expr> then <stmt>
• Grammar: a finite non-empty set of rules

• A start symbol is a special element of the nonterminals of a grammar

BNF Rules
• An abstraction (or nonterminal symbol) can
have more than one RHS
<stmt> → <single_stmt>
| begin <stmt_list> end

Describing Lists
• Syntactic lists are described using recursion
<ident_list> → ident | ident, <ident_list>
• A derivation is a repeated application of rules,
starting with the start symbol and ending with
a sentence (all terminal symbols)
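
The recursive list rule maps naturally onto a recursive recognizer. A minimal sketch, assuming the input has already been split into a list of tokens and that the function name `is_ident_list` is our own:

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> ident | ident , <ident_list> over a list of tokens."""
    if not tokens or tokens[0] != "ident":
        return False
    if len(tokens) == 1:
        return True                      # first alternative: ident
    # second alternative: ident , <ident_list>  (the recursion handles the rest of the list)
    return tokens[1] == "," and is_ident_list(tokens[2:])
```

Each recursive call corresponds to one more application of the second alternative in a derivation.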

An Example Grammar & Derivation

Grammar:
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const

Derivation:
<program> => <stmts>
=> <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const

Question
How will this grammar generate the following statement, written by a programmer?
a = b + const
Solution
The statement is generated by the following derivation:
An Example Derivation
<program> => <stmts> => <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const
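
The derivation above can be replayed mechanically. The sketch below encodes the example grammar as a Python dictionary and expands the leftmost nonterminal at each step. The dictionary layout, the `leftmost_derivation` name, and the convention of selecting an alternative by index are assumptions made for illustration.

```python
# Grammar from the slides; each nonterminal maps to its list of alternatives (RHSs).
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}

def leftmost_derivation(choices, start="<program>"):
    """Apply one chosen alternative per step to the leftmost nonterminal."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Find the leftmost nonterminal in the current sentential form.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps

# Alternative indices that reproduce the derivation of  a = b + const.
for step in leftmost_derivation([0, 0, 0, 0, 0, 0, 1, 1]):
    print("=>", step)
```

Every string printed is a sentential form; the last one contains only terminals, so it is a sentence.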

Derivations
• Every string of symbols in a derivation is a
sentential form
• A sentence is a sentential form that has only
terminal symbols
• A leftmost derivation is one in which the
leftmost nonterminal in each sentential form
is the one that is expanded

Parse Tree
• A hierarchical representation of a derivation

Ambiguity in Grammars
• A grammar is ambiguous if and only if it generates a sentential
form that has two or more distinct parse trees
• That is, there is some string that the grammar can generate in more than one way
• Semantic information is needed to select the intended parse
• For example, in C the following:
– x * y ; can be interpreted as either:
• the declaration of an identifier named y of type pointer-to-x, or
• an expression in which x is multiplied by y and then the result is
discarded.
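
Ambiguity can be made concrete by counting parse trees. The sketch below assumes the classic ambiguous grammar E → E + E | E * E | id, which is a stand-in chosen for illustration and not necessarily the grammar shown on the slides; `count_parse_trees` is a hypothetical helper name.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees for tokens under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways to derive tokens[lo:hi] from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):
                # Operator at the root: left operand tokens[lo:mid], right operand tokens[mid+1:hi].
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))
```

Under this grammar `count_parse_trees(["id", "+", "id", "*", "id"])` is 2: one tree groups the addition first, the other groups the multiplication first. A count above 1 for any sentence is enough to show the grammar is ambiguous.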

An Ambiguous Expression Grammar

An Unambiguous Expression Grammar
• If we use the parse tree to indicate precedence levels of the operators, we
cannot have ambiguity.

• Parse trees represent the inherent hierarchical structure of a language sentence.
• A parse tree and a derivation are representations of the same process
Associativity of Operators

Converting an Ambiguous Grammar into an Unambiguous Grammar

Ambiguous Grammar

Graded Activity: Questions
• How can precedence rules be added?
• What is the problem with ambiguity?
• Is the C++ grammar ambiguous?
• Can you find a sentence with two possible parse trees in this grammar?


Lecture 4
Chapter 3

Extended BNF
• Optional parts (0 or one) are placed in
brackets [ ]
<proc_call> → ident [(<expr_list>)]
• Alternative parts of RHSs are placed inside
parentheses and separated via vertical bars
<term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside
braces { }
<ident> → letter {letter|digit}
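
EBNF braces correspond directly to a loop in a hand-written recognizer. A minimal sketch for the `<ident>` rule above, assuming that "letter" and "digit" mean whatever Python's `str.isalpha` and `str.isdigit` accept (which is broader than ASCII) and that `is_ident` is our own name:

```python
def is_ident(text):
    """Recognize <ident> -> letter {letter|digit} on a plain string."""
    if not text or not text[0].isalpha():
        return False                  # the mandatory leading letter
    pos = 1
    while pos < len(text) and (text[pos].isalpha() or text[pos].isdigit()):
        pos += 1                      # one pass per repetition inside { }
    return pos == len(text)           # accept only if the whole string was consumed
```

The `while` loop plays the role of the braces: it may run zero times, so a single letter is accepted.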
BNF and EBNF

Question
• Convert the following grammar from BNF to
EBNF

Answer

Question
Convert the following BNF to EBNF
<assign> → <id> = <expr>

<id> → A | B | C
<expr> → <expr> + <expr>
| <expr> * <expr>
| ( <expr> )
| <id>
Answer

Semantics
• BNF in the form of CFG is used to describe the syntax
of a programming language
• It cannot describe the semantics (meaning) of the
programming language statements, expressions, etc.
• There are two categories of semantics:
– Static semantics: semantic rules that can be checked
during compile time (i.e., prior to runtime). As an example,
variables in a program must be "declared" before they are
referenced.
– Dynamic semantics: semantic rules that apply during the
execution of a program.

Static Semantics Examples
There are some characteristics of the structure of programming languages that are
difficult to describe with BNF, and some that are impossible.
As an example of a syntax rule that is difficult to specify with BNF, consider type
compatibility rules. In Java, for example, a floating-point value cannot be assigned to
an integer type variable, although the opposite is legal. Although this restriction can
be specified in BNF, it requires additional non-terminal symbols and rules. If all of the
typing rules of Java were specified in BNF, the grammar would become too large to be
useful, because the size of the grammar determines the size of the syntax analyzer.
As an example of a syntax rule that cannot be specified in BNF, consider the common
rule that all variables must be declared before they are referenced. It has been proven
that this rule cannot be specified in BNF.

Attribute Grammar
• An attribute grammar is a special form of context-free grammar in which
additional information (attributes) is attached to one or more of its
non-terminals in order to provide context-sensitive information.
• Each attribute has a well-defined domain of values, such as integer, float,
character, string, and expressions.
• An attribute grammar is a medium for attaching semantics to a context-free
grammar, and it can help specify both the syntax and the semantics of a
programming language.
• An attribute grammar (when viewed as a parse tree) can pass values or
information among the nodes of the tree.

Example:
• E → E + T { E.value = E.value + T.value }
• The part in braces to the right of the CFG rule contains the
semantic rules that specify how the grammar
should be interpreted. Here, the values of the
non-terminals E and T on the right-hand side are added together and
the result is copied to the non-terminal E on the left-hand side.
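
The semantic rule above computes a synthesized attribute: the parent's value depends only on its children. A minimal sketch, assuming a base rule E → T and integer-valued T nodes (neither is stated on the slide); the class names mirror the grammar symbols.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class T:
    value: int            # intrinsic attribute, assumed to come from the lexeme

@dataclass
class E:
    left: Optional["E"]   # None for the assumed base rule E -> T
    term: T

    @property
    def value(self) -> int:
        if self.left is None:
            return self.term.value                 # assumed rule: E -> T { E.value = T.value }
        # E -> E + T { E.value = E.value + T.value }, evaluated bottom-up
        return self.left.value + self.term.value

# Parse tree for  1 + 2 + 3  under the left-recursive rule.
tree = E(E(E(None, T(1)), T(2)), T(3))
print(tree.value)
```

Because every attribute here flows upward, the tree can be decorated in a single bottom-up pass.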

Attribute Grammars
• Attribute grammars (AGs) have additions to
CFGs to carry some semantic info on parse
tree nodes
• Primary value of AGs:
– Static semantics specification
– Compiler design (static semantics checking)

Attribute Grammars : Definition
• Def: An attribute grammar is a context-free
grammar G = (S, N, T, P) with the following
additions:
– For each grammar symbol x there is a set A(x) of
attribute values
– Each rule has a set of functions that define certain
attributes of the nonterminals in the rule
– Each rule has a (possibly empty) set of predicates
to check for attribute consistency

Attribute Grammars (continued)
Attributes
– These are ‘variables’ associated with grammar symbols,
and can have various values attached to them.
Attribute computation functions
– These are functions associated with grammar rules
(productions) that specify how attribute values are
computed.
Predicate functions
– These are associated with grammar rules (productions)
and state some of the syntax and static semantic rules of
the language.
Attribute Grammars: Definition
• Let X0 → X1 ... Xn be a rule
• Functions of the form S(X0) = f(A(X1), ... , A(Xn)) define synthesized
attributes
• Functions of the form I(Xj) = f(A(X0), ... , A(Xn)), for 1 <= j <= n, define
inherited attributes
• Initially, there are intrinsic attributes on the leaves
• The first attributes to be evaluated are the synthesized attributes of the leaf
nodes, called intrinsic attributes, whose values might be read off the
symbol table (e.g., type values)

Attributes
– actual_type: a synthesized attribute for non-terminals,
storing their actual type: int or real.
• In the case of a variable, the intrinsic attribute is read off the symbol table.
• In the case of an expression, the value of the attribute is
determined as a function of its child nodes.
– expected_type: an inherited attribute associated with
the non-terminal <expr>
• Stores the type expected of the <expr>, given the type
of the variable on the left-hand side.
Attribute Grammars: An Example
• Syntax
<assign> → <var> = <expr>
<expr> → <var> + <var> | <var>
<var> → A | B | C
• actual_type: synthesized for <var> and <expr>
• expected_type: inherited for <expr>

Attribute Grammar (continued)
• Syntax rule: <expr> → <var>[1] + <var>[2]
Semantic rule:
<expr>.actual_type ← <var>[1].actual_type
Predicates:
<var>[1].actual_type == <var>[2].actual_type
<expr>.expected_type == <expr>.actual_type
• Syntax rule: <var> → id
Semantic rule:
<var>.actual_type ← lookup (<var>.string)

Attribute Grammar (continued)
• How are attribute values computed?
– If all attributes were inherited, the tree could be
decorated in top-down order.
– If all attributes were synthesized, the tree could be
decorated in bottom-up order.
– In many cases, both kinds of attributes are used,
and it is some combination of top-down and
bottom-up that must be used.

Example: parse tree for A = A + B
Assuming this tree has already been constructed, how do we compute the attributes?
<expr>.expected_type ← inherited from parent
<var>[1].actual_type ← lookup (A)
<var>[2].actual_type ← lookup (B)
<var>[1].actual_type =? <var>[2].actual_type
<expr>.actual_type ← <var>[1].actual_type
<expr>.actual_type =? <expr>.expected_type
A fully attributed parse tree
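
The attribute evaluation for A = A + B can be sketched in code. The symbol table contents and the `check_assign` function are assumptions for illustration; the slides only state that `lookup` returns a variable's declared type.

```python
# Hypothetical symbol table; the slides only say lookup reads the declared type.
SYMBOL_TABLE = {"A": "int", "B": "int", "C": "real"}

def lookup(name):
    return SYMBOL_TABLE[name]

def check_assign(target, operands):
    """Decorate <assign> -> <var> = <expr>, with <expr> -> <var> + <var> | <var>."""
    expected_type = lookup(target)                 # inherited by <expr> from the left-hand side
    actual_types = [lookup(v) for v in operands]   # intrinsic attributes of the <var> leaves
    if len(actual_types) == 2 and actual_types[0] != actual_types[1]:
        return False                               # predicate: <var>[1] and <var>[2] agree
    expr_actual_type = actual_types[0]             # synthesized: <expr>.actual_type
    return expr_actual_type == expected_type       # predicate: actual matches expected
```

With this table, `check_assign("A", ["A", "B"])` passes both predicates, while `check_assign("A", ["A", "C"])` fails the first one because the operand types differ. Note the mixed flow: expected_type moves down the tree and actual_type moves up.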

Attribute Grammar: Example 2
• Grammar of signed binary numbers and
computing the values of the number
Lecture 5
Chapter 3

Semantics
• There is no single widely acceptable notation
or formalism for describing semantics
• Several needs for a methodology and
notation for semantics:
– Programmers need to know what statements mean
– Compiler writers must know exactly what language constructs do
– Correctness proofs would be possible
– Compiler generators would be possible
– Designers could detect ambiguities and inconsistencies

Operational Semantics
• Operational Semantics
– Describe the meaning of a program by executing
its statements on a machine, either simulated or
actual. The change in the state of the machine
(memory, registers, etc.) defines the meaning of
the statement
• To use operational semantics for a high-level
language, a virtual machine is needed
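
As a rough illustration of the idea, the sketch below defines a tiny virtual machine and gives a C-style counting loop its meaning by lowering it to jumps. The three-instruction set and the `execute` function are hypothetical, invented for this example.

```python
def execute(program, state):
    """Run a list of low-level instructions and return the final machine state.

    Instructions (a hypothetical mini instruction set):
      ("set", var, fn)          var := fn(state)
      ("jump_if", fn, target)   go to index `target` when fn(state) is true
      ("jump", target)          unconditional jump
    """
    state = dict(state)
    pc = 0                          # program counter of the virtual machine
    while pc < len(program):
        instr = program[pc]
        if instr[0] == "set":
            state[instr[1]] = instr[2](state)
            pc += 1
        elif instr[0] == "jump_if":
            pc = instr[2] if instr[1](state) else pc + 1
        elif instr[0] == "jump":
            pc = instr[1]
        else:
            raise ValueError(f"unknown instruction {instr[0]!r}")
    return state

# A C-style loop  for (i = 0; i < 3; i++) total = total + i;  lowered to jumps.
LOOP = [
    ("set", "i", lambda s: 0),
    ("jump_if", lambda s: not s["i"] < 3, 5),     # exit when the condition fails
    ("set", "total", lambda s: s["total"] + s["i"]),
    ("set", "i", lambda s: s["i"] + 1),
    ("jump", 1),
]
```

The meaning of the loop is the state change it causes: starting from `{"total": 0}`, `execute(LOOP, {"total": 0})` ends with `total` equal to 3 and `i` equal to 3.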





Operational Semantic Examples

Denotational Semantics
• Based on recursive function theory
• The most abstract semantics description
method
• Originally developed by Scott and Strachey
(1970)
• In denotational semantics, the domain is
called the syntactic domain, because it is
syntactic structures that are mapped. The
range is called the semantic domain.
• The method is named denotational because
the mathematical objects denote the meaning
of their corresponding syntactic entities.




• This definition requires two auxiliary functions defined in the semantic world,
where Number x Number denotes the Cartesian product.
• We need two syntactic domains for the language of numerals. Phrases in this language
are mapped into the mathematical domain of natural numbers.
• Generally we have one semantic function for each syntactic domain and one semantic
equation for each production in the abstract syntax.
• To distinguish numerals (syntax) from numbers (semantics), different typefaces are employed.
• Note the compositionality of the definition: the value of a phrase “N D” is defined
in terms of the value of N and the value of D.
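
The compositional definition can be sketched as follows. It assumes decimal numerals and assumes that the two auxiliary functions are addition and multiplication on numbers; the definition itself is not reproduced in these notes, so treat the names `plus`, `times`, `digit`, and `value` as illustrative.

```python
def plus(m, n):      # auxiliary function: Number x Number -> Number
    return m + n

def times(m, n):     # auxiliary function: Number x Number -> Number
    return m * n

def digit(d):
    """Semantic function for the syntactic domain Digit (assumed decimal)."""
    return "0123456789".index(d)

def value(numeral):
    """Semantic function for the syntactic domain Numeral (a non-empty digit string)."""
    if len(numeral) == 1:
        return digit(numeral)                       # value[[D]] = digit[[D]]
    n, d = numeral[:-1], numeral[-1]
    # value[[N D]] is built only from value[[N]] and digit[[D]]: compositionality.
    return plus(times(10, value(n)), digit(d))
```

Here the string `"507"` is syntax (a numeral) and the integer returned by `value("507")` is its meaning (a number).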
Question
• Write a denotational semantics mapping
function for the following statements:
a. Ada for
b. Java do-while
c. Java Boolean expressions
d. Java for
e. C switch

Axiomatic Semantics
• Based on formal logic (predicate calculus)
• Original purpose: formal program verification
• Axioms or inference rules are defined for each
statement type in the language (to allow
transformations of logic expressions into more
formal logic expressions)
• The logic expressions are called assertions

Axiomatic Semantics
• An assertion before a statement (a precondition)
states the relationships and constraints among
variables that are true at that point in execution
• An assertion following a statement is a postcondition
• A weakest precondition is the least restrictive
precondition that will guarantee the postcondition

Axiomatic Semantics Form

Example
where I is the loop invariant (the inductive hypothesis)

• In the case of the assignment statement, the
notation is
• As an example of computing a precondition for an
assignment statement, consider the following:
• x = 2 * y - 3 {x > 25}
• The precondition is computed as follows:
• 2 * y - 3 > 25
• y > 14
• So {y > 14} is the weakest precondition for this
assignment statement and postcondition.
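
The computation above is an instance of the standard assignment axiom: the weakest precondition of x = E with postcondition Q is Q with every occurrence of x replaced by E. Written out for this example (the subscript notation for substitution is one common convention, assumed here):

```latex
\[
\{\,Q_{x \to E}\,\}\;\; x = E \;\;\{\,Q\,\}
\]
\[
Q:\; x > 25,\qquad E:\; 2y - 3
\quad\Longrightarrow\quad
Q_{x \to E}:\; 2y - 3 > 25 \;\iff\; 2y > 28 \;\iff\; y > 14
\]
```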
• For example, consider the logical statement

For example, the rule of consequence allows the completion of the proof of the last logical
statement.
If we let P be {x > 3}, Q and Q' be {x > 0}, and P' be {x > 5}, we have

Denotational Semantics vs. Operational Semantics
• In operational semantics, the state changes
are defined by coded algorithms
• In denotational semantics, the state changes
are defined by rigorous mathematical functions

Question
• Compute the weakest precondition for each of
the following assignment statements and
postconditions:
• a = 2 * (b - 1) - 1 {a > 0}
• b = (c + 10) / 3 {b > 6}
• a = a + 2 * b - 1 {a > 1}
• x = 2 * y + x - 1 {x > 11}

Answers
• a = 2 * (b - 1) - 1 {a > 0}
• 2 * (b - 1) - 1 > 0
• 2*b-2-1>0
• 2*b>3
• b>3/2

Answer
• b = (c + 10) / 3 {b > 6}
• (c + 10) / 3 > 6
• c + 10 > 18
• c>8

Answer
• a = a + 2 * b - 1 {a > 1}
• a+2*b-1>1
• 2*b>2-a
• b>1-a/2

Answer
• x = 2 * y + x - 1 {x > 11}
• 2 * y + x - 1 > 11
• 2 * y + x > 12
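
The four answers can be sanity-checked by brute force: on each sample state, the computed precondition should hold exactly when the postcondition holds after the assignment. This is a finite check over small integer values, not a proof, and the helper names (`agrees`, `assign`, `CHECKS`) are our own.

```python
def agrees(statement, postcondition, precondition, values):
    """True if, on every sample state, the precondition holds exactly when
    the postcondition holds after running the statement."""
    return all(
        precondition(s) == postcondition(statement(dict(s)))
        for s in values
    )

# Sample states: every integer pair in a small range (a finite check, not a proof).
RANGE = range(-20, 21)
STATES = [{"a": p, "b": q, "c": p, "x": p, "y": q} for p in RANGE for q in RANGE]

def assign(var, expr):
    """Build a state transformer for  var = expr(state)."""
    def run(state):
        state[var] = expr(state)
        return state
    return run

# (statement, postcondition, computed weakest precondition) for the four exercises.
CHECKS = [
    (assign("a", lambda s: 2 * (s["b"] - 1) - 1), lambda s: s["a"] > 0, lambda s: s["b"] > 3 / 2),
    (assign("b", lambda s: (s["c"] + 10) / 3), lambda s: s["b"] > 6, lambda s: s["c"] > 8),
    (assign("a", lambda s: s["a"] + 2 * s["b"] - 1), lambda s: s["a"] > 1, lambda s: s["b"] > 1 - s["a"] / 2),
    (assign("x", lambda s: 2 * s["y"] + s["x"] - 1), lambda s: s["x"] > 11, lambda s: 2 * s["y"] + s["x"] > 12),
]

print(all(agrees(stmt, post, pre, STATES) for stmt, post, pre in CHECKS))
```

A precondition that is too weak, such as c > 7 for the second exercise, fails the check because c = 8 satisfies it but leaves b equal to 6.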
