
Lecture 2

Describing Syntax and Semantics

Copyright © 2017 Pearson Education, Ltd. All rights reserved. 1-1


Lecture 2 Topics

• Introduction
• The General Problem of Describing Syntax
• Formal Methods of Describing Syntax
• Attribute Grammars
• Describing the Meanings of Programs:
Dynamic Semantics



Introduction

• Syntax: the form or structure of the
expressions, statements, and program
units
• Semantics: the meaning of the expressions,
statements, and program units
• Syntax and semantics provide a language’s
definition
– Users of a language definition
• Other language designers
• Implementers
• Programmers (the users of the language)



The General Problem of Describing
Syntax: Terminology

• A sentence is a string of characters over
some alphabet

• A language is a set of sentences

• A lexeme is the lowest level syntactic unit
of a language (e.g., *, sum, begin)

• A token is a category of lexemes (e.g.,
identifiers, constants, key words)
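
The lexeme/token distinction can be illustrated with a small scanner. This is only a sketch: the token categories, the keyword list (begin, end), and the name tokenize are assumptions chosen for illustration, not part of any particular language definition.

```python
import re

# Hypothetical token categories for illustration; a real language defines its own.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:begin|end)\b"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("CONSTANT", r"\d+"),
    ("OPERATOR", r"[*+\-/=]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return (token, lexeme) pairs for the lexemes found in text."""
    pairs = []
    for match in MASTER.finditer(text):
        # Whitespace separates lexemes but is not a lexeme itself.
        if match.lastgroup != "SKIP":
            pairs.append((match.lastgroup, match.group()))
    return pairs

print(tokenize("begin sum = sum * 2 end"))
```

Each pair couples a lexeme (the actual characters) with its token (the category). Characters that match no category are silently skipped here; a real scanner would report them as errors.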



Formal Definition of Languages

• Recognizers
– A recognition device reads input strings over the alphabet
of the language and decides whether the input strings
belong to the language
– Example: syntax analysis part of a compiler

• Generators
– A device that generates sentences of a language
– One can determine whether a particular sentence is
syntactically correct by comparing it to the structure of
the generator



BNF and Context-Free Grammars

• Context-Free Grammars
– Developed by Noam Chomsky in the mid-1950s
– Language generators, meant to describe the
syntax of natural languages
– Define a class of languages called context-free
languages

• Backus-Naur Form (1959)
– Invented by John Backus to describe the syntax
of Algol 58
– BNF is equivalent to context-free grammars

BNF Fundamentals

• In BNF, abstractions are used to represent classes
of syntactic structures; they act like syntactic
variables (also called nonterminal symbols, or just
nonterminals)

• Terminals are lexemes or tokens

• A rule has a left-hand side (LHS), which is a
nonterminal, and a right-hand side (RHS), which is
a string of terminals and/or nonterminals



BNF Fundamentals (continued)

• Nonterminals are often enclosed in angle brackets
– Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>

• Grammar: a finite non-empty set of rules

• An abstraction (or nonterminal symbol) can have
more than one RHS
<stmt> → <single_stmt>
| begin <stmt_list> end



Describing Lists

• Syntactic lists are described using
recursion
<ident_list> → ident
| ident, <ident_list>

• A derivation is a repeated application of
rules, starting with the start symbol and
ending with a sentence (all terminal
symbols)



An Example Grammar And Derivation
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const
• An example statement: a = b + const
<program> => <stmts> => <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const

Derivations

• Every string of symbols in a derivation is a
sentential form
• A sentence is a sentential form that has
only terminal symbols
• A leftmost derivation is one in which the
leftmost nonterminal in each sentential
form is the one that is expanded
• A derivation may be neither leftmost nor
rightmost
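
A leftmost derivation can be reproduced mechanically. The sketch below encodes the example grammar as a Python dictionary and rewrites the leftmost nonterminal at each step. The names GRAMMAR and leftmost_derivation, and the idea of selecting alternatives by index, are assumptions made for illustration.

```python
# The example grammar: each nonterminal maps to its alternative right-hand sides.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}

def leftmost_derivation(choices, start="<program>"):
    """Expand the leftmost nonterminal once per entry in choices.

    Each entry is the index of the alternative to use for that step.
    Returns every sentential form, from the start symbol onward.
    """
    form = [start]
    forms = [" ".join(form)]
    for choice in choices:
        # Position of the leftmost nonterminal in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:pos] + GRAMMAR[form[pos]][choice] + form[pos + 1:]
        forms.append(" ".join(form))
    return forms

# Alternative indices that reproduce the derivation of: a = b + const
for step in leftmost_derivation([0, 0, 0, 0, 0, 0, 1, 1]):
    print("=>", step)
```

The last form printed contains only terminal symbols, so it is a sentence; every earlier line is a sentential form.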



Parse Tree

• A hierarchical representation of a derivation

<program>
  <stmts>
    <stmt>
      <var>
        a
      =
      <expr>
        <term>
          <var>
            b
        +
        <term>
          const

Exercise in class

Use the grammar above to generate the sentence below and its
corresponding parse tree



Solution



Ambiguity in Grammars

• A grammar is ambiguous if and only if it
generates a sentential form that has two
or more distinct parse trees



An Ambiguous Expression Grammar

<expr> → <expr> <op> <expr> | const
<op> → / | -

Two distinct parse trees for const - const / const:

<expr>
  <expr>
    <expr>
      const
    <op>
      -
    <expr>
      const
  <op>
    /
  <expr>
    const

<expr>
  <expr>
    const
  <op>
    -
  <expr>
    <expr>
      const
    <op>
      /
    <expr>
      const
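
The ambiguity can be checked by counting parse trees. The sketch below counts how many distinct trees the grammar <expr> → <expr> <op> <expr> | const assigns to a token sequence; the function name count_parse_trees and the token-list representation are assumptions for illustration.

```python
from functools import lru_cache

OPERATORS = {"/", "-"}

def count_parse_trees(tokens):
    """Count parse trees under <expr> -> <expr> <op> <expr> | const."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from <expr>.
        if hi - lo == 1:
            return 1 if tokens[lo] == "const" else 0
        total = 0
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in OPERATORS:
                # tokens[mid] is the <op> at the root of this subtree.
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("const - const / const".split()))  # 2
```

A count above 1 for any sentence shows the grammar is ambiguous. A count of 1 for one sentence proves nothing about the grammar as a whole.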



An Unambiguous Expression Grammar

• If we use the parse tree to indicate
precedence levels of the operators, we
cannot have ambiguity
<expr> → <expr> - <term> | <term>
<term> → <term> / const | const

<expr>
  <expr>
    <term>
      const
  -
  <term>
    <term>
      const
    /
    const

Associativity of Operators

• Operator associativity can also be indicated by a
grammar

<expr> -> <expr> + <expr> | const (ambiguous)
<expr> -> <expr> + const | const (unambiguous)

<expr>
  <expr>
    <expr>
      const
    +
    const
  +
  const

Unambiguous Grammar for Selector

• Java if-then-else grammar
<if_stmt> -> if (<logic_expr>) <stmt>
| if (<logic_expr>) <stmt> else <stmt>
Ambiguous!
• An unambiguous grammar for if-then-else
<stmt> -> <matched> | <unmatched>
<matched> -> if (<logic_expr>) <matched> else <matched>
| any non-if statement
<unmatched> -> if (<logic_expr>) <stmt>
| if (<logic_expr>) <matched> else <unmatched>

Extended BNF

• Optional parts are placed in brackets [ ]
<proc_call> -> ident [(<expr_list>)]
• Alternative parts of RHSs are placed
inside parentheses and separated via
vertical bars
<term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside
braces { }
<ident> → letter {letter|digit}



BNF and EBNF

• BNF
<expr> → <expr> + <term>
| <expr> - <term>
| <term>
<term> → <term> * <factor>
| <term> / <factor>
| <factor>
• EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
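
The EBNF form maps almost directly onto a recursive-descent parser: each rule becomes a function, and each { } repetition becomes a loop. The sketch below is one possible rendering, not a definitive implementation. Since the slides do not define <factor>, it is assumed here to be an unsigned integer constant, and the name evaluate is hypothetical.

```python
import re

def evaluate(text):
    """Parse and evaluate text with one function per EBNF rule."""
    tokens = re.findall(r"\d+|[-+*/]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        # Assumption: <factor> is an unsigned integer constant.
        nonlocal pos
        token = peek()
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected a constant, found {token!r}")
        pos += 1
        return int(token)

    def term():
        # <term> -> <factor> {(* | /) <factor>}
        nonlocal pos
        value = factor()
        while peek() in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():
        # <expr> -> <term> {(+ | -) <term>}
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result

print(evaluate("8 - 4 - 2"))   # 2: the loop applies operators left to right
print(evaluate("2 + 3 * 4"))   # 14: <term> binds tighter than <expr>
```

The loops give left associativity, and the expr/term layering gives * and / higher precedence than + and -, mirroring what the left-recursive BNF rules express.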



Recent Variations in EBNF

• Alternative RHSs are put on separate lines
• Use of a colon instead of →
• Use of opt for optional parts
• Use of oneof for choices



Static Semantics

• Nothing to do with meaning
• Context-free grammars (CFGs) cannot
describe all of the syntax of programming
languages
• Categories of constructs that are trouble:
- Context-free, but cumbersome (e.g.,
types of operands in expressions)
- Non-context-free (e.g., variables must
be declared before they are used)
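
The declare-before-use rule is easy to check with a pass that carries context, even though a context-free grammar cannot express it. The sketch below assumes a hypothetical flattened program representation, a list of ("declare", name) and ("use", name) pairs; the function name is likewise an assumption.

```python
def undeclared_uses(statements):
    """Return names that are used before any declaration of them.

    statements is a list of ("declare", name) or ("use", name) pairs,
    a hypothetical flattened form of a program.
    """
    declared = set()
    errors = []
    for kind, name in statements:
        if kind == "declare":
            declared.add(name)
        elif kind == "use" and name not in declared:
            errors.append(name)
    return errors

program = [("declare", "a"), ("use", "a"), ("use", "b"), ("declare", "b")]
print(undeclared_uses(program))  # ['b']
```

The set declared is exactly the context that a CFG rule has no way to consult.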



Attribute Grammars

• Attribute grammars (AGs) have additions
to CFGs to carry some semantic info on
parse tree nodes

• Primary value of AGs:
– Static semantics specification
– Compiler design (static semantics checking)
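
As a rough illustration of semantic information carried on parse tree nodes, the sketch below computes a type attribute bottom-up and then checks a predicate on an assignment. The attribute name actual_type, the int/real typing rule, and the VAR_TYPES table are assumptions for this example, not a definition taken from the slides.

```python
# Hypothetical declared types for variables; in practice these come from declarations.
VAR_TYPES = {"a": "int", "b": "int", "c": "real"}

def actual_type(node):
    """Compute a synthesized type attribute bottom-up over a parse tree.

    A node is either a variable name or a tuple ("+", left, right).
    """
    if isinstance(node, str):
        return VAR_TYPES[node]
    _, left, right = node
    left_type, right_type = actual_type(left), actual_type(right)
    # Semantic rule: the sum is int only when both operands are int.
    return "int" if left_type == right_type == "int" else "real"

def assignment_is_type_correct(target, expr):
    """Predicate: the expression's type must equal the target's declared type."""
    return actual_type(expr) == VAR_TYPES[target]

print(actual_type(("+", "a", "b")))                      # int
print(assignment_is_type_correct("a", ("+", "a", "c")))  # False
```

The grammar rule supplies the tree shape; the attribute function and the predicate supply the static semantics check.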



Semantics

• There is no single widely acceptable
notation or formalism for describing
semantics
• Several needs for a methodology and
notation for semantics:
– Programmers need to know what statements mean
– Compiler writers must know exactly what language
constructs do
– Correctness proofs would be possible
– Compiler generators would be possible
– Designers could detect ambiguities and inconsistencies



Operational Semantics

• Operational Semantics
– Describe the meaning of a program by
executing its statements on a machine, either
simulated or actual. The change in the state of
the machine (memory, registers, etc.) defines
the meaning of the statement
• To use operational semantics for a high-
level language, a virtual machine is needed
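
A toy simulator makes the idea concrete: the meaning of a statement is the sequence of state changes it causes on the machine. The two-instruction set below ("set" and "add") and the hand translation of a = b + 1 are assumptions invented for this sketch.

```python
def run(program, state):
    """Execute low-level instructions and return every machine state reached.

    Hypothetical instruction set: ("set", var, value), ("add", dst, src1, src2).
    """
    state = dict(state)
    trace = [dict(state)]
    for instruction in program:
        op = instruction[0]
        if op == "set":
            _, var, value = instruction
            state[var] = value
        elif op == "add":
            _, dst, src1, src2 = instruction
            state[dst] = state[src1] + state[src2]
        else:
            raise ValueError(f"unknown instruction {op!r}")
        # Record a snapshot so the change of state is visible step by step.
        trace.append(dict(state))
    return trace

# The statement  a = b + 1  translated by hand for the idealized machine.
for snapshot in run([("set", "t", 1), ("add", "a", "b", "t")], {"b": 4}):
    print(snapshot)
```

The printed snapshots show the state before execution and after each instruction; the difference between the first and last snapshot is the operational meaning of the statement.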



Operational Semantics (continued)

• The process:
– Build a translator (translates source code to the
machine code of an idealized computer)
– Build a simulator for the idealized computer
• Evaluation of operational semantics:
– Good if used informally (language manuals, etc.)
– Extremely complex if used formally. It was used
for describing semantics of PL/I.



Denotational Semantics

• Based on recursive function theory
• The most abstract semantics description
method
• Originally developed by Scott and Strachey
(1970)



Denotational Semantics - continued

• The process of building a denotational
specification for a language:
– Define a mathematical object for each language
entity
– Define a function that maps instances of the
language entities onto instances of the
corresponding mathematical objects
• The meaning of language constructs is
defined by only the values of the program's
variables



Denotational Semantics: program state

• The state of a program is the values of all
its current variables
s = {<i1, v1>, <i2, v2>, …, <in, vn>}

• Let VARMAP be a function that, when given
a variable name and a state, returns the
current value of the variable
VARMAP(ij, s) = vj
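
The state and VARMAP translate directly into code. In this sketch the state is a set of (name, value) pairs, mirroring the notation above. The function meaning_of_sum is a hypothetical semantic function added only to show how a construct's meaning depends on the state alone.

```python
def varmap(name, state):
    """VARMAP(i_j, s) = v_j: look up the value paired with a name."""
    for identifier, value in state:
        if identifier == name:
            return value
    raise KeyError(name)

def meaning_of_sum(left, right, state):
    """A hypothetical semantic function: the meaning of `left + right` in a state."""
    return varmap(left, state) + varmap(right, state)

s = {("i", 3), ("j", 4)}
print(varmap("j", s))               # 4
print(meaning_of_sum("i", "j", s))  # 7
```

Both functions are pure mappings from syntax and state to values, with no machine being simulated.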



Evaluation of Denotational Semantics

• Can be used to prove the correctness of
programs
• Provides a rigorous way to think about
programs
• Can be an aid to language design
• Has been used in compiler generation
systems
• Because of its complexity, it is of little use
to language users



Axiomatic Semantics

• Based on formal logic
• Original purpose: formal program
verification
• Axioms or inference rules are defined for
each statement type in the language (to
allow transformations of logic expressions
into more formal logic expressions)
• The logic expressions are called assertions



Axiomatic Semantics (continued)
• An assertion before a statement (a
precondition) states the relationships and
constraints among variables that are true at
that point in execution
• An assertion following a statement is a
postcondition
• A weakest precondition is the least
restrictive precondition that will guarantee
the postcondition



Axiomatic Semantics Form

• Pre-, post form: {P} statement {Q}

• An example
– a = b + 1 {a > 1}
– One possible precondition: {b > 10}
– Weakest precondition: {b > 0}
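
The example can be sanity-checked by brute force over a finite sample of integers. This is a test, not a proof, and the helper name guarantees_postcondition is an assumption of the sketch.

```python
def guarantees_postcondition(precondition, values):
    """True if every b in values that satisfies precondition ends with a > 1."""
    for b in values:
        if precondition(b):
            a = b + 1          # the statement under study
            if not a > 1:      # the postcondition {a > 1}
                return False
    return True

sample = range(-50, 51)  # a finite sample of integers, so this is a check, not a proof
print(guarantees_postcondition(lambda b: b > 10, sample))   # True
print(guarantees_postcondition(lambda b: b > 0, sample))    # True
print(guarantees_postcondition(lambda b: b > -1, sample))   # False: b = 0 gives a = 1
```

Both {b > 10} and {b > 0} guarantee the postcondition, but relaxing the condition any further to {b > -1} fails, which is consistent with {b > 0} being the weakest precondition.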



Program Proof Process

• The postcondition for the entire program is
the desired result
– Work back through the program to the first
statement. If the precondition on the first
statement is the same as the program
specification, the program is correct.



Evaluation of Axiomatic Semantics

• Developing axioms or inference rules for all
of the statements in a language is difficult
• It is a good tool for correctness proofs, and
an excellent framework for reasoning about
programs
• Its usefulness in describing the meaning of
a programming language is limited for
language users and compiler writers



Denotational Semantics vs Operational
Semantics
• In operational semantics, the state changes
are defined by coded algorithms
• In denotational semantics, the state
changes are defined by rigorous
mathematical functions



Summary

• BNF and context-free grammars are
equivalent meta-languages
– Well-suited for describing the syntax of
programming languages
• An attribute grammar is a descriptive
formalism that can describe both the
syntax and the semantics of a language
• Three primary methods of semantics
description
– Operational, axiomatic, denotational

