Professional Documents
Culture Documents
• Scope
– Static vs. Dynamic scoping
– Implementation: symbol tables
• Types
– Statically vs. Dynamically typed languages
Parsing cannot catch some errors Performs checks of many kinds ...
Examples:
Some language constructs are not context-free 1.All used identifiers are declared
– Example: Identifier declaration and use 2.Identifiers declared only once
– An abstract version of the problem is: 3.Types
{ wcw | w ∈ (a + b)* } 4.Procedures and functions defined only once
5.Procedures and functions used with the right
– The 1st w represents the identifier’s declaration; number and type of arguments
the 2nd w represents a use of the identifier And others . . .
name: character string (obtained from scanner) • The scope of an identifier (a binding of a name
scope: to the entity it names) is the textual part of
type: the program in which the binding is active
- integer
- array: • Scope matches identifier declarations with uses
• number of dimensions
• upper and lower bounds for each dimension
– Important static analysis step in most languages
• type of elements
– function:
• number and type of parameters (in order)
• type of returned value
• size of stack frame
• The scope of an identifier is the portion of a • Most languages have static (lexical) scope
program in which that identifier is accessible – Scope depends only on the physical structure of
program text, not its run-time behavior
– The determination of scope is made by the compiler
• The same identifier may refer to different
– C, Java, ML have static scope; so do most languages
things in different parts of the program
– Different scopes for same name don’t overlap • A few languages are dynamically scoped
– Lisp, SNOBOL
– Lisp has changed to mostly static scoping
• An identifier may have restricted scope
– Scope depends on execution of the program
Program scopes (input, output); • With dynamic scope, bindings cannot always be
var a: integer; resolved by examining the program because
procedure first; With static scope they are dependent on calling sequences
rules, it prints 1
begin a := 1; end; • Dynamic scope rules are usually encountered in
procedure second; interpreted languages
var a: integer; With dynamic scope
begin first; end; rules, it prints 2 • Also, usually these languages do not normally
begin have static type checking as type
a := 2; second; write(a); determination is not always possible when
end. dynamic rules are in effect
• In most programming languages identifier • Not all kinds of identifiers follow the most-
bindings are introduced by closely nested scope rule
– Function declarations (introduce function names)
– Procedure definitions (introduce procedure names) • For example, function declarations
– Identifier declarations (introduce identifiers) – often cannot be nested
– Formal parameters (introduce identifiers) – are globally visible throughout the program
• Much of semantic analysis can be expressed as • Example: the scope of variable declarations is
a recursive descent of an AST one subtree
– Process an AST node n let integer x ← 42 in E
– Process the children of n
– Finish processing the AST node n
• x can be used in subtree E
Common implementations: linked lists, hash tables • A symbol table is a data structure that tracks
the current bindings of identifiers
Compile Design I (2011) 23 Compile Design I (2011) 24
A Simple Symbol Table Implementation Limitations
• enter_scope() start/push a new nested scope • Function names can be used prior to their
• find_symbol(x) finds current x (or null) definition
• add_symbol(x) add a symbol x to the table • We can’t check that for function names
• check_scope(x) true if x defined in current – using a symbol table
– or even in one pass
scope
• exit_scope() exits/pops the current scope • Solution
– Pass 1: Gather all function/procedure names
– Pass 2: Do the checking
• Semantic analysis requires multiple passes
– Probably more than two
• Consensus
– A type is a set of values and
What are the types of $r1, $r2, $r3?
– A set of operations on those values
• Certain operations are legal for values of each • A language’s type system specifies which
type operations are valid for which types
– It doesn’t make sense to add a function pointer and • The goal of type checking is to ensure that
an integer in C operations are used with the correct types
– Enforces intended interpretation of values,
– It does make sense to add two integers because nothing else will!
– But both have the same assembly language • Type systems provide a concise formalization
implementation! of the semantic checking rules
• Competing views on static vs. dynamic typing • In practice, most code is written in statically
typed languages with an “escape” mechanism
• Static typing proponents say: – Unsafe casts in C, Java
– Static checking catches many programming errors
at compile time • It is debatable whether this compromise
– Avoids overhead of runtime type checks represents the best or worst of both worlds