CSCI 312 Study Guide

Programming Languages:
Programming Languages and Natural Languages
Purpose: Both facilitate expression and communication
Programming languages facilitate communication between people and computers
and have a narrower expressive domain.
Four properties of programming languages:
1. Syntax
2. Naming
3. Types
4. Semantics

For any language:
1. Its designers must define these properties
2. Its programmers must master these properties
Syntax
The syntax of a programming language is a precise description of all its
grammatically correct programs.
Questions tackled:
1. What are the basic statements for the language?
2. How do I write something?
3. Why is this a syntax error?
Naming
Many entities in a program have names: variables, types, functions, parameters,
classes, objects, …
Named entities are bound in a running program to:
1. Scope
2. Visibility
3. Type
4. Lifetime

Types
A type is a collection of values and a collection of operations on those values.
1. Simple types – numbers, characters, Booleans, …
2. Structured types – strings, lists, trees, hash tables, …
3. Complex types – functions, classes
A language’s type system can help to determine legal operations and detect type
errors

Semantics
The meaning of a program is called its semantics.
Questions tackled:
1. What does each statement mean?
2. What underlying model governs run-time behavior, such as function calls?
3. How are objects allocated to memory at run-time?
4. How do interpreters work in relation to semantics?

Paradigms
A programming paradigm is a pattern of problem-solving thought that underlies a
particular genre of programs and languages.
There are four main programming paradigms:
1. Imperative
2. Object-oriented
3. Functional
4. Logic (declarative)
Hybrid models also exist (e.g., C++ is imperative/OO)

Imperative Paradigm
- Program and data are indistinguishable in memory
- Program = a sequence of commands
- State = values of all variables when program runs
- Large programs use procedural abstraction
Example imperative languages: C, Fortran, Cobol, Ada, Perl

Object-oriented (OO) Paradigm
An OO program is a collection of objects that interact by passing messages that
transform the state.
We learn about:
1. Sending messages
2. Inheritance
3. Polymorphism
Example OO languages: Java, Smalltalk, C++, C#, Python, and Ruby

Functional Paradigm
Functional programming models a computation as a collection of mathematical
functions
1. Input = domain
2. Output = range
Functional languages are characterized by:

1. Functional composition
2. Recursion
3. No assignment
Example functional languages: Lisp, Scheme, ML, Haskell, …

Logic Paradigm
Logic programming declares what outcome the program should accomplish, rather
than how it should be accomplished (rule based).
When studying logic programming we see:
1. Programs as sets of constraints on a problem
2. Programs that achieve all possible solutions
3. Programs that are nondeterministic
Example logic programming languages: Prolog

Other topics:
1. Event-Handling
   a. Unpredictable: mouse click, keystroke, message arrival
   b. Often implemented with OO or imperative programming languages
   c. Examples: Java applets, Tcl/Tk, Visual Basic
2. Concurrency
   a. Occurs with synchronization or parallel execution
   b. Can occur within any paradigm
3. Correctness
   a. Can be proven correct

Characteristics of a successful language:
1. Simplicity and readability
   a. Small instruction set: Java vs. Scheme
   b. Simple syntax: C/C++/Java vs Python
   c. Benefits: ease of programming and learning
2. Clarity about binding
   a. A language element is bound to a property at the time that property is
      defined for it.
   b. So a binding is the association between an object and a property of that
      object.
   c. Examples:
      i. A variable and its type and its value
   d. Early binding takes place at compile-time
   e. Late binding takes place at run-time
3. Reliability
   a. Program behavior is the same on different platforms
   b. Type errors are detected
   c. Semantic errors are properly trapped
   d. Leaks are prevented
4. Support
   a. Accessible (public domain) compilers/interpreters

   b. Good texts and tutorials
   c. Wide community of users
   d. Integrated with IDEs
5. Abstraction
   a. Data
      i. Programmer-defined types/classes
      ii. Class libraries
   b. Procedural
      i. Programmer-defined functions
      ii. Standard function libraries
6. Orthogonality
   a. A language is orthogonal if its features are built upon a small, mutually
      independent set of primitive operations.
   b. Fewer exceptional rules = conceptual simplicity
      i. Restricting types of arguments to a function
   c. Tradeoffs with efficiency
7. Efficient implementation
   a. Embedded systems
      i. Real time responsiveness (navigation)
   b. Web applications
      i. Responsiveness to users
   c. Corporate database applications
      i. Efficient search and updating
   d. AI applications
      i. Modeling human behaviors

Compilers and Virtual Machines
Compiler – produces machine code
Example: Fortran, Cobol, C, C++
Interpreter – executes instructions on a virtual machine
Example: Scheme, Haskell, Python
Hybrid compilation/interpretation – The Java Virtual Machine (JVM)

Syntax – a precise description of all its grammatically correct programs
Three levels:
1. Lexical syntax – all the basic symbols of the language (names, values,
   operators, etc.)
2. Concrete syntax – rules for writing expressions, statements and programs
3. Abstract syntax – internal representation of the program, favoring content
   over form.

Grammars
A metalanguage is a language used to define other languages.

A grammar is a metalanguage used to define the syntax of a language.
Our interest: Using grammars to define the syntax of a programming language.

Example: Derivation of 352 as an Integer
Consider the grammar:
   Integer → Digit | Integer Digit
   Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Rightmost derivation (the rightmost nonterminal is replaced at each step):
   Integer ⇒ Integer Digit ⇒ Integer 2 ⇒ Integer Digit 2 ⇒ Integer 5 2
           ⇒ Digit 5 2 ⇒ 3 5 2
Leftmost derivation (the leftmost nonterminal is replaced at each step):
   Integer ⇒ Integer Digit ⇒ Integer Digit Digit ⇒ Digit Digit Digit
           ⇒ 3 Digit Digit ⇒ 3 5 Digit ⇒ 3 5 2

Parse trees
A parse tree is a graphical representation of a derivation.
1. Each internal node of the tree corresponds to a step in the derivation
2. The children of a node represent a right-hand side of a production.

3. Each leaf node represents a symbol of the derived string, reading from left
   to right.
E.g., the step Integer ⇒ Integer Digit appears in the parse tree as:
[Figure: an Integer node whose children are Integer and Digit]
[Figure: Parse Tree for 352 as an Integer]

Arithmetic Expression Grammar:
The following grammar defines the language of arithmetic expressions with
1-digit integers, addition, and subtraction.
   Expr → Expr + Term | Expr − Term | Term
   Term → 0 | … | 9 | ( Expr )
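As a minimal sketch of how the arithmetic expression grammar above can drive a parser, the following Python evaluator follows the two rules directly. The function names (`parse_term`, `parse_expr`, `evaluate`) are illustrative and not from the course material, and the left-recursive Expr rule is rewritten as a loop, which is one common way to handle it rather than the only one.

```python
DIGITS = "0123456789"


def parse_term(s, pos):
    """Term -> 0 | ... | 9 | ( Expr ). Returns (value, next position)."""
    if pos < len(s) and s[pos] in DIGITS:
        return int(s[pos]), pos + 1
    if pos < len(s) and s[pos] == "(":
        value, pos = parse_expr(s, pos + 1)
        if pos >= len(s) or s[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    raise SyntaxError(f"unexpected input at position {pos}")


def parse_expr(s, pos=0):
    """Expr -> Expr + Term | Expr - Term | Term.

    The left-recursive rule is handled as a loop, Term {(+|-) Term},
    which keeps + and - left-associative.
    """
    value, pos = parse_term(s, pos)
    while pos < len(s) and s[pos] in "+-":
        op = s[pos]
        right, pos = parse_term(s, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos


def evaluate(text):
    """Parse and evaluate a whole string; reject leftover input."""
    s = text.replace(" ", "")
    value, pos = parse_expr(s)
    if pos != len(s):
        raise SyntaxError(f"unexpected input at position {pos}")
    return value


print(evaluate("5 - 4 + 3"))    # 4, i.e. (5 - 4) + 3
print(evaluate("5 - (4 + 3)"))  # -2
```

Because the grammar only allows 1-digit integers, an input such as "12" is rejected with a SyntaxError.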

Associativity and precedence
Precedence   Associativity   Operators
3            Right           **
2            Left            * / %
1            Left            + -
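Python happens to use the same precedence and associativity for these operators, so the table can be checked directly. This is only an illustration in one language; other languages may define the operators differently.

```python
# ** is right-associative: 2 ** 3 ** 2 groups as 2 ** (3 ** 2).
right_assoc = 2 ** 3 ** 2

# + and - share a precedence level and are left-associative:
# 7 - 4 + 2 groups as (7 - 4) + 2.
left_assoc = 7 - 4 + 2

# * has higher precedence than +: 2 + 3 * 4 groups as 2 + (3 * 4).
precedence = 2 + 3 * 4

# % and * share a level and are left-associative: (20 % 6) * 2.
same_level = 20 % 6 * 2

print(right_assoc, left_assoc, precedence, same_level)  # 512 5 14 4
```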

A grammar is ambiguous if one of its strings has two or more different parse
trees.
C, C++, and Java have a large number of operators and precedence levels.
[Figure: Ambiguous Parse of 5 – 4 + 3]
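To see why ambiguity matters, the two possible parse trees for 5 – 4 + 3 can be written as nested tuples and evaluated. The tuple encoding and the `eval_tree` helper are assumptions made for this sketch, not notation from the course.

```python
def eval_tree(tree):
    """Evaluate a parse tree given as an int leaf or (op, left, right)."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    lhs, rhs = eval_tree(left), eval_tree(right)
    return lhs + rhs if op == "+" else lhs - rhs


# Tree 1: subtraction grouped first, (5 - 4) + 3.
left_grouped = ("+", ("-", 5, 4), 3)

# Tree 2: addition grouped first, 5 - (4 + 3).
right_grouped = ("-", 5, ("+", 4, 3))

print(eval_tree(left_grouped), eval_tree(right_grouped))  # 4 -2
```

The same string yields two different values, so the grammar needs precedence and associativity rules to pick one tree.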

if (x < 0) if (y < 0) y = y - 1; else y = 0;
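The statement above has two possible readings, depending on which if the else attaches to. Python's indentation makes each reading explicit, so the sketch below writes out both; the function names are hypothetical. C-like languages resolve the ambiguity by attaching the else to the nearest unmatched if, which corresponds to the first function.

```python
def else_with_inner_if(x, y):
    """The else attaches to the nearest if (y < 0)."""
    if x < 0:
        if y < 0:
            y = y - 1
        else:
            y = 0
    return y


def else_with_outer_if(x, y):
    """The else attaches to the outer if (x < 0)."""
    if x < 0:
        if y < 0:
            y = y - 1
    else:
        y = 0
    return y


# The two readings disagree whenever the else branch is reached.
print(else_with_inner_if(1, 5), else_with_outer_if(1, 5))    # 5 0
print(else_with_inner_if(-1, 5), else_with_outer_if(-1, 5))  # 0 5
```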

Associativity is only used when there are two or more operators of same
precedence.

Lexical syntax
Input: a stream of characters from the ASCII set, keyed by a programmer
Output: a stream of tokens or basic symbols, classified as follows:
1. Identifiers: (Stack, x, i, push)
2. Literals: (123, 'x', 3.25, true)
3. Keywords: (bool, char, else, false, float, if, int, main, true)
4. Operators
5. Punctuation
No token may contain embedded whitespace (unless it is a character or string
literal)

Compilers and interpreters
Lexer:
1. Input: characters
2. Output: tokens
3. Separate:

   a. Speed
   b. Simpler design
   c. Character sets
   d. End of line conventions
Parser:
1. Based on BNF/EBNF grammar
2. Input: tokens
3. Output: abstract syntax tree (parse tree)
4. Abstract syntax: parse tree with punctuation, many nonterminals discarded
Semantic Analysis:
1. Check that all identifiers are declared
2. Perform type checking
3. Insert implied conversion operators (make them explicit)
Code optimization
1. Evaluate constant expressions at compile-time
2. Reorder code to improve cache performance
3. Eliminate common subexpressions
4. Eliminate unnecessary code
Code generation
1. Output: machine code
2. Instruction selection
3. Register management
4. Peephole optimization
Interpreter:
1. Replaces last 2 phases of a compiler
2. Mixed interpreters: Java, Perl, Python, Haskell, Scheme
3. Pure interpreters: most Basics, shell commands

The shape of the parse tree reveals the meaning of the program.
So we want a tree that removes its inefficiency and keeps its shape.
1. Remove separator/punctuation terminal symbols
2. Remove all trivial root nonterminals
3. Replace remaining nonterminals with leaf terminals.
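The lexer and parser phases can be observed with Python's standard `tokenize` and `ast` modules. This sketch uses Python's own lexer and parser as a stand-in for the generic phases described above; the sample source string and the helper `tree` are assumptions for illustration.

```python
import ast
import io
import tokenize

source = "y = (5 - 4) + 3"

# Lexer phase: characters in, classified tokens out.
# Tokens whose text is empty (end-of-input markers) are dropped.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()
]
print(tokens)


def tree(expr):
    """Return a printable form of the abstract syntax tree for expr."""
    return ast.dump(ast.parse(expr, mode="eval"))


# Parser phase: parentheses are punctuation, so they vanish from the
# abstract syntax tree while the shape of the tree is preserved.
same_shape = tree("(5 - 4) + 3") == tree("5 - 4 + 3")
different_shape = tree("5 - (4 + 3)") != tree("5 - 4 + 3")
print(same_shape, different_shape)  # True True
```

The redundant parentheses leave no trace in the abstract syntax, whereas parentheses that change the grouping produce a differently shaped tree.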

Recall that the term binding is an association between an entity (such as a
variable) and a property (such as its value).
A binding is static if the association occurs before run-time.
A binding is dynamic if the association occurs at run-time.
Name bindings play a fundamental role.
The lifetime of a variable name refers to the time interval during which memory
is allocated.

Reserved words:
1. Cannot be used as identifiers
2. Usually identify major constructs: if while switch
3. Predefined identifiers: e.g., library routines

Variables
Basic bindings
1. Name
2. Address
3. Type
4. Value
5. Lifetime
L-value: use of a variable name to denote its address
R-value: use of a variable name to denote its value.
Some languages support/require explicit dereferencing

Scope
The scope of a name is the collection of statements which can access the name
binding.
In static scoping, a name is bound to a collection of statements according to
its position in the source program.
Most modern languages use static (or lexical) scoping.
Two different scopes are either nested or disjoint.
In disjoint scopes, the same name can be bound to different entities without
interference.
The scope in which a name is defined or declared is called its defining scope.
A reference to a name is nonlocal if it occurs in a nested scope of the
defining scope; otherwise, it is local.
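One common way for a translator to track nested scopes is a stack of dictionaries, one dictionary per open scope. The sketch below is a minimal illustration under that assumption; the class name `ScopeStack`, its method names, and the string bindings are hypothetical.

```python
class ScopeStack:
    """Nested scopes as a stack of dictionaries mapping name -> binding."""

    def __init__(self):
        self.scopes = [{}]  # the outermost (global) scope

    def enter(self):
        """Entering a scope pushes a new, empty dictionary."""
        self.scopes.append({})

    def exit(self):
        """Exiting a scope pops its dictionary, discarding its names."""
        self.scopes.pop()

    def declare(self, name, binding):
        """A declaration is recorded in the innermost open scope."""
        self.scopes[-1][name] = binding

    def lookup(self, name):
        """Search from the innermost scope outward."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"undeclared name: {name}")


table = ScopeStack()
table.declare("x", "global int")
table.enter()
table.declare("y", "local float")
print(table.lookup("y"))  # local reference: found in the innermost scope
print(table.lookup("x"))  # nonlocal reference: found in the defining scope
table.exit()
print(table.lookup("x"))  # still visible after the nested scope closes
```

After `table.exit()`, looking up `y` raises NameError, since its defining scope is gone.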

Symbol Table
A symbol table is a data structure kept by a translator that allows it to keep
track of each declared name and its binding.
Assume for now that each name is unique within its local scope.
The data structure can be any implementation of a dictionary, where the name is
the key.
1. Each time a scope is entered, push a new dictionary onto the stack.
2. Each time a scope is exited, pop a dictionary off the top of the stack.
3. For each name declared, generate an appropriate binding and enter the
   name-binding pair into the dictionary on the top of the stack.
4. Given a name reference, search the dictionary on top of the stack:
   a. If found, return the binding
   b. Otherwise, repeat the process on the next dictionary down in the stack.
   c. If the name is not found in any dictionary, report an error.

Resolving References
For static scoping, the referencing environment for a name is its defining
scope and all nested subscopes.
The referencing environment defines the set of statements which can validly
reference a name.

Dynamic Scoping
In dynamic scoping, a name is bound to its most recent declaration based on the
program’s call history.
Symbol table for each scope built at compile time, but managed at run time.
Scope pushed/popped on stack when entered/exited.

Visibility
A name is visible if its referencing environment includes the reference and the
name is not redeclared in an inner scope.
A name redeclared in an inner scope effectively hides the outer declaration.
Some languages provide a mechanism for referencing a hidden name.

Overloading
Overloading uses the number or type of parameters to distinguish among
identical function names or operators.
Examples:
1. +, -, *, / can be float or int

2. + can be float or int addition or string concatenation in Java

Lifetime
The lifetime of a variable is the time interval during which the variable has
been allocated a block of memory.
Earliest languages used static allocation.
Algol introduced the notion that memory should be allocated/deallocated at
scope entry/exit.
Remainder of section considers mechanisms which break scope equals lifetime
rule.
C:
1. Global compilation scope: static
2. Explicitly declaring a variable static
3. Remark: Java also allows a variable to be declared static

A type system provides a basis for detecting type errors.
A type error is any error that arises because an operation is attempted on a
data type for which it is undefined.
A type system imposes constraints such as the values used in an addition must
be numeric.
A language is statically typed if the types of all variables are fixed when
they are declared at compile time.
A language is dynamically typed if the type of a variable can vary at run time
depending on the value assigned.
A language is strongly typed if its type system allows all type errors in a
program to be detected either at compile time or at run time.

In most languages, the numeric types are finite in size.
So a + b may overflow the finite range.
Unlike mathematics: a + (b + c) ≠ (a + b) + c
Also, in C-like languages, the equality and relational operators produce an
int, not a Boolean.
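A short Python sketch can make three of the points above concrete. Python integers do not overflow, so floating point values are used here to show the effect of finite-size numeric types; the specific values are chosen only for illustration.

```python
# Finite precision: floating point addition is not associative.
a, b, c = 1e17, -1e17, 1.0
left = (a + b) + c   # 0.0 + 1.0
right = a + (b + c)  # b + c rounds back to -1e17, so the sum is 0.0
print(left, right)   # 1.0 0.0

# Dynamic typing: the type bound to a name can change at run time.
value = 42
value = "forty-two"
print(type(value).__name__)  # str

# A type error: subtraction is undefined for a string and an integer.
# Python detects this at run time instead of producing a bad result.
try:
    "abc" - 1
    caught = False
except TypeError:
    caught = True
print(caught)  # True
```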

An operator or function is overloaded when its meaning varies depending on the
types of its operands or arguments or result.
Meaning of an operator changes based on the types of its operands:
1. Integer add
2. Floating point add
3. String concatenation
Mixed mode: int + float
A type conversion is a narrowing conversion if the result type permits fewer
bits, thus potentially losing information.

A function or operation is polymorphic if it can be applied to any one of
several related types and achieve the same result.
An advantage of polymorphism is that it enables code reuse.
Example: overloaded built-in operators and functions
   + - * / == != ...
Java: + also used for string concatenation
A generic function or procedure is a template that can be instantiated at
compile time with concrete types and operators.
Generic procedures are examples of parametric polymorphism.
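The overloading of + and the idea of a generic function can both be sketched in Python. Note the assumptions: Python resolves the meaning of + at run time, and its type variables are hints for external checkers rather than templates instantiated at compile time, so the `swap` function (a hypothetical example) only approximates a generic procedure in a statically typed language.

```python
from typing import Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def swap(pair: Tuple[A, B]) -> Tuple[B, A]:
    """Parametric polymorphism: one definition works for any element types."""
    first, second = pair
    return (second, first)


# The meaning of + depends on the operand types.
print(1 + 2)        # integer add: 3
print(1.5 + 2.25)   # floating point add: 3.75
print("ab" + "cd")  # string concatenation: abcd
print(1 + 2.5)      # mixed mode, the int operand is converted: 3.5

# The same swap code is reused with different concrete types.
print(swap((1, "one")))    # ('one', 1)
print(swap((2.0, [3])))    # ([3], 2.0)
```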