You are on page 1of 26

CSCI 431 Programming Languages

Fall 2003

Syntax Analysis
(Section 2.2-2.3)

A modification of slides developed by Felix


Hernandez-Campos at UNC Chapel Hill

1
Review: Compilation/Interpretation

Source Code

Compiler or Interpreter
Interpre-
Translation Execution
tation

Target Code

2
Review: Syntax Analysis

• Specifying the form


Source Code
of a programming
language

– Tokens Compiler or Interpreter


» Regular Expressions
(also F.A.s & Reg. Grammars)
Translation Execution

– Syntax
» Context-Free
Grammars Target Code
(also P.D.A.s)

3
Phases of Compilation

4
Syntax Analysis

• Syntax:
– Webster’s definition: 1 a : the way in which linguistic
elements (as words) are put together to form constituents
(as phrases or clauses)
• The syntax of a programming language
– Describes its form
» Organization of tokens
» Context Free Grammars (CFGs)
– Must be recognizable by compilers and interpreters
» Parsing
» LL and LR parsers

5
Context Free Grammars

• CFGs
– Add recursion to regular expressions
» Nested constructions
– Notation
expression  identifier | number | - expression
| ( expression )
| expression operator expression
operator  + | - | * | /

» Terminal symbols
» Non-terminal symbols
» Production rule (i.e. substitution rule)
terminal symbol  terminal and non-terminal symbols

6
Parsing

• Parsing an arbitrary Context Free Grammar


– O(n3)
– Too slow for large programs
• Linear-time parsing
– LL parsers (a ‘Left-to-right, Left-most’ derivation)
» Recognize LL grammar
» Use a top-down strategy
– LR parsers (a ‘Left-to-right, Right-most’ derivation)
» Recognize LR grammar
» Use a bottom-up strategy

7
Parsing example

• Example: comma-separated list of identifier

– CFG

id_list  id id_list_tail
id_list_tail  , id_list_tail
id_list_tail  ;

– Parsing

A, B, C;

8
Top-down derivation of A, B, C;

CFG

Left-to-right,
Left-most derivation
LL(1) parsing

9
Top-down derivation of A, B, C;

CFG

10
Bottom-up parsing of A, B, C;

CFG

Left-to-right,
Right-most derivation
LR parsing
(a shift-reduce parser)

11
Bottom-up parsing of A, B, C;

CFG

12
Bottom-up parsing of A, B, C;

CFG

13
LR Parsing vs. LL Parsing

• LL
– A ‘top-down’ or ‘predictive’ parser
– Predict needed productions based on the current left-most
non-terminal in the tree and the current input token
– The top-of-stack contains the left-most non-terminal
– The stack contains a record of what the parser expects to
see
• LR
– A ‘bottom-up’ or shift-reduce parser
– Shifts tokens onto the stack until it recognizes a right-hand
side then reduces those tokens to their left-hand side
– The stack contains a record of what the parser has already
seen

14
An appropriate LR Grammar

id_list  id_list_prefix ;
id_list_prefix  id_list_prefix , id
 id
This grammar can’t be parsed top-down!
Problems for LL grammars:
- left recursion, example above
- common prefixes, example:
stmt  id := expr | id (arg_list)
15
LL(1) Grammar for the Calculator
Language

16
LR(1) Grammar for the Calculator
Language

17
Hierarchy of Linear Parsers

• Basic containment relationship


– All CFGs can be recognized by LR parser
– Only a subset of all the CFGs can be recognized by LL
parsers

CFGs LR parsing

LL parsing

18
Bigger Picture

• Chomsky Hierarchy of Grammars


Unrestricted Grammar

Context Sensitive Grammar

Context Free Grammar

Regular
Grammar

19
Implementation of an LL Parser

• Two options:
– A recursive descent parser (section 2.2.3)
» For LL grammars only
– Parse table and a driver (section 2.2.5)
» LR parsers covered in section 2.2.6

20
Recursive Descent Parser Example

• LL(1) grammar

21
Recursive Descent Parser Example

• Outline of
recursive parser

– This parser only


verifies syntax

– match is
the scanner

22
Recursive Descent Parser Example

23
Recursive Descent Parser Example

24
Recursive Descent Parser Example

A program that develops recursive decent


parsers: JavaCC

25
Semantic Analysis

• Specifying the meaning


Source Code
of a programming
language
Compiler or Interpreter
– Attribute Grammars
Translation Execution

Target Code

26

You might also like