Fall 2003
Syntax Analysis
(Section 2.2-2.3)
1
Review: Compilation/Interpretation
[Diagram labels: Source Code; Compiler or Interpreter; Translation; Execution; Interpretation; Target Code]
2
Review: Syntax Analysis
– Syntax
» Context-Free Grammars (also P.D.A.s)
[Diagram label: Target Code]
3
Phases of Compilation
4
Syntax Analysis
• Syntax:
– Webster’s definition: 1 a : the way in which linguistic
elements (as words) are put together to form constituents
(as phrases or clauses)
• The syntax of a programming language
– Describes its form
» Organization of tokens
» Context Free Grammars (CFGs)
– Must be recognizable by compilers and interpreters
» Parsing
» LL and LR parsers
5
Context Free Grammars
• CFGs
– Add recursion to regular expressions
» Nested constructions
– Notation
expression → identifier | number | - expression
           | ( expression )
           | expression operator expression
operator → + | - | * | /
» Terminal symbols
» Non-terminal symbols
» Production rule (i.e. substitution rule)
non-terminal symbol → sequence of terminal and non-terminal symbols
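As a rough illustration of the notation above, the expression grammar can be written down as plain data. This is only a sketch: the dictionary layout and the helper names `non_terminals` and `terminals` are assumptions made for this example, not part of the course material. It treats `identifier` and `number` as tokens delivered by the scanner.

```python
# Each non-terminal maps to its list of alternatives; each alternative is a
# sequence of symbols. The layout is an illustrative assumption.
GRAMMAR = {
    "expression": [
        ["identifier"],
        ["number"],
        ["-", "expression"],
        ["(", "expression", ")"],
        ["expression", "operator", "expression"],
    ],
    "operator": [["+"], ["-"], ["*"], ["/"]],
}


def non_terminals(grammar):
    """Symbols that appear on the left-hand side of some production."""
    return set(grammar)


def terminals(grammar):
    """Symbols that appear only on right-hand sides."""
    used = {sym for alts in grammar.values() for alt in alts for sym in alt}
    return used - set(grammar)
```

The recursion that regular expressions lack shows up directly in the data: `expression` appears inside its own alternatives, which is what allows nested constructions.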
6
Parsing
7
Parsing example
– CFG
id_list → id id_list_tail
id_list_tail → , id id_list_tail
id_list_tail → ;
– Parsing
A, B, C;
8
Top-down derivation of A, B, C;
CFG
Left-to-right,
Left-most derivation
LL(1) parsing
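The top-down derivation can be sketched with an explicit stack and a small prediction table. This is a minimal sketch, assuming the id_list grammar written with an `id` after each comma; the names `TABLE`, `tokenize`, and `ll1_parse` are hypothetical, and the toy scanner simply treats every word other than `,` and `;` as an `id`.

```python
# Grammar (assumed, with the "id" after the comma written out):
#   id_list      -> id id_list_tail
#   id_list_tail -> , id id_list_tail
#   id_list_tail -> ;
# The table maps (left-most non-terminal, current input token) to the
# production to predict.
TABLE = {
    ("id_list", "id"): ["id", "id_list_tail"],
    ("id_list_tail", ","): [",", "id", "id_list_tail"],
    ("id_list_tail", ";"): [";"],
}
NON_TERMINALS = {"id_list", "id_list_tail"}


def tokenize(text):
    """Hypothetical scanner: ',' and ';' are tokens, any other word is an id."""
    spaced = text.replace(",", " , ").replace(";", " ; ")
    return [w if w in {",", ";"} else "id" for w in spaced.split()]


def ll1_parse(text):
    """Return the productions predicted, in order (a left-most derivation)."""
    tokens = tokenize(text) + ["$"]      # "$" marks end of input
    stack = ["$", "id_list"]             # top of stack is the end of the list
    pos = 0
    steps = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NON_TERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"unexpected {look!r} while expanding {top}")
            steps.append((top, rhs))
            stack.extend(reversed(rhs))  # left-most symbol ends up on top
        elif top == look:
            pos += 1                     # terminal matched: consume the token
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return steps


if __name__ == "__main__":
    for lhs, rhs in ll1_parse("A, B, C;"):
        print(lhs, "->", " ".join(rhs))
```

At every step the stack holds what the parser still expects to see, and one token of lookahead is enough to choose the production, which is where the "1" in LL(1) comes from.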
9
Bottom-up parsing of A, B, C;
CFG
Left-to-right,
Right-most derivation (discovered in reverse)
LR parsing
(a shift-reduce parser)
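The shift-reduce idea can be sketched without real LR tables. The loop below is a deliberate simplification, not an LR parser generator's output: it tries the listed right-hand sides in order (longest first) against the top of the stack and shifts only when none matches. That ordering is an assumption that happens to suffice for this tiny grammar. The names `RULES` and `shift_reduce` are hypothetical.

```python
# Grammar (assumed, with the "id" after the comma written out):
#   id_list      -> id id_list_tail
#   id_list_tail -> , id id_list_tail
#   id_list_tail -> ;
RULES = [  # (left-hand side, right-hand side), longest right-hand side first
    ("id_list_tail", [",", "id", "id_list_tail"]),
    ("id_list", ["id", "id_list_tail"]),
    ("id_list_tail", [";"]),
]


def shift_reduce(tokens):
    """Return the sequence of shift/reduce actions for a token list."""
    stack, trace = [], []
    rest = list(tokens)
    while True:
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:     # right-hand side on top of stack
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append(f"reduce {' '.join(rhs)} -> {lhs}")
                break
        else:                                # nothing to reduce: shift
            if not rest:
                break
            stack.append(rest.pop(0))
            trace.append(f"shift {stack[-1]}")
    if stack != ["id_list"]:
        raise SyntaxError(f"could not reduce input; stack = {stack}")
    return trace


if __name__ == "__main__":
    for action in shift_reduce(["id", ",", "id", ",", "id", ";"]):
        print(action)
```

With this right-recursive grammar every token of A, B, C; is shifted before the first reduction can happen, so the stack grows with the length of the list.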
11
LR Parsing vs. LL Parsing
• LL
– A ‘top-down’ or ‘predictive’ parser
– Predict needed productions based on the current left-most
non-terminal in the tree and the current input token
– The top-of-stack contains the left-most non-terminal
– The stack contains a record of what the parser expects to
see
• LR
– A ‘bottom-up’ or shift-reduce parser
– Shifts tokens onto the stack until it recognizes a right-hand
side then reduces those tokens to their left-hand side
– The stack contains a record of what the parser has already
seen
14
An appropriate LR Grammar
id_list → id_list_prefix ;
id_list_prefix → id_list_prefix , id
               → id
This grammar can’t be parsed top-down!
Problems for LL grammars:
- left recursion, example above
- common prefixes, example:
stmt → id := expr | id ( arg_list )
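Both problems can be spotted mechanically in simple cases. The sketch below is an assumption-laden illustration: it only looks at the literal first symbol of each alternative, so it finds immediate left recursion and directly visible common prefixes, not the indirect cases that need FIRST sets. The function names and the dictionary encoding of the two grammars are hypothetical.

```python
def immediate_left_recursion(grammar):
    """Non-terminals with an alternative that starts with the non-terminal itself."""
    return sorted(
        lhs for lhs, alts in grammar.items()
        if any(alt and alt[0] == lhs for alt in alts)
    )


def common_prefixes(grammar):
    """Non-terminals with two alternatives that begin with the same symbol."""
    found = []
    for lhs, alts in grammar.items():
        firsts = [alt[0] for alt in alts if alt]
        if len(firsts) != len(set(firsts)):
            found.append(lhs)
    return sorted(found)


# The LR-friendly id_list grammar and the stmt example, encoded as data.
LR_ID_LIST = {
    "id_list": [["id_list_prefix", ";"]],
    "id_list_prefix": [["id_list_prefix", ",", "id"], ["id"]],
}
STMT = {
    "stmt": [["id", ":=", "expr"], ["id", "(", "arg_list", ")"]],
}

if __name__ == "__main__":
    print(immediate_left_recursion(LR_ID_LIST))
    print(common_prefixes(STMT))
```

A top-down parser that expands id_list_prefix would immediately need to expand id_list_prefix again without consuming any input, and one token of lookahead cannot choose between the two stmt alternatives because both start with id.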
15
LL(1) Grammar for the Calculator Language
16
LR(1) Grammar for the Calculator Language
17
Hierarchy of Linear Parsers
CFGs LR parsing
LL parsing
18
Bigger Picture
Regular
Grammar
19
Implementation of an LL Parser
• Two options:
– A recursive descent parser (section 2.2.3)
» For LL grammars only
– Parse table and a driver (section 2.2.5)
» LR parsers covered in section 2.2.6
20
Recursive Descent Parser Example
• LL(1) grammar
21
Recursive Descent Parser Example
• Outline of recursive parser
– match is the scanner
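The shape of a recursive descent parser can be sketched as one routine per non-terminal plus a `match` helper. This sketch uses the small id_list grammar rather than the calculator grammar. The class name `Parser`, the `input_token` property, and the pre-tokenized input list are assumptions made for this example; here `match` stands in for the scanner interface by checking and consuming the current token.

```python
class Parser:
    """Recursive descent parser for (assumed grammar):
         id_list      -> id id_list_tail
         id_list_tail -> , id id_list_tail
         id_list_tail -> ;
    """

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    @property
    def input_token(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Consume the current token if it is the expected one."""
        if self.input_token != expected:
            raise SyntaxError(
                f"expected {expected!r}, found {self.input_token!r}")
        self.pos += 1

    def id_list(self):
        self.match("id")
        self.id_list_tail()

    def id_list_tail(self):
        # One token of lookahead selects the production.
        if self.input_token == ",":
            self.match(",")
            self.match("id")
            self.id_list_tail()
        elif self.input_token == ";":
            self.match(";")
        else:
            raise SyntaxError(f"unexpected {self.input_token!r} in id_list_tail")


def parse(tokens):
    """Return True if the whole token list is an id_list; raise otherwise."""
    parser = Parser(tokens)
    parser.id_list()
    parser.match("$")
    return True


if __name__ == "__main__":
    print(parse(["id", ",", "id", ",", "id", ";"]))
```

Each routine mirrors one non-terminal, and the call stack plays the role that the explicit stack plays in a table-driven LL parser.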
22
Semantic Analysis
Target Code
26