Compiler Design
Carl Wu
Three topics
• Syntax Grammar vs. AST
• Component(?)-based grammar
• Aspect-oriented grammar
Grammar vs. AST (I)
RCFG Specification
Stmt :: Block | IfStmt | AssignStmt
IfStmt :: “if” Exp “then” Stmt
AssignStmt :: IdUse “:=” Exp
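One way to read the specification above as an AST: a nonterminal with alternatives becomes an abstract class, and each alternative becomes a subclass whose fields are the non-keyword symbols on its right-hand side. The following is a minimal Python sketch under assumptions, not the slide's own tooling: Block, Exp, and IdUse are not defined on the slide, so their fields here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


class Stmt:
    """Abstract class for: Stmt :: Block | IfStmt | AssignStmt"""


@dataclass
class Exp:
    text: str  # assumption: the slide does not define Exp


@dataclass
class IdUse:
    name: str  # assumption: the slide does not define IdUse


@dataclass
class Block(Stmt):
    # assumption: a block is a list of statements
    stmts: List[Stmt] = field(default_factory=list)


@dataclass
class IfStmt(Stmt):
    # IfStmt :: "if" Exp "then" Stmt  (the keywords are not stored in the AST)
    cond: Exp
    then: Stmt


@dataclass
class AssignStmt(Stmt):
    # AssignStmt :: IdUse ":=" Exp
    target: IdUse
    value: Exp


# AST for the input: if x then y := 1
tree = IfStmt(Exp("x"), AssignStmt(IdUse("y"), Exp("1")))
```

The concrete tokens “if”, “then”, and “:=” guide the parser but leave no trace in the tree, which is the main difference between the syntax grammar and the AST.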
[Figure: AST diagram; only the node label Stmt survives]
[Figure: side-by-side diagrams captioned “Modularized grammar” and “Component-based grammar”, with boxes labeled Grammar, Module, Component, and Parser]
Benefits
• Benefits from modularized grammar
– Easy to read, write, change
– Eliminate naming conflicts
• Additional benefits of component-based grammar
– Each component can be designed, developed, and tested individually.
– A change to one component does not require recompiling the other components.
– Different types of grammars or parsing algorithms can be used for different components, e.g., one component can be LL and another LALR.
Difficulty in designing component-based grammar
• No clear guards between two components.
– Switch control to a new parser, or stay in the same one?
– Suitable for embedded languages, e.g., JScript in HTML
– Not suitable for an integral language, e.g., Java
• Too much coupling between two components.
– Reuse is not limited to the component as a whole; internal productions and symbols may also be reused.
– Not applicable to LR parsers: once the table is built, the internal productions cannot be reused (there is no way to jump into a table).
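The embedded-language case works because the guards are explicit. A minimal Python sketch of a host parser handing control to a component parser at a guard, assuming `<script>` and `</script>` as the guard tokens; `parse_html` and `parse_script` are hypothetical stand-ins, not real parsers:

```python
import re

# Guard tokens that mark where the embedded component starts and ends.
GUARD = re.compile(r"<script>(.*?)</script>", re.DOTALL)


def parse_script(source):
    """Hypothetical component parser: only splits statements on ';'."""
    return [s.strip() for s in source.split(";") if s.strip()]


def parse_html(source):
    """Hypothetical host parser: returns ('text', str) and ('script', list) nodes."""
    nodes = []
    pos = 0
    for match in GUARD.finditer(source):
        if match.start() > pos:
            # Host language region: stay in the same parser.
            nodes.append(("text", source[pos:match.start()]))
        # Guarded region: switch control to the component parser.
        nodes.append(("script", parse_script(match.group(1))))
        pos = match.end()
    if pos < len(source):
        nodes.append(("text", source[pos:]))
    return nodes


doc = "<p>hi</p><script>a = 1; b = 2</script><p>bye</p>"
nodes = parse_html(doc)
```

An integral language such as Java has no comparable delimiters between, say, statements and expressions, so there is no obvious point at which one parser could hand off to another.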
Ideal vs. reality
[Figure: two diagrams (ideal, reality) of Java grammar components, each with boxes labeled Java, Class, Interface, Type, Object_type, Array, Statement, Expression, Binary_expr, Unary_expr, and Primary]
Suggestions?
Aspect-oriented grammar
• Join point: a grammar pattern that crosscuts multiple productions
– e.g., punctuation, identifiers, modifiers…
Example
• “;” appears 25 times in one of the Java grammars
• “.” appears 74 times in one of the COBOL grammars
• Every one of them must be carefully placed!
<Sentence> ::= <Accept Stm> '.'
             | <Add Stm> '.'
             | <Add Stm Ex> <End-Add Opt> '.'
             | <Call Stm> '.'
             | <Call Stm Ex> <End-Call Opt> '.'
             | <Close Stm> '.'
             | <Compute Stm> '.'
             | <Compute Stm Ex> <End-Compute Opt> '.'
             | <Display Stm> '.'
             | <Divide Stm> '.'
             | <Divide Stm Ex> <End-Divide Opt> '.'
             | <Evaluate Stm> <End-Evaluate Opt> '.'
             | <If Stm> <End-If Opt> '.'
             | <Move Stm> '.'
             | <Move Stm Ex> <End-Move Opt> '.'
             | <Multiply Stm> '.'
             | <Multiply Stm Ex> <End-Multiply Opt> '.'
             | <Open Stm> '.'
             | <Perform Stm> '.'
             | <Perform Stm Ex> <End-Perform Opt> '.'
             | <Read Stm> '.'
             | <Read Stm Ex> <End-Read Opt> '.'
             | <Release Stm> '.'
             | <Rewrite Stm> '.'
             | <Rewrite Stm Ex> <End-Rewrite Opt> '.'
             | <Set Stm> '.'
             | <Start Stm> '.'
             | <Start Stm Ex> <End-Start Opt> '.'
             | <String Stm> '.'
             | <String Stm Ex> <End-String Opt> '.'
             | <Subtract Stm> '.'
             | <Subtract Stm Ex> <End-Substract Opt> '.'
             | <Write Stm> '.'
             | <Write Stm Ex> <End-Write Opt> '.'
             | <Unstring Stm> '.'
             | <Unstring Stm Ex> <End-Unstring Opt> '.'
             | <Misc Stm> '.'
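The repeated '.' is the kind of crosscutting pattern an aspect could supply once. A minimal Python sketch of the idea, assuming a grammar is stored as a dict from nonterminal to a list of alternatives; `weave_terminator` and `to_bnf` are hypothetical helpers written for this illustration, not part of any existing tool:

```python
def weave_terminator(productions, terminator="'.'"):
    """Return a new grammar with `terminator` appended to every alternative."""
    return {
        lhs: [alt + [terminator] for alt in alts]
        for lhs, alts in productions.items()
    }


def to_bnf(productions):
    """Render the grammar as one BNF line per nonterminal."""
    lines = []
    for lhs, alts in productions.items():
        rhs = " | ".join(" ".join(alt) for alt in alts)
        lines.append(f"{lhs} ::= {rhs}")
    return "\n".join(lines)


# Base grammar: three of the alternatives above, written without the period.
base = {
    "<Sentence>": [
        ["<Accept Stm>"],
        ["<Add Stm>"],
        ["<Add Stm Ex>", "<End-Add Opt>"],
    ]
}

# The "aspect": every sentence ends with a period, stated in one place.
woven = weave_terminator(base)
print(to_bnf(woven))
```

The base grammar stays free of punctuation, and the placement rule for the period lives in a single function instead of being repeated in every alternative.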
Result
[Figure: diagram with boxes labeled grammar and Parser]
What do you think?