2/17/2012 CAPSL 1
GENERAL COMPILER
OVERVIEW
Compiler Overview
Lexer/Scanner
• Lexical Analysis
– process of converting a sequence of characters
into a sequence of tokens.
Input: foo = 1 - 3**2
Lexeme    Token Type
foo       Variable
=         Assignment Operator
1         Number
-         Subtraction Operator
3         Number
**        Power Operator
2         Number
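The table above can be reproduced with a small hand-written scanner. This is only a sketch of the idea, not what flex generates: the names `token_type`, `struct token` and `next_token` are made up for illustration, and only the operators that appear in the example input are handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token types from the table above (names are illustrative). */
enum token_type {
    TOK_VARIABLE, TOK_NUMBER, TOK_ASSIGN, TOK_MINUS, TOK_POWER,
    TOK_UNKNOWN, TOK_END
};

struct token {
    enum token_type type;
    char lexeme[32];          /* the matched characters */
};

/* Scan one token starting at *src and advance *src past it. */
struct token next_token(const char **src)
{
    struct token tok = { TOK_END, "" };
    const char *p = *src;

    while (isspace((unsigned char)*p))      /* skip whitespace between tokens */
        p++;
    if (*p == '\0') {                       /* end of input */
        *src = p;
        return tok;
    }

    const char *start = p;
    if (isalpha((unsigned char)*p)) {
        while (isalpha((unsigned char)*p))
            p++;
        tok.type = TOK_VARIABLE;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        tok.type = TOK_NUMBER;
    } else if (p[0] == '*' && p[1] == '*') {
        p += 2;
        tok.type = TOK_POWER;
    } else if (*p == '=') {
        p++;
        tok.type = TOK_ASSIGN;
    } else if (*p == '-') {
        p++;
        tok.type = TOK_MINUS;
    } else {
        p++;
        tok.type = TOK_UNKNOWN;
    }

    /* Copy the lexeme, truncating if it does not fit. */
    size_t len = (size_t)(p - start);
    if (len >= sizeof tok.lexeme)
        len = sizeof tok.lexeme - 1;
    memcpy(tok.lexeme, start, len);
    tok.lexeme[len] = '\0';

    *src = p;
    return tok;
}
```

Calling `next_token` repeatedly on `foo = 1 - 3**2` until it returns `TOK_END` yields the seven tokens in the table, in order.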
Parser
• Syntactic Analysis
– The process of analyzing a sequence of tokens to determine its
grammatical structure.
– Syntax errors are identified during this stage.
Semantic Analyzer
• Semantic Analysis
– The process of checking that a syntactically valid
program is also meaningful.
• E.g. type checking, object binding, etc.
Optimizer(s)
• Compiler Optimizations
– Tune the output of a compiler to minimize or
maximize some attribute of an executable
program (e.g. run time or code size).
– Make programs faster, smaller, etc…
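One common optimization of this kind is constant folding: evaluating constant subexpressions at compile time. The sketch below folds a tiny expression tree such as the one for `1 - 3**2`. The node layout and the names `struct node`, `ipow` and `fold_constants` are assumptions made for illustration, not part of any real compiler.

```c
#include <stddef.h>

enum node_kind { NODE_NUM, NODE_VAR, NODE_SUB, NODE_POW };

struct node {
    enum node_kind kind;
    long value;               /* meaningful only when kind == NODE_NUM */
    struct node *lhs, *rhs;   /* operands of NODE_SUB and NODE_POW */
};

/* Integer power by repeated multiplication.
   Simplification: exponents <= 0 yield 1. */
static long ipow(long base, long exp)
{
    long result = 1;
    while (exp-- > 0)
        result *= base;
    return result;
}

/* Fold constant subtrees in place.
   Returns 1 if n is a constant afterwards, 0 otherwise. */
int fold_constants(struct node *n)
{
    if (n->kind == NODE_NUM)
        return 1;
    if (n->kind == NODE_VAR)
        return 0;             /* value unknown at compile time */

    /* Fold both operands first, even if only one turns out constant. */
    int lhs_const = fold_constants(n->lhs);
    int rhs_const = fold_constants(n->rhs);
    if (!lhs_const || !rhs_const)
        return 0;

    long a = n->lhs->value;
    long b = n->rhs->value;
    n->value = (n->kind == NODE_SUB) ? a - b : ipow(a, b);
    n->kind = NODE_NUM;       /* the operator node becomes a literal */
    n->lhs = n->rhs = NULL;
    return 1;
}
```

Applied to the tree for `1 - 3**2`, the root becomes a single number node, so the generated program no longer has to compute anything at run time.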
Code Generator
• Code Generation
– process by which a compiler's code generator
converts some intermediate representation of
source code into a form (e.g., machine code)
that can be readily executed by a machine.
C source:
int foo()
{
    return 345;
}
Generated MIPS assembly:
foo:
    addiu $sp, $sp, -16
    addiu $2, $zero, 345
    addiu $sp, $sp, 16
    jr $ra
LEX/FLEX AND
YACC/BISON OVERVIEW
General Lex/Flex Information
• lex
– A tool to generate lexical analyzers.
– It was written by Mike Lesk and Eric Schmidt
(the Google guy).
– It is rarely used today; flex has largely replaced it.
• flex (fast lexical analyzer generator)
– Free and open source alternative.
– You’ll be using this.
General Yacc/Bison Information
• yacc
– Is a tool to generate parsers (syntactic
analyzers).
– Generated parsers require a lexical analyzer.
– It is rarely used today; bison has largely replaced it.
• bison
– Free and open source alternative.
– You’ll be using this.
Lex/Flex and Yacc/Bison relation to a
compiler toolchain
[Diagram: Lex/Flex takes a .l spec file; Yacc/Bison takes a .y spec file.]
FLEX IN DETAIL
How Flex Works
Flex .l specification file
/*** Definition section ***/
%{
/* C code to be copied verbatim */
%}
%%
/*** Rules section ***/
%%
/*** C Code section ***/
Flex Rule Format
Flex Regex Matching Rules
• Flex picks the longest match:
– Input: abc
– Rule: [a-z]+
– Token: abc (not "a" or "ab")
• If several rules match the same length, Flex uses the rule listed first:
– Input: post
– Rule1: "post" { printf("Hello,"); }
– Rule2: [a-zA-Z]+ { printf("World!"); }
– It will print "Hello," (not "World!")
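The two matching rules above can be sketched in plain C. This is not how flex is implemented (flex compiles all patterns into one automaton); it only illustrates the selection policy. The matcher functions and `pick_rule` are hypothetical names standing in for the two rules on this slide.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A matcher returns how many leading characters of s it matches (0 = none). */
typedef size_t (*matcher_fn)(const char *s);

static size_t match_post(const char *s)      /* stands in for the rule "post" */
{
    return strncmp(s, "post", 4) == 0 ? 4 : 0;
}

static size_t match_letters(const char *s)   /* stands in for [a-zA-Z]+ */
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in the order they would appear in the .l file. */
static const matcher_fn rules[] = { match_post, match_letters };

/* Return the index of the winning rule, or -1 if no rule matches.
   *len receives the length of the winning match. */
int pick_rule(const char *input, size_t *len)
{
    int best = -1;
    *len = 0;
    for (int i = 0; i < (int)(sizeof rules / sizeof rules[0]); i++) {
        size_t n = rules[i](input);
        if (n > *len) {       /* strictly longer wins; a tie keeps the earlier rule */
            *len = n;
            best = i;
        }
    }
    return best;
}
```

For the input `post` both rules match four characters, so rule 0 wins. For `poster` the second rule matches six characters and wins on length, even though it is listed later.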
Flex Example
[0-9]+ {
    /* Match one or more characters between 0-9. */
    yylval.dval = atof(yytext);   /* Store the number. */
    return NUMBER;                /* Return the token type. Declared in the .y file. */
}
[A-Za-z]+ {
    /* Match one or more alphabetical characters. */
    struct symtab *sp = symlook(yytext);
    yylval.symp = sp;             /* Store the text. */
    return WORD;                  /* Return the token type. Declared in the .y file. */
}
. {
    /* Match any single character (except newline). */
    /* Return the character. No need to create a special symbol for this case. */
    return yytext[0];
}
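The slides do not show `struct symtab` or `symlook`. One plausible shape, assuming a small fixed-size table searched linearly, is sketched below. The `value` field is taken from the `$1->value` action used later; the `name` field, the table size `NSYMS` and the array `symbols` are assumptions.

```c
#include <stdlib.h>
#include <string.h>

#define NSYMS 20              /* assumed table capacity */

struct symtab {
    char *name;               /* NULL marks an unused slot */
    double value;
};

static struct symtab symbols[NSYMS];

/* Look up name, adding it to the table if it is not there yet.
   Returns NULL if the table is full or memory runs out. */
struct symtab *symlook(const char *name)
{
    for (struct symtab *sp = symbols; sp < &symbols[NSYMS]; sp++) {
        if (sp->name != NULL && strcmp(sp->name, name) == 0)
            return sp;                    /* already known */
        if (sp->name == NULL) {           /* first free slot: create the entry */
            size_t n = strlen(name) + 1;
            sp->name = malloc(n);
            if (sp->name == NULL)
                return NULL;
            memcpy(sp->name, name, n);
            return sp;
        }
    }
    return NULL;                          /* table full */
}
```

Because the lookup also inserts, the lexer action can call `symlook(yytext)` without checking whether the word has been seen before. The same entry comes back every time, so a value stored through it is visible to later uses of the name.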
BISON IN DETAIL
How Bison Works
What is a Grammar?
• A grammar
– is a set of formation rules for strings in a
formal language. The rules describe how to
form strings from the language's alphabet
(tokens) that are valid according to the
language's syntax.
Simple Example Grammar
E → E + E
  | E - E
  | E * E
  | E / E
  | id
• These are productions.
• Example input: 2+2-1
– Each number (e.g. 1) is a Number token, which the grammar treats as id.
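To see the grammar in action on 2+2-1, here is a small hand-written recursive-descent evaluator. It is a sketch, not the parser bison would generate. The grammar as written is ambiguous, so the code assumes the usual conventions: * and / bind tighter than + and -, and all four operators are left-associative. The function names are made up, `id` is taken to be an unsigned integer, and there is no error handling or whitespace skipping.

```c
#include <ctype.h>

static const char *cursor;    /* next unread character of the input */

/* id: here, an unsigned integer literal. */
static double parse_id(void)
{
    double v = 0;
    while (isdigit((unsigned char)*cursor))
        v = v * 10 + (*cursor++ - '0');
    return v;
}

/* Handles * and /, which bind tighter than + and -. */
static double parse_term(void)
{
    double v = parse_id();
    while (*cursor == '*' || *cursor == '/') {
        char op = *cursor++;
        double rhs = parse_id();
        v = (op == '*') ? v * rhs : v / rhs;
    }
    return v;
}

/* Handles + and -, combining terms left to right. */
double parse_expression(const char *input)
{
    cursor = input;
    double v = parse_term();
    while (*cursor == '+' || *cursor == '-') {
        char op = *cursor++;
        double rhs = parse_term();
        v = (op == '+') ? v + rhs : v - rhs;
    }
    return v;
}
```

For 2+2-1 the loop in `parse_expression` first computes 2+2, then subtracts 1, which corresponds to the parse tree (2+2)-1 and gives 3.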
Bison .y specification file
/*** Definition section ***/
%{
/* C code to be copied verbatim */
%}
%%
/*** Rules section ***/
statement_list: statement '\n'
              | statement_list statement '\n'
              ;
expression: NUMBER
          | NAME { $$ = $1->value; }
          ;
%%
/*** C Code section ***/
Bison: definition Section Example
Bison: rules Section Example
expression: NUMBER
| NAME { $$ = $1->value; }
ABOUT YOUR
ASSIGNMENT
What you need to do
Hints
Credit
• Wikipedia
– Most of the content is taken from or based on
information from here.
• Wookieepedia
– Nothing was taken from here. *