Getting Started with YACC

Structure of a YACC File
• Has the same three-part structure that Lex uses.
• The parts are separated by lines containing %%.
• The three parts even have the same names:
  – Definition section
  – Rules section
  – Code section (copied directly into the generated program)

The Definition Section
• Declare the tokens used in the grammar and the types of values used on the stack here.
• Tokens that are single-quoted characters like '=' or '+' do not need to be declared.
• Literal C code can be included in a block in this section using %{ … %}.

Declaring Tokens
• The tokens that are used in the grammar must be declared.
• This is done by including lines like the one below in the definition section:
    %token CHARACTERSTRING INTEGER IDENTIFIER
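In the generated C code, each name declared with %token ends up as a distinct integer constant that the lexer can return. The sketch below imitates that by hand. The enum name, the helper token_name, and the specific numbers are assumptions for illustration; real tools choose their own values, generally above 255 so that the names cannot collide with single-character tokens such as '+'.

```c
/* Hand-written stand-in for the token constants YACC would generate.
   The numbers are illustrative assumptions, not the values any
   particular YACC implementation is guaranteed to emit. */
enum token_code {
    CHARACTERSTRING = 258,
    INTEGER,            /* 259 */
    IDENTIFIER          /* 260 */
};

/* Hypothetical helper: map a token code back to a printable name. */
const char *token_name(int code)
{
    switch (code) {
    case CHARACTERSTRING: return "CHARACTERSTRING";
    case INTEGER:         return "INTEGER";
    case IDENTIFIER:      return "IDENTIFIER";
    default:              return "(character or unknown token)";
    }
}
```

Because the named codes sit above the range of a char, a lexer can return either a named token or a plain character such as '+' through the same int.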

The Rules Section
• The rules of the grammar are placed here.
• Here is an example of the basic syntax.
  Grammar:
    Expr → INTEGER + INTEGER | INTEGER - INTEGER
  YACC grammar definition:
    expr : INTEGER '+' INTEGER {action}
         | INTEGER '-' INTEGER {action}
         ;

YACC Actions
• Similar to Lex, actions can be defined that will be performed whenever a production is applied in the stream of tokens.
• These are usually included after the production whose action is to be defined.
• Since every symbol in the grammar has a corresponding value, it will be necessary to access those values.
• Accessing the YACC stack will be the way to do this.

Accessing the Stack
• Since YACC generates an LR parser, it will push the symbols that it reads along with their values on a stack until it is ready to reduce.
• To access these values, include a dollar sign with a number to get at each value in the production in the action definition.

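Accessing the Stack
    expr : INTEGER '+' INTEGER {$$ = $1 + $3;}
         | INTEGER '-' INTEGER {$$ = $1 - $3;}
         ;
• $$ refers to the value of the left nonterminal.
• $1 and $3 are the values of the first and third symbols on the right-hand side; $2 is the operator.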
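To make the $-notation concrete, here is a minimal hand-written sketch of what a reduction does to the value stack. It is not the code YACC generates; the struct, the function names, and the fixed stack size are all assumptions chosen for illustration.

```c
#include <stddef.h>

#define STACK_MAX 16   /* arbitrary illustrative limit */

/* A tiny value stack, standing in for the one a YACC parser maintains. */
struct value_stack {
    int    values[STACK_MAX];
    size_t depth;
};

/* Shift: push the value of a symbol that was just read.
   Returns 0 on success, -1 if the stack is full. */
int vs_push(struct value_stack *s, int v)
{
    if (s->depth >= STACK_MAX)
        return -1;
    s->values[s->depth++] = v;
    return 0;
}

/* Reduce by   expr : INTEGER '+' INTEGER  { $$ = $1 + $3; }
   The top three slots hold the values of the three right-hand-side
   symbols. They are popped and replaced by the single value $$.
   Returns 0 on success, -1 if fewer than three values are stacked. */
int reduce_add(struct value_stack *s)
{
    int *rhs;
    int result;

    if (s->depth < 3)
        return -1;
    rhs = &s->values[s->depth - 3];   /* rhs[0] is $1, rhs[1] is $2, rhs[2] is $3 */
    result = rhs[0] + rhs[2];         /* $$ = $1 + $3 */
    s->depth -= 3;                    /* pop the right-hand side */
    s->values[s->depth++] = result;   /* push the value of expr */
    return 0;
}
```

Pushing 2, a placeholder for '+', and 3, then calling reduce_add, leaves a single value on the stack: the value of expr.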

Where do Tokens and Their Values Come From?
• Typically from the lexer.
• The parser generated by YACC (yyparse) calls the lexer generated by Lex (yylex) each time it needs the next token.

Revisiting Lex
• The Lex file will have to be modified to work with the YACC parser in two main places.
• In the definition section, include this statement:
    #include "y.tab.h"
• That is a header file created by YACC when the parser is generated (with most implementations, when YACC is run with the -d option).
• The actions for the rules need to be changed too.

Revisiting Lex Actions
• Include a return statement for the token name (this is the same name that is defined at the top of the YACC file).
• For tokens with a value, assign that value to yylval. YACC can read the value from that variable.
    if            {return IF;}
    [1-9][0-9]*   {yylval = atoi(yytext); return INTEGER;}
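The contract between lexer and parser can be sketched without Lex at all. The hand-written yylex below follows the same two rules: return the token code, and store any value in yylval first. The token numbers, the set_input helper, and the use of strtol in place of atoi(yytext) are assumptions made so that the sketch stands alone; a Lex-generated scanner is organized differently inside.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed token codes; in a real build they come from y.tab.h. */
enum { IF = 258, INTEGER = 259 };

int yylval;                              /* value of the most recent token */
static const char *input_cursor = "";    /* hypothetical input source */

/* Hypothetical helper: point the lexer at a string to scan. */
void set_input(const char *text)
{
    input_cursor = text;
}

int yylex(void)
{
    while (isspace((unsigned char)*input_cursor))
        input_cursor++;

    if (*input_cursor == '\0')
        return 0;                        /* 0 signals end of input to the parser */

    /* Keyword "if": no value, just return the token code. */
    if (strncmp(input_cursor, "if", 2) == 0 &&
        !isalnum((unsigned char)input_cursor[2])) {
        input_cursor += 2;
        return IF;
    }

    /* [1-9][0-9]* : store the value, then return the token code. */
    if (*input_cursor >= '1' && *input_cursor <= '9') {
        char *end;
        yylval = (int)strtol(input_cursor, &end, 10);
        input_cursor = end;
        return INTEGER;
    }

    /* Any other character is returned as its own token, e.g. '+'. */
    return (unsigned char)*input_cursor++;
}
```

After set_input("if 42 +"), successive calls return IF, then INTEGER with yylval set to 42, then '+', then 0.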

The %union Declaration
• Different tokens have different data types.
• INTEGER are integers, FLOAT are floats, CHARACTERSTRING are char *, IDENTIFIER are pointers to the entry in the symbol table for that identifier.
• The %union will allow the parser to apply the right data type to the right token.

The %union Declaration
YACC Definition Section
    %union {
        int intValue;
        float floatValue;
    }
    %token <intValue> INTEGER
    %token <floatValue> FLOAT
Lex Rules Section
    … {yylval.intValue = atoi(yytext); return INTEGER;}
    … {yylval.floatValue = atof(yytext); return FLOAT;}
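On the C side, a %union like this one becomes an ordinary C union used as the type of yylval. The sketch below writes that union by hand and mimics the two Lex actions. The token numbers and the scan_number helper (which decides between the two cases by looking for a decimal point) are assumptions for illustration; in Lex, the patterns elided above would make that decision.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed token codes; in a real build they come from y.tab.h. */
enum { INTEGER = 258, FLOAT = 259 };

/* Hand-written equivalent of
       %union { int intValue; float floatValue; }            */
typedef union {
    int   intValue;
    float floatValue;
} YYSTYPE;

YYSTYPE yylval;   /* the lexer fills in exactly one member per token */

/* Hypothetical stand-in for the two Lex actions: given the matched
   text, store the value in the matching union member and return the
   token code that tells the parser which member to read. */
int scan_number(const char *yytext)
{
    if (strchr(yytext, '.') != NULL) {
        yylval.floatValue = (float)atof(yytext);
        return FLOAT;
    }
    yylval.intValue = atoi(yytext);
    return INTEGER;
}
```

The <intValue> and <floatValue> tags on the %token lines are what let the parser pick the right member when an action refers to $1, $3, and so on.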

