Small Interpreter

Lecture on Small Interpreter

Building a Small Interpreter
Using flex and bison

Lecture Building a Small Interpreter on the 17th–19th April 2018

Dušan Kolář
Faculty of Information Technology
Brno University of Technology

Aim of the Lecture

1 flex
2 bison
3 Utilities
4 Build it all

Contents

1 Introduction
2 Lexical Analysis
    Possibilities
    Token Description
    flex/lex
3 Syntax Analysis
    Syntax Constructs
    yacc/bison
4 Adding String for Print
5 Adding Functions
6 Read a Number
7 Build a Tree
8 Add While Statement
9 Summary

Interpreter Structure

Analyses
• Lexical
• Syntax
• Semantic

Synthesis
• Immediate action: store computation to variable, print value, etc.

Our Language

• Basic arithmetic expressions over floating point numbers
• Ternary operator
• Variables
• Printing
• Case insensitive
• Line oriented (empty lines allowed)

Example
    LET a = 23*(5.6 + 256.8)
    Print a*2+6.1
    LET b = a>5 ? 4.3 * a - 8.7 : a + 7.5

Possible Extensions

Too simple? Will add:
1 Strings for print
2 Right associative binary operator + built in functions
3 Statement for reading a number from input
4 Interpretation from internal representation
5 While statement addition

Lexical Analyzer

How to implement:
• Direct implementation
• State machine implementation
• Lexer constructor
• Possibly other ways...

Lexical Elements

What to recognize?
• Identifiers/keywords (yes, keywords are identifiers)
• Numbers: floating point ones
• White spaces and line ends
• Operators, e.g. +, *, <=, ...
• Anything else is an error, e.g. #

Lexical Element Description

Identifier
It is a non-empty sequence of letters, digits, and underscore symbols starting with either a letter or an underscore.

Regular expression
• Let [abc] be the set of symbols a, b, and c; it represents a single occurrence of one symbol from the set
• Then [abc]+ is a non-empty sequence of symbols from the set, e.g. a, bb, abc, ...
• Then [abc]* is a possibly empty sequence of symbols from the set
• Let [a-z] represent the set of all symbols from a to z
• Let [a-z][0-9] represent the concatenation of two sets/sequences
• THUS: [_a-zA-Z][_a-zA-Z0-9]*

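The identifier rule (a letter or underscore, followed by any number of letters, digits, or underscores) can also be checked by hand, which is roughly what a generated lexer does for us. The following is only a sketch; the function name is_identifier is made up for this illustration, and isalpha/isalnum are assumed to run in the default "C" locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s is a letter or underscore followed by any number of
   letters, digits or underscores; otherwise returns 0.
   is_identifier is a hypothetical helper, not part of the lecture code. */
int is_identifier(const char *s)
{
    if (s == NULL || !(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;                       /* first symbol: letter or underscore */
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_'))
            return 0;                   /* later symbols may also be digits */
    }
    return 1;
}
```

For example, is_identifier("_tmp1") succeeds, while "9lives" and the empty string are rejected.
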
Lexical Element Description

Definition of a floating point number
Floating point number is:
• Non-empty sequence of digits (in fact an integer), let us call it DIGITS
• DIGITS dot DIGITS, e.g. 3.14, 2.5
• DIGITS plus exponent, e.g. 1e10, 2E-4
• DIGITS dot DIGITS plus exponent, e.g. 1.2E+6, 7.1e-2

Lexical Element Description

Integer
• We can define a single DIGIT as [0-9]
• Then an integer called DIGITS is {DIGIT}+

Note: if we give a name to a regular expression, we must enclose the name in curly braces for further use.

Lexical Element Description

Floating point number format 1
• An exact match of a string is written as a quoted string, e.g. "AM"
• Let FLOAT1 be {DIGITS}"."{DIGITS}

Note: an unquoted dot represents any symbol except end of line.

Lexical Element Description

Floating point number format 2 and 3
• Notation [0-2]? represents zero or one occurrence of a symbol in the set
• Let EXP be [eE][-+]?{DIGITS}
• Let FLOAT2 be {DIGITS}{EXP}
• Let FLOAT3 be {DIGITS}"."{DIGITS}{EXP}

Note: if a literal minus symbol is in a set, it should be placed first (or last); between two symbols it denotes a range.

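The four number formats (DIGITS, DIGITS dot DIGITS, and either of them followed by an exponent) can be read as one hand-written matcher. This is a sketch for illustration only; skip_digits and is_number are hypothetical names, and the real lexer is generated by flex from the regular expressions.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the digits at the start of s, i.e. the length of a DIGITS match. */
static size_t skip_digits(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns 1 if the whole string is DIGITS, optionally followed by
   "." DIGITS and/or an exponent [eE][-+]?DIGITS; otherwise 0. */
int is_number(const char *s)
{
    size_t n = skip_digits(s);
    if (n == 0)
        return 0;                       /* every format starts with DIGITS */
    s += n;
    if (*s == '.') {                    /* optional "." DIGITS */
        n = skip_digits(s + 1);
        if (n == 0)
            return 0;
        s += 1 + n;
    }
    if (*s == 'e' || *s == 'E') {       /* optional exponent */
        s++;
        if (*s == '-' || *s == '+')
            s++;
        n = skip_digits(s);
        if (n == 0)
            return 0;
        s += n;
    }
    return *s == '\0';                  /* nothing may follow */
}
```

Note that, following the definitions, inputs such as ".5" or "3." are not numbers in this language, because DIGITS is required on both sides of the dot.
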
Lexical Element Description

White spaces and line end
• Let white space be [ \t\f\r], call it WSPC
• List of white spaces is {WSPC}+
• End of line is simply [\n]

Lexical Element Description

Operators
• If we have two regular expressions A and B then notation {A}|{B} represents choice, either A or B
• Single letter operators are simply a set [-+*/=<>?:()!]
• Two letter operators can be represented as a choice among several possibilities
    "=="|"!="|"<="|">="|"&&"|"||"

Lexical Element Description

Anything else
• Everything else is an error
• We will use dot for such a representation (as noted above)

About generators

• Originally generating C code, today even C++ compliant
• Regular expressions to define tokens/symbols
• Always match the longest text; if several definitions match the same text, the first one wins
• Manual should be studied for deep understanding.

Structure

Input text
    Definitions of regular expressions
    %%
    Definitions of tokens and actions
    %%
    C code

Note
Using the %{ and %} pair of brackets we can insert C code into the definitions of regular expressions too, typically at the beginning; the bracket symbols must be at the beginning of the line.

Demo

Example of reg. exp. definitions
    LETTER  ([_a-zA-Z])
    DIGIT   ([0-9])
    DIGITS  ({DIGIT}+)
    EXP     ([eE][-+]?{DIGITS})
    FLOAT1  ({DIGITS}"."{DIGITS})
    FLOAT2  ({DIGITS}{EXP})
    FLOAT3  ({DIGITS}"."{DIGITS}{EXP})
    IDENT   ({LETTER}({LETTER}|{DIGIT})*)

Demo

Example of tokens and actions
    {WSPC}   ; /* nothing to do, white space */
    {FLOAT}  {
                 sscanf(yytext,"%lf",
                        &(yyFloat(yylval)));
                 yyFlag(yylval) = fFLOAT;
                 return FLOAT;
             }

Demo

Example of inserted code
    %{
    #include <stdio.h>
    #include <string.h>

    #define KWLEN 2
    char *keywords[KWLEN] = {
        "let",
        "print",
    };
    %}

Statements

LET
It assigns the value of an expression to the given variable.

Sequence of
1 Keyword LET
2 Variable (variable name as an attribute)
3 Assignment symbol =
4 expression
5 End of line

Statements

PRINT
It prints the value of the given expression to the standard output.

Sequence of
1 Keyword PRINT
2 expression
3 End of line

Statements

It seems quite easy to recognize, as white spaces will be skipped by the lexical analyzer.

Nevertheless, what about the expression? Operators, precedence, brackets, etc.

Expression Representation

There are many possible representations, some of them are feasible.
• Postfix (reverse Polish) notation: no brackets, priorities solved, not simple to interpret
• Stack code: e.g. Java Virtual Machine, similar to postfix notation
• Three address code: no brackets, priorities solved, not simple to create
• Tree representation: no brackets, priorities solved, simple to traverse

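To make the postfix and stack-code entries more concrete, here is a minimal sketch of a stack-based evaluator for space-separated postfix text with non-negative numeric operands. The function eval_postfix and the fixed stack size of 64 are assumptions of this illustration, not part of the lecture code. In postfix, b*(c+4) is written as b c 4 + *.

```c
#include <ctype.h>
#include <stdlib.h>

/* Evaluates a space-separated postfix expression such as "2 3 4 + *".
   Returns 0 and stores the value in *result on success, -1 on error.
   Hypothetical helper for illustration only. */
int eval_postfix(const char *src, double *result)
{
    double stack[64];
    int top = 0;

    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        if (isdigit((unsigned char)*src)) {
            char *end;
            if (top == 64)
                return -1;                      /* stack overflow */
            stack[top++] = strtod(src, &end);   /* operand: push */
            src = end;
            continue;
        }
        if (top < 2)
            return -1;                          /* operator needs two operands */
        double r = stack[--top];
        double l = stack[--top];
        switch (*src++) {
        case '+': stack[top++] = l + r; break;
        case '-': stack[top++] = l - r; break;
        case '*': stack[top++] = l * r; break;
        case '/': stack[top++] = l / r; break;
        default:  return -1;                    /* unknown symbol */
        }
    }
    if (top != 1)
        return -1;                              /* leftover or missing operands */
    *result = stack[0];
    return 0;
}
```

With b = 2 and c = 3, the text "2 3 4 + *" evaluates to 14; no brackets or priorities are needed because the operand order already encodes them.
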
Tree Example 1: 2*a

      *
     / \
    2   a

Tree Example 2: b*(c+4)

      *
     / \
    b   +
       / \
      c   4

Expression Description

From such a representation a description of the expression can be derived.

Definition
A quadruple G = (N, T, P, S) is called a context-free grammar if N is a finite set of non-terminals, S is a starting non-terminal, S ∈ N, T is a finite set of terminals, T ∩ N = ∅, P is a finite non-empty set of productions of the form A → α, where A ∈ N, α ∈ (N ∪ T)*.

Generalization of Tree

If we have an expression a * (3 + b) and the appropriate tree representation:

      *              *
     / \            / \
    a   +          e   +
       / \            / \
      3   b          e   e

Generalization of Tree

We can see that leaves are variables and constants; operators are inner nodes. Tree structure denotes priority and associativity.

So we say
    a * (3 + b)
is
    e * (e + e)
where e is an expression.

More generally, there are two expression forms:
    e * e        e + e

Generalization of Tree

Thus, the additive and multiplicative operations can be described as:
    e → e + e
    e → e - e
    e → e * e
    e → e / e
    e → (e)
    e → VARIABLE
    e → CONSTANT

We can easily extend to other operations.

Priority and Associativity

On the grammar level, too clumsy to solve.

Basically, we have an endless hierarchy in priority.

Fortunately, just three associativities:
• Left, e.g. plus, minus
• Right, e.g. unary minus
• No associativity, e.g. comparison greater than

Priority and Associativity

Rules for associativity:
• %left token name(s)
• %right token name(s)
• %nonassoc token name(s)
For other terminals %term

Priority is given by the order: the higher the line number, the higher the priority.

Our Language

Operator priority and associativity
    %left OR
    %left AND
    %nonassoc EQ NE
    %nonassoc LE GE '<' '>'
    %left '+' '-'
    %left '*' '/'
    %right '!'
    %right UNOP

Priority Change

Override priority and associativity as in bison
    e → e + e
    e → e - e
    e → e * e
    e → e / e
    e → -e  %prec UNOP
    e → +e  %prec UNOP
    e → (e)
    e → VARIABLE
    e → CONSTANT

Expression in yacc/bison Notation

    expr
        : expr OR expr
        | expr AND expr
        | expr EQ expr
        | expr NE expr
        | expr LE expr
        | expr GE expr
        | expr '<' expr
        | expr '>' expr
        | expr '+' expr
        | expr '-' expr
        | expr '*' expr
        | expr '/' expr
        | '!' expr
        | '+' expr %prec UNOP
        | '-' expr %prec UNOP
        | '(' ternary ')'
        | FLOAT
        | IDENT

Ternary Operator

• The lowest priority
• Due to 3 (ternary) parameters it is better to solve on grammar level
• Will produce a warning; that is fine, as we bind the closer parts

Grammar part:
    ternary
        : ternary '?' ternary ':' ternary
        | expr
        ;

Structure: Similar to lex/flex

Input text
    Definitions of tokens and attributes
    %%
    Definition of grammar
    %%
    C code

Note
Using the %{ and %} pair of brackets we can insert C code into the definitions part too, typically at the beginning; the bracket symbols must be at the beginning of the line.

Tokens

    %left OR
    %left AND
    %nonassoc EQ NE
    %nonassoc LE GE '<' '>'
    %left '+' '-'
    %left '*' '/'
    %right '!'
    %right UNOP

    %token <s> FLOAT IDENT
    %term LET PRINT
    %type <s> expr ternary

Data Structure

Data associated with a token or non-terminal, e.g. sub-expression value, constant value, identifier name.

Belongs to the first part, among definitions:
    %union {
        struct sStackType {
            unsigned char flag;
            union {
                double vFloat;
                char *vStr;
            } u;
        } s;
    }

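The flag member tells which member of the inner union currently holds valid data. The sketch below shows how such a tagged value might be inspected in plain C; the numeric tag values and the describe function are assumptions of this illustration (the lecture keeps its tags in token.h, whose content is not shown here).

```c
#include <stdio.h>

enum { fFLOAT = 1, fSTR = 2 };          /* assumed tag values */

struct sStackType {
    unsigned char flag;                 /* says which member of u is valid */
    union {
        double vFloat;
        char  *vStr;
    } u;
};

/* Writes a printable form of the value into buf and returns buf.
   Hypothetical helper, not part of the lecture code. */
char *describe(const struct sStackType *v, char *buf, size_t len)
{
    if (v->flag == fFLOAT)
        snprintf(buf, len, "%g", v->u.vFloat);
    else if (v->flag == fSTR)
        snprintf(buf, len, "%s", v->u.vStr);
    else
        snprintf(buf, len, "?");        /* unknown tag */
    return buf;
}
```

Reading a member other than the one announced by flag gives meaningless data, which is why every semantic action that produces a value also sets the flag.
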
Remaining Part of the Grammar

    prog
        : progelem
        | prog progelem
        ;

    progelem
        : statement '\n'
        | '\n'
        ;

    statement
        : LET IDENT '=' ternary
        | PRINT ternary
        ;

Computation

Evaluation via semantic action
A semantic action is C code that is performed when a part of the expression has been analyzed.

Appropriate parts:
• Addition of two expressions is parsed
• Statement is parsed
• Constant/Variable is recognized
• Etc.

Expression

We add action to each operation, e.g.:
    | expr '>' expr
        {
            $$.flag = fFLOAT;
            if ($1.u.vFloat > $3.u.vFloat) {
                $$.u.vFloat = 1.0;
            } else {
                $$.u.vFloat = 0.0;
            }
        }
    | expr '+' expr
        {
            $1.u.vFloat += $3.u.vFloat;
            $$ = $1;
        }

Expression

We add action to each operation, e.g.:
    | '(' ternary ')'
        {
            $$ = $2;
        }
    | FLOAT
        /* $$ = $1 */
    | IDENT
        {
            $$.flag = fFLOAT;
            $$.u.vFloat = read($1.u.vStr);
            xfree($1.u.vStr);
        }

Statement

We add action to each statement:
    statement
        : LET IDENT '=' ternary
            {
                insertModify($2.u.vStr,$4.u.vFloat);
                xfree($2.u.vStr);
            }
        | PRINT ternary
            {
                printf("%g\n",$2.u.vFloat);
            }
        ;

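The actions rely on a symbol table that creates or updates a variable (insertModify) and looks up its value (read). The lecture's own implementation lives in symtab.c and is not shown here, so the following is only a guess at the idea, using a linked list. The names sym_insert_modify and sym_read are hypothetical stand-ins, and case folding is assumed to have been done by the lexer.

```c
#include <stdlib.h>
#include <string.h>

/* One variable: its name, current value, and the next list entry. */
struct sym {
    char *name;
    double value;
    struct sym *next;
};

static struct sym *symbols = NULL;      /* head of the variable list */

static struct sym *find(const char *name)
{
    for (struct sym *s = symbols; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0)
            return s;
    return NULL;
}

/* Creates the variable or overwrites its value; returns 0 on success,
   -1 when memory runs out. Stand-in for the lecture's insertModify. */
int sym_insert_modify(const char *name, double value)
{
    struct sym *s = find(name);
    if (s == NULL) {
        s = malloc(sizeof *s);
        if (s == NULL)
            return -1;
        s->name = malloc(strlen(name) + 1);
        if (s->name == NULL) {
            free(s);
            return -1;
        }
        strcpy(s->name, name);          /* the table keeps its own copy */
        s->next = symbols;
        symbols = s;
    }
    s->value = value;
    return 0;
}

/* Returns the variable's value, or 0.0 if it was never assigned
   (an assumption; the real table may report an error instead). */
double sym_read(const char *name)
{
    struct sym *s = find(name);
    return s != NULL ? s->value : 0.0;
}
```

Because the table stores its own copy of the name, the parser action is free to release the string delivered by the lexer right after the call, as the xfree calls above do.
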
Our Language

Inserted code for bison
    %{
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #include "stduse.h"
    #include "symtab.h"
    #include "token.h"

    #define IN_PARSER
    #include "inter01.h"

    int yylex(void);
    extern int yylineno;

    FILE *fIn;
    int yyerror(char *str);
    %}

Our Language

C code
    int yyerror(char *str) {
        prError(yylineno,"%s\n",str,NULL);
        return 1;
    }

    extern FILE *yyin;

    int main(int argc, char *argv[]) {
        exitOnError();
        ...... test arguments, open file ......
        yyin = fIn;
        setFilename( argv[1] );

        if (yyparse() != 0) {
            fclose(fIn);
            prError(yylineno,"Aborted\n",NULL);
        }

        fclose(fIn);
        return 0;
    }

Building the Stuff

Command line commands:
    bison --defines -v inter01.y -o inter01.c
    flex -o inter01.lex.c inter01.l
    gcc inter01.c inter01.lex.c symtab.c stduse.c \
        -o x_inter01

Note: This is for Linux/BSD machines; on Windows, use a proper name for the executable, e.g. x_inter01.exe.

Adding String for Print

What to add?
• PRINT "A message"
• PRINT "A value of something is " expression

flex

Regular expression
    STRSTART (["])

Definition of token and action
    {STRSTART} {
        yyStr(yylval) = readStr();
        yyFlag(yylval) = fSTR;
        return STR;
    }

Note: the readStr() function reads the characters using input() until the end of the string is reached or an error occurs. It stores the string in a dynamically (re)allocated buffer. The function can be placed at the end of the file.

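The body of readStr() is not shown, so the following is only a sketch of the described idea: read characters until the closing quote and grow the buffer as needed. To keep it runnable outside a flex scanner, the character source is passed in as a function pointer; inside the scanner that role would be played by input(). The names next_char_fn, read_string and next_from_text are hypothetical, and treating an end of line inside a string as an error is an assumption.

```c
#include <stdlib.h>

/* Character source: returns the next character, or -1 at end of input. */
typedef int (*next_char_fn)(void *ctx);

/* Reads characters up to the closing quote into a growing buffer.
   Returns a malloc'ed string (the caller frees it), or NULL on error:
   end of input, end of line inside the string, or out of memory. */
char *read_string(next_char_fn next, void *ctx)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        int c = next(ctx);
        if (c == '"')
            break;                              /* end of string reached */
        if (c < 0 || c == '\n') {
            free(buf);
            return NULL;
        }
        if (len + 1 == cap) {                   /* keep room for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';
    return buf;
}

/* Example source: walks over a C string whose address is passed as ctx. */
int next_from_text(void *ctx)
{
    const char **pos = ctx;
    return **pos != '\0' ? (unsigned char)*(*pos)++ : -1;
}
```

The opening quote is assumed to have been consumed already by the {STRSTART} rule, so reading starts with the first character of the string content.
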
bison

Token
    %token <s> STR

Statement extension by
    | PRINT STR
        {
            puts($2.u.vStr);
            xfree($2.u.vStr);
        }
    | PRINT STR ternary
        {
            printf("%s%g\n",$2.u.vStr,$3.u.vFloat);
            xfree($2.u.vStr);
        }

Note: the token.h file needs a new tag fSTR.

Build the Stuff

Use the commands from the previous version; the missing C code is in the ML05 folder, just find it :-)

Adding Functions and Operators

What to add?
• Functions: sin, cos, tan
• Power operator (binary right associative): ^

flex

Keyword table extension:
    #define KWLEN 5
    char *keywords[KWLEN] = {
        "cos",
        "let",
        "print",
        "sin",
        "tan",
    };
    unsigned keycodes[KWLEN] = {
        COS,
        LET,
        PRINT,
        SIN,
        TAN,
    };

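Since keywords are recognized as identifiers first, the scanner needs a lookup that maps an identifier to a keyword code or decides it is an ordinary variable. The lookup code itself is not shown, so this is a sketch under assumptions: the alphabetical order of the table is taken to allow binary search, the comparison ignores case because the language is case insensitive, and the enum values stand in for the token codes that bison would generate. The names cmp_nocase and classify are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Stand-ins for the token codes normally generated by bison. */
enum { IDENT = 258, COS, LET, PRINT, SIN, TAN };

#define KWLEN 5
static const char *keywords[KWLEN] = { "cos", "let", "print", "sin", "tan" };
static const unsigned keycodes[KWLEN] = { COS, LET, PRINT, SIN, TAN };

/* Compares a lower-case keyword with an identifier, ignoring the
   identifier's case. Result is <0, 0 or >0 like strcmp. */
static int cmp_nocase(const char *kw, const char *id)
{
    while (*kw != '\0' && *kw == tolower((unsigned char)*id)) {
        kw++;
        id++;
    }
    return (unsigned char)*kw - tolower((unsigned char)*id);
}

/* Returns the keyword's token code, or IDENT if text is not a keyword. */
unsigned classify(const char *text)
{
    int lo = 0, hi = KWLEN - 1;
    while (lo <= hi) {                  /* binary search in the sorted table */
        int mid = (lo + hi) / 2;
        int c = cmp_nocase(keywords[mid], text);
        if (c == 0)
            return keycodes[mid];
        if (c < 0)
            lo = mid + 1;               /* keyword sorts before text */
        else
            hi = mid - 1;
    }
    return IDENT;
}
```

With this scheme, adding a keyword only means inserting it at the right alphabetical position in both tables and increasing KWLEN.
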
flex

Regular expression
    OP1 ([-+*/=<>?:()!^])

Definition of token and action
Nothing to do.

bison

Token
    %term SIN COS TAN
    %right '^'
Just one line ahead of unary '!'.

Expression extension also by
    | expr '^' expr
        {
            $$.u.vFloat =
                pow($1.u.vFloat,$3.u.vFloat);
            $$.flag = fFLOAT;
        }

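To see what left and right associativity and the priority levels mean in practice, the sketch below evaluates expressions with a hand-written precedence-climbing parser. This is a different technique from the table-driven parser that bison generates; it is shown only because the effect of the levels is easy to follow in it. All names are hypothetical, the levels mirror the declarations above ('+' '-' lowest, then '*' '/', then '^', then unary minus), and the power helper handles only non-negative integer exponents so that the sketch does not need the math library.

```c
#include <ctype.h>
#include <stdlib.h>

static const char *pos;                 /* current position in the input text */

static double parse_expr(int min_level);

/* Priority level of a binary operator; 0 means "not a binary operator". */
static int level(char op)
{
    switch (op) {
    case '+': case '-': return 1;       /* like %left '+' '-' */
    case '*': case '/': return 2;       /* like %left '*' '/' */
    case '^':           return 3;       /* like %right '^'    */
    default:            return 0;
    }
}

/* Power for non-negative integer exponents only. */
static double int_pow(double base, double exponent)
{
    double result = 1.0;
    for (int i = 0; i < (int)exponent; i++)
        result *= base;
    return result;
}

static void skip_spaces(void)
{
    while (isspace((unsigned char)*pos))
        pos++;
}

/* Number, bracketed expression, or unary minus (level 4, like UNOP). */
static double parse_primary(void)
{
    skip_spaces();
    if (*pos == '-') {
        pos++;
        return -parse_expr(4);
    }
    if (*pos == '(') {
        pos++;
        double inner = parse_expr(1);
        skip_spaces();
        if (*pos == ')')
            pos++;
        return inner;
    }
    char *end;
    double v = strtod(pos, &end);
    pos = end;
    return v;
}

static double parse_expr(int min_level)
{
    double lhs = parse_primary();
    for (;;) {
        skip_spaces();
        char op = *pos;
        int lv = level(op);
        if (lv == 0 || lv < min_level)
            return lhs;
        pos++;
        /* Left associative: the right operand must bind tighter (lv + 1).
           Right associative ('^'): it may contain the same level (lv). */
        double rhs = parse_expr(op == '^' ? lv : lv + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        default:  lhs = int_pow(lhs, rhs); break;
        }
    }
}

/* Evaluates an arithmetic expression given as text. */
double evaluate_text(const char *text)
{
    pos = text;
    return parse_expr(1);
}
```

Left associativity makes 8 - 3 - 2 mean (8 - 3) - 2, right associativity makes 2 ^ 3 ^ 2 mean 2 ^ (3 ^ 2), and because unary minus sits above '^', the text -2 ^ 2 is read as (-2) ^ 2.
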
bison

Expression extension also by
    | SIN '(' ternary ')'
        {
            $$.flag = fFLOAT;
            $$.u.vFloat = sin($3.u.vFloat);
        }
    | COS '(' ternary ')'
        {
            $$.flag = fFLOAT;
            $$.u.vFloat = cos($3.u.vFloat);
        }
    | TAN '(' ternary ')'
        {
            $$.flag = fFLOAT;
            $$.u.vFloat = tan($3.u.vFloat);
        }

Building the Stuff

Command line commands:
    bison --defines -v inter03.y -o inter03.c
    flex -o inter03.lex.c inter03.l
    gcc inter03.c inter03.lex.c symtab.c stduse.c \
        -o x_inter03 -lm

Adding Statement to Read a Number

What to add?
• READ variable
• READ "Give value for X: " variable

flex

Keyword table extension:
    #define KWLEN 6
    char *keywords[KWLEN] = {
        "cos",
        "let",
        "print",
        "read",
        "sin",
        "tan",
    };
    unsigned keycodes[KWLEN] = {
        COS,
        LET,
        PRINT,
        READ,
        SIN,
        TAN,
    };

bison

Token
    %term READ

Statement extension by
    | READ IDENT
        {
            double rval;
            scanf("%lf",&rval);
            insertModify($2.u.vStr,rval);
            xfree($2.u.vStr);
        }
    | READ STR IDENT
        {
            double rval;
            printf("%s",$2.u.vStr);
            scanf("%lf",&rval);
            insertModify($3.u.vStr,rval);
            xfree($2.u.vStr); xfree($3.u.vStr);
        }

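The actions above do not check the result of scanf, so rval stays uninitialized when the user types something that is not a number. One possible remedy, shown here only as a sketch, is to read a whole line and validate it with strtod. The function parse_number is a hypothetical helper, not part of the lecture code.

```c
#include <ctype.h>
#include <stdlib.h>

/* Parses one line of user input as a number.
   Returns 0 and stores the value in *out on success,
   -1 if the line does not consist of exactly one number. */
int parse_number(const char *line, double *out)
{
    char *end;
    double v = strtod(line, &end);
    if (end == line)
        return -1;                      /* nothing numeric at the start */
    while (isspace((unsigned char)*end))
        end++;                          /* allow the trailing newline */
    if (*end != '\0')
        return -1;                      /* trailing garbage such as "12x" */
    *out = v;
    return 0;
}
```

In the READ action the line could be obtained with fgets from standard input, and a failed parse could be reported as a run-time error before insertModify is called.
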
Build the Stuff

Use commands from previous version.

Change from Direct to Indirect Interpretation

What to add?
• Tree representation for whole program
• Evaluation of the tree-based program

What to change?
• Actions in parser to build a tree

Small Interpreter
A Type for Tree Structure
Dušan Kolář

Overview

Introduction
Statement extension by
Lexical Analysis

typedef struct ast s { Possibilities


Token Description

unsigned tag; flex/lex

Syntax Analysis
unsigned lineno; Syntax Constructs

union { yacc/bison

struct { Adding String for Print

Adding Functions
struct ast s *lft, *rgt;
Read a Number
} ptr;
Build a Tree
char *sVal; Add While Statement
double dVal; Summary
} u; References

} ast t;

Operations with the Tree

We need creation and evaluation:

• ast_t *mkSlf(unsigned tag, char *str);
  creates a leaf with a string value and the given tag.
• ast_t *mkDlf(unsigned tag, double dval);
  creates a leaf with a double value and the given tag.
• ast_t *mkNd(unsigned tag, ast_t *l, ast_t *r);
  creates an inner node with left and right sub-trees and the given tag.
• ast_t *appR(unsigned tag, ast_t *lst, ast_t *nd);
  appends a node to the right (end) of the list of nodes.
• void evaluate(ast_t *root);
  evaluates the program.
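
A minimal sketch of how the three constructors might look on top of the tree node type. The helper newNode, the use of malloc with abort on failure, and storing the string pointer without copying it are assumptions; the lecture's astree.c may handle allocation, ownership, and lineno differently.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ast_s {
    unsigned tag;
    unsigned lineno;
    union {
        struct { struct ast_s *lft, *rgt; } ptr;
        char *sVal;
        double dVal;
    } u;
} ast_t;

/* Assumed helper: allocate a node, abort if memory runs out. */
static ast_t *newNode(unsigned tag)
{
    ast_t *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->tag = tag;
    n->lineno = 0;          /* a real lexer would supply the line number */
    return n;
}

/* Leaf with a string value; the node keeps the pointer (assumption). */
ast_t *mkSlf(unsigned tag, char *str)
{
    ast_t *n;

    assert(str != NULL);
    n = newNode(tag);
    n->u.sVal = str;
    return n;
}

/* Leaf with a double value. */
ast_t *mkDlf(unsigned tag, double dval)
{
    ast_t *n = newNode(tag);
    n->u.dVal = dval;
    return n;
}

/* Inner node with left and right sub-trees (either may be NULL). */
ast_t *mkNd(unsigned tag, ast_t *l, ast_t *r)
{
    ast_t *n = newNode(tag);
    n->u.ptr.lft = l;
    n->u.ptr.rgt = r;
    return n;
}
```

Because the three payloads share one union, the tag is the only record of which member is valid, so evaluation code has to check it before touching u.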

Expression Evaluation Fragment

    static double expr(ast_t *root)
    {
      switch (tag(root)) {
        case '?':   /* left: condition, right: node holding both branches */
          if (expr(left(root)) != 0.0) {
            return expr( left(right(root)) );
          } else {
            return expr( right(right(root)) );
          }
        case '+':
          return expr(left(root))
               + expr(right(root));
        case SIN:   /* unary function: argument is the left sub-tree */
          return sin( expr(left(root)) );
        /* ... remaining cases ... */
      }
    }
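
The '?' case implies a particular tree shape: the condition hangs on the left, and the right child is a helper node whose left and right sub-trees are the two branches. A self-contained sketch of this encoding follows. The tag values NUM and PAIR, the accessor macros, and the simplified node type are assumptions (the real tags come from the bison header, and the tag of the helper node is a guess); sin is left out so that the sketch needs no math library.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM = 258, PAIR };        /* placeholder tags */

typedef struct ast_s {
    unsigned tag;
    union {
        struct { struct ast_s *lft, *rgt; } ptr;
        double dVal;
    } u;
} ast_t;

#define tag(n)   ((n)->tag)
#define left(n)  ((n)->u.ptr.lft)
#define right(n) ((n)->u.ptr.rgt)

static double expr(const ast_t *root)
{
    assert(root != NULL);
    switch (tag(root)) {
    case NUM:
        return root->u.dVal;
    case '?':   /* cond ? a : b, with a and b stored under right(root) */
        if (expr(left(root)) != 0.0)
            return expr(left(right(root)));
        return expr(right(right(root)));
    case '+':
        return expr(left(root)) + expr(right(root));
    default:
        return 0.0;   /* unknown tag: this sketch just yields 0 */
    }
}
```

Only the selected branch is evaluated, which matters once expressions can have side effects or be expensive.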

Statement Evaluation Fragment

    static void proc(ast_t *root)
    {
      switch (tag(root)) {
        case '=':     /* left: variable name, right: expression */
          insertModify( sv(left(root)),
                        expr(right(root)) );
          break;
        case PRINT:   /* left: optional string, right: optional expression */
          if (left(root) == NULL) {
            printf("%g\n", expr(right(root)) );
          } else if (right(root) == NULL) {
            puts( sv(left(root)) );
          } else {
            printf("%s%g\n", sv(left(root)),
                   expr(right(root)) );
          }
          break;
        /* ... remaining cases ... */
      }
    }
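
The PRINT case distinguishes three statement forms purely by which child is missing. The sketch below restates that decision in isolation so it can be checked: a missing child is modelled as a NULL pointer and the text goes into a buffer instead of stdout. The helper formatPrint is hypothetical and not part of the lecture sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mirroring the PRINT case of proc(). */
static void formatPrint(char *buf, size_t size,
                        const char *label, const double *value)
{
    assert(buf != NULL && size > 0);
    if (label == NULL && value != NULL)
        snprintf(buf, size, "%g\n", *value);           /* print expr       */
    else if (label != NULL && value == NULL)
        snprintf(buf, size, "%s\n", label);            /* print "str"      */
    else if (label != NULL)
        snprintf(buf, size, "%s%g\n", label, *value);  /* print "str" expr */
    else
        buf[0] = '\0';                                 /* nothing to print */
}
```

The %g conversion drops trailing zeros, so a value of 2.5 is shown as 2.5 rather than 2.500000.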

bison

Token

No need to modify.

Grammar

No need to modify.

bison

Actions should be reworked

All of them...

New data structure

    %union {
      struct sStackType {
        unsigned char flag;
        union {
          double vFloat;
          char *vStr;
          struct ast_s *ast;
        } u;
      } s;
    }

bison

Expression fragment

    | expr '-' expr
      {
        $$.flag = fAST;
        $$.u.ast = mkNd('-', $1.u.ast, $3.u.ast);
      }
    | FLOAT
      {
        $$.flag = fAST;
        $$.u.ast = mkDlf(FLOAT,$1.u.vFloat);
      }
    | SIN '(' ternary ')'
      {
        $$.flag = fAST;
        $$.u.ast = mkNd(SIN,$3.u.ast,NULL);
      }

bison

Statement fragment

    : LET IDENT '=' ternary
      {
        $$.flag = fAST;
        $$.u.ast = mkNd('=',
                        mkSlf(IDENT,$2.u.vStr), $4.u.ast);
      }
    | PRINT STR ternary
      {
        $$.flag = fAST;
        $$.u.ast = mkNd(PRINT,
                        mkSlf(STR,$2.u.vStr), $3.u.ast);
      }

Building the Stuff

Shell commands:

    bison --defines -v inter05.y -o inter05.c
    flex -o inter05.lex.c inter05.l
    gcc inter05.c inter05.lex.c symtab.c stduse.c astree.c -o x_inter05 -lm

Adding Statement for While Loop

What to add?

• WHILE ternary '\n' body END '\n'
  where body is a list/sequence of statements.

flex

Keyword table extension:

    #define KWLEN 8   /* the six earlier keywords plus "end" and "while" */
    char *keywords[KWLEN] = {
        "cos",
        "end",
        "let",
        ...
        "tan",
        "while",
    };
    unsigned keycodes[KWLEN] = {
        COS,
        END,
        LET,
        ...
        TAN,
        WHILE,
    };

bison

Token

    %term END WHILE
    %type <s> body

Grammar change: extend progelem

    | WHILE ternary '\n' body END '\n'
      {
        $$.flag = fAST;
        $$.u.ast = mkNd(WHILE,$2.u.ast,$4.u.ast);
      }

bison

Grammar change: add body (1)

    body
      : statement '\n'
        {
          $$.flag = fAST;
          $$.u.ast = appR(';', NULL, $1.u.ast);
        }
      | '\n'          /* empty line: empty body so far */
        {
          $$.flag = fAST;
          $$.u.ast = NULL;
        }

bison

Grammar change: add body (2)

      | body statement '\n'
        {
          $$.flag = fAST;
          $$.u.ast = appR(';', $1.u.ast, $2.u.ast);
        }
      | body '\n'
        {
          $$ = $1;
        }
      ;
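
The body rules rely on appR accepting NULL as an empty list and returning the list head. A minimal sketch of one way to implement that follows. The list layout (statement on the left of each ';' cell, rest of the list on the right) and the simplified pointer-only node type are assumptions; the lecture's astree.c may organise the list differently.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ast_s {
    unsigned tag;
    struct ast_s *lft, *rgt;     /* simplified node: pointers only */
} ast_t;

static ast_t *mkNd(unsigned tag, ast_t *l, ast_t *r)
{
    ast_t *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->tag = tag;
    n->lft = l;
    n->rgt = r;
    return n;
}

/* Append nd to the list lst and return the list head.
 * Assumed layout: each cell holds one statement on the left and the
 * rest of the list on the right. */
static ast_t *appR(unsigned tag, ast_t *lst, ast_t *nd)
{
    ast_t *cell = mkNd(tag, nd, NULL);
    ast_t *p;

    if (lst == NULL)
        return cell;             /* first statement starts the list */
    for (p = lst; p->rgt != NULL; p = p->rgt)
        ;                        /* walk to the last cell */
    p->rgt = cell;
    return lst;
}
```

Walking to the end makes each append linear in the list length. For the short loop bodies of this language that is acceptable; keeping a tail pointer would avoid the walk.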

astree.c

    static void proc(ast_t *root)
      ...
      case WHILE:   /* left: condition, right: body (may be NULL) */
        {
          double ctrl = expr(left(root));
          while (ctrl != 0.0) {
            if (right(root) != NULL)
              evaluate(right(root));
            ctrl = expr(left(root));   /* re-evaluate the condition */
          }
        }
        break;
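
The WHILE case evaluates the condition sub-tree again before every pass, which is possible only because the program is kept as a tree instead of being executed during parsing. A self-contained toy version follows. The tag values, the flattened node type, and the four-slot array standing in for the symbol table are assumptions; insertModify and the real ast_t are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM = 258, VAR, WHILE };          /* placeholder tags */

typedef struct ast_s {
    unsigned tag;
    struct ast_s *lft, *rgt;
    double dVal;                          /* NUM: value, VAR: slot index */
} ast_t;

static double vars[4];                    /* toy symbol table (assumption) */

static double expr(const ast_t *n)
{
    switch (n->tag) {
    case NUM: return n->dVal;
    case VAR: return vars[(int)n->dVal];
    case '+': return expr(n->lft) + expr(n->rgt);
    case '-': return expr(n->lft) - expr(n->rgt);
    default:  return 0.0;
    }
}

static void evaluate(const ast_t *list);

static void proc(const ast_t *n)
{
    switch (n->tag) {
    case '=':                             /* VAR = expression */
        vars[(int)n->lft->dVal] = expr(n->rgt);
        break;
    case WHILE:
        while (expr(n->lft) != 0.0) {     /* condition checked every pass */
            if (n->rgt != NULL)
                evaluate(n->rgt);
        }
        break;
    }
}

/* Run a ';' list: statement on the left, rest of the list on the right. */
static void evaluate(const ast_t *list)
{
    for (; list != NULL; list = list->rgt)
        proc(list->lft);
}
```

A loop whose body never changes the variables used in the condition does not terminate, exactly as in the interpreted language itself.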

Build the Stuff

Use the same commands as for the previous version.

What We’ve Learned

1 Basic structure of an interpreter
2 How to build a lexical analyzer with lex/flex
3 How to represent expressions
4 How to build a parser/syntax analyzer with bison
5 How to make a simple interpreter
6 How to make a not-so-simple interpreter
