
CSCE 4650/5650 Assignment 1

Due: 11:59 PM on Friday, February 16, 2024

Consider the following extended BNF grammar for a subset of the Java programming language, called
TinyJava.

program ::= import java.util.* ; class-definition {class-definition}


class-definition ::= class class-identifier { {member-list} }
member-list ::= member-declaration {member-declaration}
member-declaration ::= member-declarator ; | function-definition
member-declarator ::= [static] variable-declaration
function-definition ::= function-declaration {
        { variable-declaration ;}
        [statement-list]
        [return expression ;] }
function-declaration ::= public [static] type function-identifier
        ( [argument-declaration-list] )
        | main-declaration
variable-declaration ::= type object-identifier [ { [ ] } = new int [ integer ] { [ integer ] } ]
main-declaration ::= [static Scanner in = new Scanner ( System . in ) ;]
        public static void main ( String args [ ] )
type ::= class-identifier | int | boolean
argument-declaration-list ::= argument-declaration {, argument-declaration}
argument-declaration ::= type object-identifier { [ ] }
compound-statement ::= { statement-list }
statement-list ::= statement {statement}
statement ::= compound-statement
        | assignment-statement ;
        | if ( expression ) statement [else statement]
        | while ( expression ) statement
        | System . out . println ( expression ) ;
assignment-statement ::= variable = expression
        | variable = new class-identifier ()
        | variable = in . nextInt ()
expression ::= term | expression binary-operator expression
expression-list ::= expression {, expression}
term ::= primary-expression | unary-operator term
primary-expression ::= object | integer | true | false | ( expression )
object ::= variable | function-call
variable ::= this | [object .] object-identifier {[ expression ]}
function-call ::= [object .] function-identifier ( [expression-list] )

Syntactic and Semantic Conventions:
The keywords and the token symbols in TinyJava are in bold. Note that TinyJava has symbols {, }, [, and ]
which are distinguished from grammar metasymbols {, }, [, and ], respectively, by underlining.

class-identifiers, object-identifiers, and function-identifiers are the same lexical/syntactic items (i.e., all
are identifiers) but have the semantics given by the appropriate qualifier. Assume that an identifier can
only contain letters (only alphabetic characters), digits, and underscores (_) with the restrictions that it
must begin with a letter, cannot end with an underscore and cannot have two consecutive underscores.
For example, give_2_Joe, tell_me and A45Asm3 are valid identifiers, but 6gh, two__bad, and no_end_ are
not. integer is an unsigned integer.
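
The identifier rule above can be written as a single regular expression. The following is a minimal Java sketch, not a required part of the solution; the class name IdentifierCheck and the method isValidIdentifier are hypothetical names chosen for illustration. Note that keywords such as class or int also match this pattern, so a lexical analyzer would need to recognize keywords separately.

```java
import java.util.regex.Pattern;

class IdentifierCheck {
    // A letter, then any letters/digits, then zero or more groups made of
    // exactly one underscore followed by at least one letter or digit.
    // This rules out a leading digit, a trailing underscore, and "__".
    private static final Pattern IDENTIFIER =
        Pattern.compile("[A-Za-z][A-Za-z0-9]*(_[A-Za-z0-9]+)*");

    // Hypothetical helper: true if s is a valid TinyJava identifier.
    static boolean isValidIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"give_2_Joe", "tell_me", "A45Asm3",
                            "6gh", "two__bad", "no_end_"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValidIdentifier(s));
        }
    }
}
```

The same pattern can be reused as a macro in a JFlex specification.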

The unary operators are +, - and ! (negation). Binary operators obey the customary precedence rules, from
highest to lowest:
• multiplicative *, /
• binary additive +, -
• relational inequality <, >, <=, >=
• relational equality ==, !=
• conjunction &&
• disjunction ||
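
Precedence matters to the parser rather than to the lexical analyzer, but the list above can be captured as a small lookup table. This is only an illustrative sketch: the class name Precedence and the numeric levels are assumptions, and only their relative order is taken from the list.

```java
import java.util.Map;

class Precedence {
    // Higher number = binds tighter. The numbers themselves are arbitrary;
    // only their ordering follows the precedence list.
    private static final Map<String, Integer> LEVEL = Map.ofEntries(
        Map.entry("*", 6), Map.entry("/", 6),
        Map.entry("+", 5), Map.entry("-", 5),
        Map.entry("<", 4), Map.entry(">", 4),
        Map.entry("<=", 4), Map.entry(">=", 4),
        Map.entry("==", 3), Map.entry("!=", 3),
        Map.entry("&&", 2),
        Map.entry("||", 1));

    // Hypothetical helper: precedence level of a binary operator.
    static int of(String op) {
        Integer level = LEVEL.get(op);
        if (level == null) {
            throw new IllegalArgumentException("not a binary operator: " + op);
        }
        return level;
    }
}
```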

Comments are preceded by // and are terminated by the end of the line.
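
As a minimal sketch of the comment rule, the following Java helper drops everything from // to the end of a line. The names CommentStripper and stripComment are hypothetical. The approach is safe only under the assumption that // cannot appear inside a token, which holds here because the grammar above has no string or character literals.

```java
class CommentStripper {
    // Returns the line with any trailing // comment removed.
    // Assumes "//" never occurs inside a token (no string literals).
    static String stripComment(String line) {
        int start = line.indexOf("//");
        return start < 0 ? line : line.substring(0, start);
    }

    public static void main(String[] args) {
        System.out.println(stripComment("x = 1; // set x"));
    }
}
```

A generated lexical analyzer would normally handle this with a rule that matches the comment and emits no token.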

Your assignment is as follows:


1. Determine the set of tokens which a lexical analyzer would need to recognize.
2. Design and implement a lexical analyzer procedure to read a source program in the above
language and print the next token in the input stream. If the token detected is a valueless token,
such as a keyword, then it is sufficient to print only the keyword. If it has a value, then both the
token type and lexeme should be printed.
3. You will be given several TinyJava programs with which to test your lexical analyzer. These will be
located on Canvas and will be named Test1.java, Test2.java, etc.
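
To make step 2 concrete, here is a rough hand-written sketch of the token loop in Java. It is not the JFlex approach suggested in this handout and not a complete solution: the keyword set is a deliberately partial subset, and the class name TinyLexerSketch, the token type names ID and INTEGER, and the TYPE(lexeme) output format are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TinyLexerSketch {
    // Illustrative subset only; NOT the full TinyJava keyword set.
    private static final Set<String> KEYWORDS = Set.of(
        "class", "public", "static", "int", "boolean", "if", "else",
        "while", "return", "new", "this", "true", "false");

    // Alternatives are tried left to right, so two-character operators
    // are listed before their one-character prefixes.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<skip>\\s+|//[^\\n]*)"
        + "|(?<word>[A-Za-z][A-Za-z0-9]*(?:_[A-Za-z0-9]+)*)"
        + "|(?<int>[0-9]+)"
        + "|(?<sym><=|>=|==|!=|&&|\\|\\||[-+*/<>!=;,.(){}\\[\\]])");

    static List<String> tokenize(String source) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                    "unexpected character '" + source.charAt(pos)
                    + "' at offset " + pos);
            }
            if (m.group("word") != null) {
                String w = m.group("word");
                // Valueless token: print the keyword itself.
                // Valued token: print type and lexeme.
                out.add(KEYWORDS.contains(w) ? w : "ID(" + w + ")");
            } else if (m.group("int") != null) {
                out.add("INTEGER(" + m.group("int") + ")");
            } else if (m.group("sym") != null) {
                out.add(m.group("sym"));
            } // whitespace and comments produce no token
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        String src = "while (x_1 <= 10) x_1 = x_1 + 1; // loop";
        for (String t : tokenize(src)) {
            System.out.println(t);
        }
    }
}
```

This sketch splits an input such as 6gh into an integer followed by an identifier rather than reporting an error; whether that is acceptable is a design decision for your own analyzer.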

Suggestion:
Use the JFlex tool to automatically construct a lexical analyzer for TinyJava from a set of regular
expressions specifying tokens.

Submission:

• You will electronically submit all the files required to generate the lexical analyzer
to the Assignment 1 dropbox in Canvas by the due date.
• Program submissions will be checked using a code plagiarism tool against other solutions,
including those found on the Internet, so please ensure that all work submitted is your own. Any
student determined to have cheated may receive an ‘F’ in the course and will be reported for an
academic integrity violation.
• Until you are comfortable working on our Linux CSE machines, as a safety precaution, do not edit
your program (using vim or nano) after you have submitted it, since you might
accidentally re-save the program, causing the timestamp on your file to be later than the due
date. If you want to look at it (or work on it) after submitting, make a copy of your submission and
work off that copy. Should there be any issues with your submission, the timestamp on your code
on the CSE machines will be used to verify when the program was completed.
