Uploaded by Hiren Vaghela

Recursive Descent Parser

Attribution Non-Commercial (BY-NC)

21 March 2013, OSU CSE
BL Compiler Structure

[Figure: BL compiler pipeline. Recoverable stages: Parser → abstract program → Code Generator → string of integers (object code).]

- Design a context-free grammar (CFG) to specify syntactically valid BL programs
- Use the grammar to implement a recursive-descent parser (i.e., an algorithm to parse a BL program and construct the corresponding Program object)
Parsing

- A CFG can be used to generate strings in its language
  - "Given the CFG, construct a string that is in the language"
- A CFG can also be used to recognize strings in its language
  - "Given a string, decide whether it is in the language"
  - And, if it is, construct a derivation tree (or AST)

Parsing generally refers to this last step, i.e., going from a string (in the language) to its derivation tree or, for a programming language, perhaps to an AST for the program.

A Recursive-Descent Parser

- One parse method per non-terminal symbol
- A non-terminal symbol on the right-hand side of a rewrite rule leads to a call to the parse method for that non-terminal
- A terminal symbol on the right-hand side of a rewrite rule leads to consuming that token from the input token string
- | in the CFG leads to if-else in the parser
    expr → expr add-op term | term
    term → term mult-op factor | factor
    factor → ( expr ) | digit-seq
    add-op → + | -
    mult-op → * | DIV | REM
    digit-seq → digit digit-seq | digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

A Problem

Do you see a problem with a recursive-descent parser for this CFG? (Hint: look at the first symbol on the right-hand side of the rules for expr and term.)
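One way to see the problem is to translate the rule expr → expr add-op term directly into a parse method: the method calls itself before consuming any token, so the input never shrinks. The sketch below is illustrative only. It uses java.util.Deque in place of the course's Queue, and the class name, method names, and call limit are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class LeftRecursionDemo {
    // Hypothetical guard so the demo stops instead of overflowing the stack.
    static final int CALL_LIMIT = 1000;
    static int calls = 0;

    // Naive translation of: expr -> expr add-op term | term
    static int naiveExpr(Deque<String> ts) {
        calls++;
        if (calls > CALL_LIMIT) {
            throw new IllegalStateException(
                    "no token consumed after " + CALL_LIMIT + " calls");
        }
        // The first symbol on the right-hand side is expr itself, so the
        // method recurses immediately; the lines below are never reached.
        int value = naiveExpr(ts);
        String op = ts.removeFirst();
        int term = Integer.parseInt(ts.removeFirst());
        return op.equals("+") ? value + term : value - term;
    }

    // Returns true if the naive parser gave up without consuming any token.
    static boolean getsStuck(String... tokens) {
        Deque<String> ts = new ArrayDeque<>(Arrays.asList(tokens));
        calls = 0;
        try {
            naiveExpr(ts);
            return false;
        } catch (IllegalStateException e) {
            return ts.size() == tokens.length;
        }
    }

    public static void main(String[] args) {
        System.out.println(getsStuck("4", "+", "2")); // prints true
    }
}
```

Without the guard, the same method would recurse until the JVM throws a StackOverflowError, whatever the input is.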

A Solution

    expr → term { add-op term }
    term → factor { mult-op factor }
    factor → ( expr ) | digit-seq
    add-op → + | -
    mult-op → * | DIV | REM
    digit-seq → digit digit-seq | digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

The special CFG symbols { and } mean that the enclosed sequence of symbols occurs zero or more times; this helps change a left-recursive CFG into an equivalent CFG that can be parsed by recursive descent.

A Solution

    expr → term { add-op term }
    term → factor { mult-op factor }
    factor → ( expr ) | number
    add-op → + | -
    mult-op → * | DIV | REM
    number → 0 | nz-digit { 0 | nz-digit }
    nz-digit → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

The special CFG symbols { and } also simplify a non-terminal for a number that has no leading zeroes.
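As a small illustration of how the rule number → 0 | nz-digit { 0 | nz-digit } reads as code, the following sketch checks whether a string of characters is a number with no leading zeroes. The | becomes an if-else and the { ... } becomes a while loop. The class and method names are hypothetical and not part of the course code.

```java
class NumberRule {
    // nz-digit -> 1 | 2 | ... | 9
    static boolean isNzDigit(char c) {
        return c >= '1' && c <= '9';
    }

    // number -> 0 | nz-digit { 0 | nz-digit }
    static boolean isNumber(String s) {
        if (s.isEmpty()) {
            return false;
        }
        if (s.charAt(0) == '0') {
            // First alternative: the single digit 0 and nothing else.
            return s.length() == 1;
        }
        if (!isNzDigit(s.charAt(0))) {
            return false;
        }
        // Second alternative: after the leading nz-digit,
        // { 0 | nz-digit } means zero or more further digits.
        int i = 1;
        while (i < s.length()
                && (s.charAt(i) == '0' || isNzDigit(s.charAt(i)))) {
            i++;
        }
        return i == s.length();
    }

    public static void main(String[] args) {
        System.out.println(isNumber("29"));  // prints true
        System.out.println(isNumber("007")); // prints false
    }
}
```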

A Recursive-Descent Parser

- One parse method per non-terminal symbol
- A non-terminal symbol on the right-hand side of a rewrite rule leads to a call to the parse method for that non-terminal
- A terminal symbol on the right-hand side of a rewrite rule leads to consuming that token from the input token string
- | in the CFG leads to if-else in the parser
- {...} in the CFG leads to while in the parser

More Improvements

If we treat every number as a token, then things get simpler for the parser: now there are only 5 non-terminals to worry about (expr, term, factor, add-op, and mult-op).

If we treat every add-op and mult-op as a token, then it's even simpler: there are only 3 non-terminals (expr, term, and factor).

Can you write the tokenizer for this language, so every number, add-op, and mult-op is a token?

    expr → term { add-op term }
    term → factor { mult-op factor }
    factor → ( expr ) | number
    add-op → + | -
    mult-op → * | DIV | REM
    number → 0 | nz-digit { 0 | nz-digit }
    nz-digit → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
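One possible answer to the tokenizer question is sketched below. It is a minimal version under stated assumptions, not the course's tokenizer: the class and method names are hypothetical, tokens are returned in a java.util.List, whitespace is skipped, and a maximal run of digits becomes one number token without checking the no-leading-zeroes rule.

```java
import java.util.ArrayList;
import java.util.List;

class ExprTokenizer {
    // Splits an arithmetic expression into number, add-op, mult-op,
    // and parenthesis tokens.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // separators are not tokens
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < input.length()
                        && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add(input.substring(start, i)); // number token
            } else if (input.startsWith("DIV", i) || input.startsWith("REM", i)) {
                tokens.add(input.substring(i, i + 3)); // mult-op keyword
                i += 3;
            } else if ("+-*()".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c)); // single-character token
                i++;
            } else {
                throw new IllegalArgumentException(
                        "unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("4 + 29 DIV 3")); // prints [4, +, 29, DIV, 3]
    }
}
```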

- For this problem, parsing an arithmetic expression means evaluating it
- The parser goes from a string of tokens in the language of the CFG on the previous slide, to the value of that expression as an int

Structure of Solution

    "4 + 29 DIV 3"                    string of characters (arithmetic expression)
        → Tokenizer →
    <"4", "+", "29", "DIV", "3">      string of tokens
        → Parser →
    13                                value of arithmetic expression

Parsing an expr

- We want to parse an expr, which must start with a term and must be followed by zero or more (pairs of) add-ops and terms:

    expr → term { add-op term }

- An expr has an int value, which is what we want returned by the method to parse an expr

    /**
     * Evaluates an expression and returns its value.
     * ...
     * @updates ts
     * @requires
     *   [an expr string is a proper prefix of ts]
     * @ensures
     *   valueOfExpr = [value of longest expr string at start of #ts] and
     *   #ts = [longest expr string at start of #ts] * ts
     */
    private static int valueOfExpr(Queue<String> ts) {...}

Parsing a term

- We want to parse a term, which must start with a factor and must be followed by zero or more (pairs of) mult-ops and factors:

    term → factor { mult-op factor }

- A term has an int value, which is what we want returned by the method to parse a term

    /**
     * Evaluates a term and returns its value.
     * ...
     * @updates ts
     * @requires
     *   [a term string is a proper prefix of ts]
     * @ensures
     *   valueOfTerm = [value of longest term string at start of #ts] and
     *   #ts = [longest term string at start of #ts] * ts
     */
    private static int valueOfTerm(Queue<String> ts) {...}

Parsing a factor

- We want to parse a factor, which must either start with the token "(" followed by an expr followed by the token ")", or be a number token:

    factor → ( expr ) | number

- A factor has an int value, which is what we want returned by the method to parse a factor

    /**
     * Evaluates a factor and returns its value.
     * ...
     * @updates ts
     * @requires
     *   [a factor string is a proper prefix of ts]
     * @ensures
     *   valueOfFactor = [value of longest factor string at start of #ts] and
     *   #ts = [longest factor string at start of #ts] * ts
     */
    private static int valueOfFactor(Queue<String> ts) {...}

Code for Parser for expr

    private static int valueOfExpr(Queue<String> ts) {
        int value = valueOfTerm(ts);
        while (ts.front().equals("+") || ts.front().equals("-")) {
            String op = ts.dequeue();
            if (op.equals("+")) {
                value = value + valueOfTerm(ts);
            } else /* "-" */ {
                value = value - valueOfTerm(ts);
            }
        }
        return value;
    }

- ts.front() in the loop condition: look ahead one token in ts to see what's next.
- ts.dequeue(): consume the next token from ts.
- value = value + valueOfTerm(ts) and value = value - valueOfTerm(ts): evaluate (some of) the expression.
- The parser method for term, valueOfTerm, is very similar to valueOfExpr.

private static int valueOfTerm(Queue<String> ts) {

}
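The body of valueOfTerm is left blank above. One way it could be filled in, following the same pattern as valueOfExpr, is sketched here inside a small self-contained evaluator. Several things are assumptions rather than the course's code: java.util.Deque (peek/remove) stands in for the course's Queue (front/dequeue), DIV and REM are mapped to Java's / and % on int, the class name and the evaluate helper are hypothetical, and the null-safe comparisons let the sketch run without a token after the expression.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class ExprEvaluator {
    // expr -> term { add-op term }
    static int valueOfExpr(Deque<String> ts) {
        int value = valueOfTerm(ts);
        while ("+".equals(ts.peek()) || "-".equals(ts.peek())) {
            String op = ts.remove();
            int next = valueOfTerm(ts);
            value = op.equals("+") ? value + next : value - next;
        }
        return value;
    }

    // term -> factor { mult-op factor }
    static int valueOfTerm(Deque<String> ts) {
        int value = valueOfFactor(ts);
        while ("*".equals(ts.peek()) || "DIV".equals(ts.peek())
                || "REM".equals(ts.peek())) {
            String op = ts.remove();
            int next = valueOfFactor(ts);
            if (op.equals("*")) {
                value = value * next;
            } else if (op.equals("DIV")) {
                value = value / next; // assumption: DIV is int division
            } else /* "REM" */ {
                value = value % next; // assumption: REM is int remainder
            }
        }
        return value;
    }

    // factor -> ( expr ) | number
    static int valueOfFactor(Deque<String> ts) {
        int value;
        if ("(".equals(ts.peek())) {
            ts.remove(); // consume "("
            value = valueOfExpr(ts);
            ts.remove(); // consume ")"
        } else {
            value = Integer.parseInt(ts.remove());
        }
        return value;
    }

    // Hypothetical helper: evaluates a complete token sequence.
    static int evaluate(String... tokens) {
        return valueOfExpr(new ArrayDeque<>(Arrays.asList(tokens)));
    }

    public static void main(String[] args) {
        System.out.println(evaluate("4", "+", "29", "DIV", "3")); // prints 13
    }
}
```

Because each loop folds the next operand into value as soon as it is parsed, operators at the same level associate to the left: 10 - 4 - 3 evaluates to 3, not 9.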


Code for Parser for factor

    private static int valueOfFactor(Queue<String> ts) {
        int value;
        if (ts.front().equals("(")) {
            ts.dequeue();
            value = valueOfExpr(ts);
            ts.dequeue();
        } else {
            String number = ts.dequeue();
            value = Integer.parseInt(number);
        }
        return value;
    }

- ts.front(): look ahead one token in ts to see what's next.
- A ts.dequeue() whose result is discarded: what token does this throw away?
- Though the method is called parseInt, it is not one of our parser methods; it is a static method from the Java library's Integer class (with int utilities).
- Recursive descent: notice that valueOfExpr calls valueOfTerm, which calls valueOfFactor, which here may call valueOfExpr.
- How do you know this (indirect) recursion terminates?


Observations

- This is so formulaic that tools are available that can generate RDPs from CFGs
- In the lab, you will write an RDP for a language similar to the one illustrated here
  - The CFG will be a bit different
  - There will be no tokenizer, so you will parse a string of characters in a Java StringBuilder
    - See methods charAt and deleteCharAt
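To give a feel for parsing characters directly, here is a minimal sketch that uses StringBuilder's charAt and deleteCharAt in the same look-ahead and consume roles that front and dequeue play above. The grammar is a hypothetical simplification chosen for this example (expr → number { add-op number }, number → digit { digit }, no whitespace); it is not the lab's CFG, and the class and method names are made up.

```java
class CharExprDemo {
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // expr -> number { add-op number }   (hypothetical grammar)
    static int valueOfExpr(StringBuilder source) {
        int value = valueOfNumber(source);
        while (source.length() > 0
                && (source.charAt(0) == '+' || source.charAt(0) == '-')) {
            char op = source.charAt(0); // look ahead one character
            source.deleteCharAt(0);     // consume it
            int next = valueOfNumber(source);
            value = (op == '+') ? value + next : value - next;
        }
        return value;
    }

    // number -> digit { digit }   (hypothetical grammar)
    static int valueOfNumber(StringBuilder source) {
        if (source.length() == 0 || !isDigit(source.charAt(0))) {
            throw new IllegalArgumentException("digit expected");
        }
        int value = 0;
        while (source.length() > 0 && isDigit(source.charAt(0))) {
            value = 10 * value + (source.charAt(0) - '0');
            source.deleteCharAt(0);
        }
        return value;
    }

    public static void main(String[] args) {
        StringBuilder source = new StringBuilder("12+30-2!");
        System.out.println(valueOfExpr(source)); // prints 40
        System.out.println(source);              // prints !
    }
}
```

As in the token-based contracts, the method consumes the longest expression at the start of the input and leaves the rest (here the "!") in the StringBuilder.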

Resources

- Wikipedia: Recursive Descent Parser
  - http://en.wikipedia.org/wiki/Recursive_descent_parser
- http://docs.oracle.com/javase/7/docs/api/
