
1. Recursive Descent

Recursive descent parsing is a top-down parsing technique that constructs the parse tree from
the top while reading the input from left to right. It uses a procedure for every terminal and
nonterminal entity. It recursively parses the input to build the parse tree, which may or may not
require backtracking. A grammar that has not been left-factored, however, generally cannot avoid
backtracking; a recursive descent parser that needs no backtracking is called a predictive parser.

Handwritten recursive descent parsers are most often used when the language to be parsed is relatively
simple, or when a parser-generator tool is not available. There are exceptions, however. In particular,
recursive descent appears in recent versions of the GNU Compiler Collection (gcc). Earlier versions used
bison to create a bottom-up parser automatically. The change was made in part for performance reasons
and in part to enable the generation of higher quality syntax error messages.

How recursive descent parsing works:


Start at the top of the tree and predict the needed productions on the basis of the current leftmost nonterminal in
the tree and the current input token. This process can be formalized in one of two ways. The first, described
in the remainder of this subsection, is to build a recursive descent parser whose subroutines correspond,
one-to-one, to the nonterminals of the grammar. Recursive descent parsers are typically constructed by hand,
though the ANTLR parser generator constructs them automatically from an input grammar.

Let’s take an example for demonstration:


The parser begins by calling the subroutine program. After noting that the initial token is a read, program
calls stmt_list and then attempts to match the end-of-file pseudotoken. (In the parse tree, the root, program,
has two children, stmt_list and $$.) Procedure stmt_list again notes that the upcoming token is a read. This
observation allows it to determine that the current node (stmt_list) generates stmt stmt_list (rather than ε). It
therefore calls stmt and stmt_list before returning. Continuing in this fashion, the execution path of the parser
traces out a left-to-right depth-first traversal of the parse tree. This correspondence between the dynamic
execution trace and the structure of the parse tree is the distinguishing characteristic of recursive descent
parsing. Note that because the stmt_list nonterminal appears on the right-hand side of a stmt_list production,
the stmt_list subroutine must call itself. This recursion accounts for the name of the parsing technique.

The parser has a subroutine for every nonterminal in the grammar. It also has a mechanism, input_token, to inspect the
next token available from the scanner and a subroutine, match, to consume and update this token, and in
the process verify that it is the one that was expected (as specified by an argument). If match or any of the
other subroutines sees an unexpected token, then a syntax error has occurred. For the time being let us
assume that the parse_error subroutine simply prints a message and terminates the parse.
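
The structure described above can be sketched in Python. This is only an illustration, not the parser from the original example: the grammar is not reproduced in these notes, so the sketch assumes a simplified one (program → stmt_list $$, stmt_list → stmt stmt_list | ε, stmt → read id | write id). The class name Parser, the ParseError exception, and the trace list are hypothetical, and parse_error raises an exception instead of printing and terminating.

```python
class ParseError(Exception):
    """Raised by parse_error; stands in for 'print a message and terminate'."""


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)  # token stream from a hypothetical scanner
        self.pos = 0
        self.trace = []             # order in which nonterminal routines are entered

    @property
    def input_token(self):
        # Peek at the next token; "$$" is the end-of-file pseudotoken.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "$$"

    def parse_error(self):
        raise ParseError(f"unexpected token {self.input_token!r} at position {self.pos}")

    def match(self, expected):
        # Consume the next token after verifying it is the expected one.
        if self.input_token == expected:
            self.pos += 1
        else:
            self.parse_error()

    def program(self):
        # program -> stmt_list $$
        self.trace.append("program")
        if self.input_token in ("read", "write", "$$"):
            self.stmt_list()
            self.match("$$")
        else:
            self.parse_error()

    def stmt_list(self):
        self.trace.append("stmt_list")
        if self.input_token in ("read", "write"):
            # stmt_list -> stmt stmt_list (the routine calls itself)
            self.stmt()
            self.stmt_list()
        elif self.input_token == "$$":
            pass  # stmt_list -> epsilon
        else:
            self.parse_error()

    def stmt(self):
        self.trace.append("stmt")
        if self.input_token == "read":
            # stmt -> read id
            self.match("read")
            self.match("id")
        elif self.input_token == "write":
            # stmt -> write id
            self.match("write")
            self.match("id")
        else:
            self.parse_error()


parser = Parser(["read", "id", "write", "id"])
parser.program()
print(parser.trace)
```

The recorded trace lists the routines in the order they are entered, which is the left-to-right depth-first traversal of the parse tree mentioned above.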

!!!!What is top down parsing!!!!


Top-down parsers construct a parse tree from the root down, predicting at each step which production will be used to
expand the current node, based on the next available token of input.
For example:
id_list −→ id id_list_tail
id_list_tail −→ , id id_list_tail
id_list_tail −→ ;

The top-down parser starts at the root of the tree (id_list) and expands it to id id_list_tail. It then matches the id against a token obtained
from the scanner. The parser moves down into the first nonterminal child and predicts that id_list_tail will expand to , id id_list_tail. To
make this prediction, it needs to peek at the upcoming token, in this case a comma, which allows it to choose between the two possible
expansions for id_list_tail. It then matches the comma and the id, and moves down into the next id_list_tail.
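
A minimal Python sketch of this prediction step, assuming the scanner has already produced a list of token strings ("id", ",", ";"). The function name parse_id_list and the predicted list are hypothetical names introduced for illustration; the returned list records which production was chosen at each step.

```python
def parse_id_list(tokens):
    """Return the productions predicted, in order, or raise SyntaxError."""
    pos = 0
    predicted = []

    def peek():
        # Look at the upcoming token without consuming it.
        return tokens[pos] if pos < len(tokens) else None

    def match(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {peek()!r}")
        pos += 1

    def id_list():
        predicted.append("id_list -> id id_list_tail")
        match("id")
        id_list_tail()

    def id_list_tail():
        # The upcoming token alone decides between the two expansions.
        if peek() == ",":
            predicted.append("id_list_tail -> , id id_list_tail")
            match(",")
            match("id")
            id_list_tail()
        elif peek() == ";":
            predicted.append("id_list_tail -> ;")
            match(";")
        else:
            raise SyntaxError(f"unexpected token {peek()!r}")

    id_list()
    if pos != len(tokens):
        raise SyntaxError("trailing input after id_list")
    return predicted


for production in parse_id_list(["id", ",", "id", ";"]):
    print(production)
```

Because one token of lookahead is always enough to pick a production here, this parser never has to undo a choice.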

!!!!What is backtracking!!!!

To make this easier to explain, consider the following example:
S → rXd | rZd
X → oa | ea
Z → ai

Input string: read


The parse tree starts with S (the root), and the parser matches its yield against the leftmost letter of the input, in this case ‘r’. The first production
of S (S → rXd) matches it, so the parser advances to the next input letter, ‘e’. It tries to expand the nonterminal X and checks its first production
(X → oa), but that does not match the input symbol. To recover, the parser backtracks and tries the next rule of X (X → ea). This
process continues until the parser has matched every symbol in the input string.
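
The backtracking walk above can be sketched in Python for this grammar. This is one possible formulation, assuming each input letter is a token; it uses generators so that abandoning an alternative and trying the next one happens automatically. The names GRAMMAR, parse, parse_sequence, and accepts are hypothetical, and the log records each production attempted. A sketch like this would loop forever on a left-recursive grammar.

```python
GRAMMAR = {
    "S": [["r", "X", "d"], ["r", "Z", "d"]],
    "X": [["o", "a"], ["e", "a"]],
    "Z": [["a", "i"]],
}


def parse(symbol, text, pos, log):
    """Yield every input position reachable after deriving symbol from text[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal: succeeds only if it equals the current input letter.
        if pos < len(text) and text[pos] == symbol:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        # Each alternative is tried in order; moving on to the next one
        # after a failure is the backtracking step.
        log.append(f"try {symbol} -> {''.join(rhs)} at {pos}")
        yield from parse_sequence(rhs, text, pos, log)


def parse_sequence(symbols, text, pos, log):
    """Yield every position reachable after matching all symbols in order."""
    if not symbols:
        yield pos
        return
    for mid in parse(symbols[0], text, pos, log):
        yield from parse_sequence(symbols[1:], text, mid, log)


def accepts(text):
    """Return (accepted, log) for the input string."""
    log = []
    ok = any(end == len(text) for end in parse("S", text, 0, log))
    return ok, log


ok, log = accepts("read")
print(ok)
for line in log:
    print(line)
```

For the input read, the log shows X → oa being tried and abandoned before X → ea succeeds, matching the description above.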

2. Table-Driven Top-Down Parsing

In a recursive descent parser, each arm of a case statement corresponds to a production and contains the parsing
routine and match calls corresponding to the symbols on the right-hand side of that production. At any given point
in the parse, if we consider the calls beyond the program counter (the ones that have yet to occur) in the
parsing routine invocations currently on the call stack, we obtain a list of the symbols that the parser expects
to see between here and the end of the program. A table-driven top-down parser maintains an explicit stack
containing this same list of symbols.
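
As a rough illustration of the explicit stack, here is a small Python sketch for the id_list grammar from the top-down parsing example. The predict table is written out by hand for that grammar, and the names TABLE, NONTERMINALS, parse, and snapshots are assumptions made for this sketch. Each snapshot records the stack contents before a step, next expected symbol first.

```python
# Hand-written predict table: (nonterminal, upcoming token) -> right-hand side.
TABLE = {
    ("id_list", "id"): ["id", "id_list_tail"],
    ("id_list_tail", ","): [",", "id", "id_list_tail"],
    ("id_list_tail", ";"): [";"],
}
NONTERMINALS = {"id_list", "id_list_tail"}


def parse(tokens, start="id_list"):
    """Parse tokens with an explicit stack; return the stack snapshots."""
    stack = [start]  # the top of the stack is the end of the Python list
    pos = 0
    snapshots = []
    while stack:
        snapshots.append(list(reversed(stack)))  # symbols still expected
        top = stack.pop()
        token = tokens[pos] if pos < len(tokens) else "$$"
        if top in NONTERMINALS:
            rhs = TABLE.get((top, token))
            if rhs is None:
                raise SyntaxError(f"no production for {top} on {token!r}")
            # Push the right-hand side so its leftmost symbol ends up on top.
            stack.extend(reversed(rhs))
        elif top == token:
            pos += 1  # terminal on top matches the input: consume it
        else:
            raise SyntaxError(f"expected {top!r}, found {token!r}")
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return snapshots


for snapshot in parse(["id", ",", "id", ";"]):
    print(snapshot)
```

Replacing the recursive calls with a table lookup and a push is the only structural change from the recursive descent version; the symbols on the stack are the same ones the pending calls would have matched.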
