Published by: Deep Raj Jangid on Nov 27, 2011
Copyright:Attribution Non-commercial







LR Parsing
Bell Laboratories, Murray Hill, New Jersey 07974
The LR syntax analysis method is a useful and versatile technique for parsing deterministic context-free languages in compiling applications. This paper provides an informal exposition of LR parsing techniques emphasizing the mechanical generation of efficient LR parsers for context-free grammars. Particular attention is given to extending the parser generation techniques to apply to ambiguous grammars.
Keywords and phrases: grammars, parsers, compilers, ambiguous grammars, context-free languages, LR grammars.
CR categories: 4.12, 5.23
A complete specification of a programming language must perform at least two functions. First, it must specify the syntax of the language; that is, which strings of symbols are to be deemed well-formed programs. Second, it must specify the semantics of the language; that is, what meaning or intent should be attributed to each syntactically correct program.
A compiler for a programming language must verify that its input obeys the syntactic conventions of the language specification. It must also translate its input into an object language program in a manner that is consistent with the semantic specification of the language. In addition, if the input contains syntactic errors, the compiler should announce their presence and try to pinpoint their location. To help perform these functions every compiler has a device within it called a parser.
A context-free grammar can be used to help specify the syntax of a programming language. In addition, if the grammar is designed carefully, much of the semantics of the language can be related to the rules of the grammar.
There are many different types of parsers for context-free grammars. In this paper we shall restrict ourselves to a class of parsers known as LR parsers. These parsers are efficient and well suited for use in compilers for programming languages. Perhaps more important is the fact that we can automatically generate LR parsers for a large and useful class of context-free grammars. The purpose of this article is to show how LR parsers can be generated from certain context-free grammars, even some ambiguous ones. An important feature of the parser generation algorithm is the automatic detection of ambiguities and difficult-to-parse constructs in the language specification.
We begin this paper by showing how a context-free grammar defines a language. We then discuss LR parsing and outline the parser generation algorithm. We conclude by showing how the performance of LR parsers can be improved by a few simple transformations, and how error recovery and "semantic actions" can be incorporated into the LR parsing framework.
For the purposes of this paper, a sentence is a string of terminal symbols. Sentences are written surrounded by a pair of single quotes. For example, 'a', 'ab', and ',' are sentences. The empty sentence is written ''. Two sentences written contiguously are to be concatenated, thus 'a' 'b' is synonymous with 'ab'.
Computing Surveys, Vol 6, No 2, June 1974
A. V. Aho and S. C. Johnson
In this paper the term language merely means a set of sentences.
1 Introduction
2 Grammars
3 Derivation Trees
4 Parsers
5 Representing the Parsing Action and Goto Tables
6 Construction of a Parser from a Grammar
    6.1 Sets of Items
    6.2 Constructing the Collection of Accessible Sets of Items
    6.3 Constructing the Parsing Action and Goto Tables from the Collection of Sets of Items
    6.4 Computing Lookahead Sets
7 Parsing Ambiguous Grammars
8 Optimization of LR Parsers
    8.1 Merging Identical States
    8.2 Subsuming States
    8.3 Elimination of Reductions by Single Productions
9 Error Recovery
10 Output
11 Concluding Remarks
References
Copyright © 1974, Association for Computing Machinery, Inc. General permission to republish, but not for profit, all or part of this material is granted, provided that ACM's copyright notice is given and that reference is made to this publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery.
A grammar is used to define a language and to impose a structure on each sentence in the language. We shall be exclusively concerned with context-free grammars, sometimes called BNF (for Backus-Naur form) specifications.
In a context-free grammar, we specify two disjoint sets of symbols to help define a language. One is a set of nonterminal symbols. We shall represent a nonterminal symbol by a string of one or more capital roman letters. For example, LIST represents a nonterminal as does the letter A. In the grammar, one nonterminal is distinguished as a start symbol.
The second set of symbols used in a context-free grammar is the set of terminal symbols. The sentences of the language generated by a grammar will contain only terminal symbols. We shall refer to a terminal or nonterminal symbol as a grammar symbol.
A context-free grammar itself consists of a finite set of rules called productions. A production has the form
    left-side → right-side,
where left-side is a single nonterminal symbol (sometimes called a syntactic category) and right-side is a string of zero or more grammar symbols. The arrow is simply a special symbol that separates the left and right sides. For example,
    LIST → LIST ',' ELEMENT
is a production in which LIST and ELEMENT are nonterminal symbols, and the quoted comma represents a terminal symbol.
A grammar is a rewriting system. If $\alpha A \gamma$ is a string of grammar symbols and $A \to \beta$ is a production, then we write
    $\alpha A \gamma \Rightarrow \alpha \beta \gamma$
and say that $\alpha A \gamma$ directly derives $\alpha \beta \gamma$. A sequence of strings
    $s_0, s_1, \ldots, s_n$
such that $s_{i-1} \Rightarrow s_i$ for $1 \le i \le n$ is said to be a derivation of $s_n$ from $s_0$. We sometimes also say $s_n$ is derivable from $s_0$.
The start symbol of a grammar is called a sentential form. A string derivable from the start symbol is also a sentential form of the grammar. A sentential form containing only terminal symbols is said to be a sentence generated by the grammar. The language generated by a grammar $G$, often denoted by $L(G)$, is the set of sentences generated by $G$.
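The "directly derives" relation can be made concrete with a short sketch. The names `Production`, `directly_derives`, and `LIST_RULE` are illustrative assumptions, not notation from the paper, and terminals are stored without their surrounding quotes for brevity.

```python
from typing import List, Tuple

# A production is (left-side nonterminal, right-side tuple of grammar symbols).
Production = Tuple[str, Tuple[str, ...]]


def directly_derives(form: Tuple[str, ...], prod: Production) -> List[Tuple[str, ...]]:
    """Return every string obtainable from `form` by rewriting one
    occurrence of the production's left side with its right side."""
    left, right = prod
    results = []
    for i, symbol in enumerate(form):
        if symbol == left:
            # form = alpha + (left,) + gamma  becomes  alpha + right + gamma
            results.append(form[:i] + right + form[i + 1:])
    return results


# The production LIST -> LIST ',' ELEMENT used as the example in the text.
LIST_RULE: Production = ("LIST", ("LIST", ",", "ELEMENT"))

step1 = directly_derives(("LIST",), LIST_RULE)
step2 = directly_derives(step1[0], LIST_RULE)
print(step1)
print(step2)
```

Chaining calls in this way yields a derivation in the sense defined above: each string in the sequence directly derives the next.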
Example 2.1: The following grammar, hereafter called $G_1$, has LIST as its start symbol:
    LIST → LIST ',' ELEMENT
    LIST → ELEMENT
    ELEMENT → 'a'
    ELEMENT → 'b'
The sequence:
    LIST ⇒ LIST ',' ELEMENT
         ⇒ LIST ',a'
         ⇒ LIST ',' ELEMENT ',a'
         ⇒ LIST ',b,a'
         ⇒ ELEMENT ',b,a'
         ⇒ 'a,b,a'
is a derivation of the sentence 'a,b,a'. $L(G_1)$ consists of nonempty strings of a's and b's, separated by commas.
Note that in the derivation in Example 2.1, the rightmost nonterminal in each sentential form is rewritten to obtain the following sentential form. Such a derivation is said to be a rightmost derivation and each sentential form in such a derivation is called a right sentential form. For example,
    LIST ',b,a'
is a right sentential form of $G_1$.
If $\alpha A w$ is a right sentential form in which $w$ is a string of terminal symbols, and $\alpha A w \Rightarrow \alpha \beta w$, then $\beta$ is said to be a handle of $\alpha \beta w$.* For example, 'b' is the handle of the right sentential form
    LIST ',b,a'
in Example 2.1.
* Some authors use a more restricting definition of handle.
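As a rough illustration of rightmost derivations and handles, the sketch below replays a derivation of 'a,b,a' in a grammar with the four productions of Example 2.1. It assumes the fourth production is ELEMENT → 'b'. The names `G1`, `NONTERMINALS`, and `rightmost_derivation` are hypothetical, and terminals are written without quotes. At each step the right side just introduced is recorded as the handle of the resulting right sentential form.

```python
G1 = [
    ("LIST", ["LIST", ",", "ELEMENT"]),  # production 1
    ("LIST", ["ELEMENT"]),               # production 2
    ("ELEMENT", ["a"]),                  # production 3
    ("ELEMENT", ["b"]),                  # production 4 (assumed)
]
NONTERMINALS = {"LIST", "ELEMENT"}


def rightmost_derivation(start, choices):
    """Rewrite the rightmost nonterminal at each step, using the given
    1-based production numbers; return (sentential form, handle) pairs."""
    form = [start]
    steps = [(list(form), None)]  # the start symbol has no handle yet
    for number in choices:
        left, right = G1[number - 1]
        # Index of the rightmost nonterminal in the current form.
        pos = max(i for i, s in enumerate(form) if s in NONTERMINALS)
        if form[pos] != left:
            raise ValueError(f"production {number} does not rewrite {form[pos]}")
        form = form[:pos] + right + form[pos + 1:]
        steps.append((list(form), list(right)))
    return steps


steps = rightmost_derivation("LIST", [1, 3, 1, 4, 2, 3])
for form, handle in steps:
    print(" ".join(form), "| handle:", handle)
```

Reading the recorded steps from last to first gives the sequence of reductions a bottom-up parser would have to perform on the same input.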
A prefix of $\alpha \beta$ in the right sentential form $\alpha \beta w$ is said to be a viable prefix of the grammar. For example,
    LIST ','
is a viable prefix of $G_1$, since it is a prefix of the right sentential form,
    LIST ',' ELEMENT
(Both $\alpha$ and $w$ are null here.)
Restating this definition, a viable prefix of a grammar is any prefix of a right sentential form that does not extend past the right end of a handle in that right sentential form. Thus we know that there is always some string of grammar symbols that can be appended to the end of a viable prefix to obtain a right sentential form. Viable prefixes are important in the construction of compilers with good error-detecting capabilities; as long as the portion of the input we have seen can be derived from a viable prefix, we can be sure that there are no errors that can be detected having scanned only that part of the input.
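The restated definition translates directly into a small sketch: given a right sentential form and the position just past the right end of its handle, the viable prefixes it contributes are the prefixes that stop at or before that position. The function name `viable_prefixes` and the index convention are assumptions made for illustration, and terminals are written without quotes.

```python
def viable_prefixes(form, handle_end):
    """All prefixes of `form` that do not extend past `handle_end`,
    the index just after the right end of the handle."""
    return [tuple(form[:k]) for k in range(handle_end + 1)]


# Right sentential form LIST ',' ELEMENT: the handle is the whole form
# (alpha and w are both empty), so the handle ends at index 3.
whole = viable_prefixes(["LIST", ",", "ELEMENT"], 3)

# Right sentential form LIST ',' b ',' a: the handle is b, ending at index 3,
# so no prefix may include the second comma.
partial = viable_prefixes(["LIST", ",", "b", ",", "a"], 3)

print(whole)
print(partial)
```

Both calls produce the prefix LIST ',' discussed in the text, while prefixes reaching beyond the handle are excluded.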
Frequently, our interest in a grammar is not only in the language it generates, but also in the structure it imposes on the sentences of the language. This is the case because grammatical analysis is closely connected with other processes, such as compilation and translation, and the translations or actions of the other processes are frequently defined in terms of the productions of the grammar. With this in mind, we turn our attention to the representation of a derivation by its derivation tree.
For each derivation in a grammar we can construct a corresponding derivation tree. Let us consider the derivation in Example 2.1. To model the first step of the derivation, in which LIST is rewritten as
    LIST ',' ELEMENT
using production 1, we first create a root labeled by the start symbol LIST, and then create three direct descendants of the root, labeled LIST, ',', and ELEMENT:
              LIST
            /  |   \
        LIST  ','  ELEMENT
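The first tree-building step just described can be sketched as follows. The `Node` class and the helpers `expand` and `render` are illustrative assumptions rather than constructs from the paper, and the comma terminal is stored without quotes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def expand(node: Node, right_side: List[str]) -> List[Node]:
    """Model one derivation step: give `node` one direct descendant per
    symbol on the right side of the production applied to it."""
    node.children = [Node(symbol) for symbol in right_side]
    return node.children


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one label per line."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# First step of the derivation: LIST is rewritten using production 1.
root = Node("LIST")
expand(root, ["LIST", ",", "ELEMENT"])
print("\n".join(render(root)))
```

Later derivation steps would call `expand` on whichever leaf holds the nonterminal being rewritten, so the leaves read left to right always spell the current sentential form.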