Summer 2012    July 9th, 2012
SLR and LR(1) Parsing
Handout written by Maggie Johnson and revised by Julie Zelenski.
LR(0) Isn't Good Enough
LR(0) is the simplest technique in the LR family. Although that makes it the easiest to
learn, these parsers are too weak to be of practical use for anything but a very limited set
of grammars. The examples given at the end of the LR(0) handout show how even small
additions to an LR(0) grammar can introduce conflicts that make it no longer LR(0). The
fundamental limitation of LR(0) is the zero, meaning no lookahead tokens are used. It is
a stifling constraint to have to make decisions using only what has already been read,
without even glancing at what comes next in the input. If we could peek at the next
token and use that as part of the decision making, we would find that it allows for a much
larger class of grammars to be parsed.
SLR(1)
We will first consider SLR(1), where the S stands for simple. SLR(1) parsers use the
same LR(0) configurating sets and have the same table structure and parser operation,
so everything you've already learned about LR(0) applies here. The difference comes in
assigning table actions, where we are going to use one token of lookahead to help
arbitrate among the conflicts. If we think back to the kind of conflicts we encountered in
LR(0) parsing, it was the reduce actions that caused us grief. A state in an LR(0) parser
can have at most one reduce action and cannot have both shift and reduce instructions.
Since a reduce is indicated for any completed item, this dictates that each completed
item must be in a state by itself. But let's revisit the assumption that if the item is
complete, the parser must choose to reduce. Is that always appropriate? If we peeked at
the next upcoming token, it may tell us something that invalidates that reduction. If the
sequence on top of the stack could be reduced to the nonterminal A, what tokens do we
expect to find as the next input? What tokens would tell us that the reduction is not
appropriate? Perhaps Follow(A) could be useful here!
The simple improvement that SLR(1) makes on the basic LR(0) parser is to reduce only if
the next input token is a member of the follow set of the nonterminal being reduced.
When filling in the table, we don't assume a reduce on all inputs as we did in LR(0); we
selectively choose the reduction only when the next input symbol is a member of the
follow set. To be more precise, here is the algorithm for SLR(1) table construction (note
all steps are the same as for LR(0) table construction except for 2a):
1. Construct F = {I0, I1, ... In}, the collection of LR(0) configurating sets for G'.
2. State i is determined from Ii. The parsing actions for the state are determined as
   follows:
   a) If A -> u• is in Ii then set Action[i,a] to reduce A -> u for all a in Follow(A) (A is
      not S').
   b) If S' -> S• is in Ii then set Action[i,$] to accept.
   c) If A -> u•av is in Ii and successor(Ii, a) = Ij, then set Action[i,a] to shift j (a
      must be a terminal).
3. The goto transitions for state i are constructed for all nonterminals A using the
   rule: If successor(Ii, A) = Ij, then Goto[i, A] = j.
4. All entries not defined by rules 2 and 3 are errors.
5. The initial state is the one constructed from the configurating set containing
   S' -> •S.
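In the SLR(1) parser, it is allowable for there to be both shift and reduce items in the
same state as well as multiple reduce items. The SLR(1) parser will be able to determine
which action to take as long as the follow sets of the reduce items are disjoint from one
another and from the terminals that can be shifted.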
Let's consider those changes at the end of the LR(0) handout to the simplified expression
grammar that would have made it no longer LR(0). Here is the version with the addition
of array access:
E' -> E
E -> E + T | T
T -> (E) | id | id[E]
Here are the first two LR(0) configurating sets entered if id is the first token of the input.
E' -> •E
E -> •E + T                T -> id•
E -> •T        --id-->     T -> id•[E]
T -> •(E)
T -> •id
T -> •id[E]
In an LR(0) parser, the set on the right has a shift-reduce conflict. However, an SLR(1)
parser will compute Follow(T) = { + ) ] $ } and only enter the reduce action on those
tokens. The input [ will shift and there is no conflict. Thus this grammar is SLR(1) even
though it is not LR(0).
Similarly, the simplified expression grammar with the assignment addition:
E' -> E
E -> E + T | T | V = E
T -> (E) | id
V -> id
Here are the first two LR(0) configurating sets entered if id is the first token of the input.
E' -> •E
E -> •E + T                T -> id•
E -> •T        --id-->     V -> id•
E -> •V = E
T -> •(E)
T -> •id
V -> •id
In an LR(0) parser, the set on the right has a reduce-reduce conflict. However, an SLR(1)
parser will compute Follow(T) = { + ) $ } and Follow(V) = { = } and thus can distinguish
which reduction to apply depending on the next input token. The modified grammar is
SLR(1).
SLR(1) Grammars
A grammar is SLR(1) if the following two conditions hold for each configurating set:
1. For any item A -> u•xv in the set, with terminal x, there is no complete item B ->
   w• in that set with x in Follow(B). In the tables, this translates to no shift-reduce
   conflict on any state. This means the successor function for x from that set either
   shifts to a new state or reduces, but not both.
2. For any two complete items A -> u• and B -> v• in the set, the follow sets must
   be disjoint, i.e. Follow(A) ∩ Follow(B) is empty. This translates to no reduce-reduce
   conflict on any state. If more than one nonterminal could be reduced from this set,
   it must be possible to uniquely determine which using only one token of
   lookahead.
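The two conditions can be checked mechanically for a single configurating set. The following sketch is illustrative only: the function name `slr1_conflicts`, the (head, body, dot) item encoding, and the hand-entered follow sets (copied from the two expression-grammar examples earlier in this handout) are assumptions made for the example.

```python
def slr1_conflicts(items, follow, terminals):
    """Check the two SLR(1) conditions for one configurating set.

    items: list of (head, body, dot), where body is a tuple of symbols.
    follow: dict mapping a nonterminal to its follow set.
    Returns a list of conflict descriptions (empty if the set is fine).
    """
    complete = [(h, b) for h, b, d in items if d == len(b)]
    shift_on = {b[d] for h, b, d in items if d < len(b) and b[d] in terminals}
    problems = []
    # Condition 1: no shiftable terminal may be in Follow of a complete item.
    for head, _ in complete:
        for x in sorted(shift_on & follow[head]):
            problems.append(f"shift-reduce on {x}")
    # Condition 2: follow sets of complete items must be pairwise disjoint.
    for i, (a, _) in enumerate(complete):
        for b, _ in complete[i + 1:]:
            for x in sorted(follow[a] & follow[b]):
                problems.append(f"reduce-reduce on {x}")
    return problems

TERMINALS = {"+", "(", ")", "[", "]", "=", "id"}

# Set reached on id in the array-access grammar: T -> id. and T -> id.[E]
array_set = [("T", ("id",), 1), ("T", ("id", "[", "E", "]"), 1)]
# Set reached on id in the assignment grammar: T -> id. and V -> id.
assign_set = [("T", ("id",), 1), ("V", ("id",), 1)]

print(slr1_conflicts(array_set, {"T": {"+", ")", "]", "$"}}, TERMINALS))        # []
print(slr1_conflicts(assign_set, {"T": {"+", ")", "$"}, "V": {"="}}, TERMINALS))  # []
```

Passing a follow set that contains every terminal mimics the LR(0) behavior of reducing on all inputs, and the same two sets then report conflicts.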
All LR(0) grammars are SLR(1) but the reverse is not true, as the two extensions to our
expression grammar demonstrated. The addition of just one token of lookahead and use
of the follow set greatly expands the class of grammars that can be parsed without
conflict.
SLR(1) Limitations
The SLR(1) technique still leaves something to be desired, because we are not using all
the information that we have at our disposal. When we have a completed configuration
(i.e., dot at the end) such as X -> u•, we know that this corresponds to a situation in
which we have u as a handle on top of the stack which we then can reduce, i.e., replacing
u by X. We allow such a reduction whenever the next symbol is in Follow(X). However, it
may be that we should not reduce for every symbol in Follow(X), because the symbols
below u on the stack preclude u being a handle for reduction in this case. In other
words, SLR(1) states only tell us about the sequence on top of the stack, not what is
below it on the stack. We may need to divide an SLR(1) state into separate states to
differentiate the possible means by which that sequence has appeared on the stack. By
carrying more information in the state, it will allow us to rule out these invalid
reductions. Consider this example from Aho/Sethi/Ullman that defines a small grammar
for assignment statements, using the nonterminal L for l-value and R for r-value and *
for contents-of.
S' -> S
S -> L = R
S -> R
L -> *R
L -> id
R -> L
I4: L -> *•R
    R -> •L
    L -> •*R
    L -> •id
Consider parsing the expression id = id. After working our way to configurating set I2
(which contains S -> L• = R and R -> L•), having reduced the first id to L, we have a
choice upon seeing = coming up in the input. The first item in the set wants to set
Action[2,=] to be shift 6, which corresponds to moving on to find the rest of the
assignment. However, = is also in Follow(R) because S => L = R => *R = R. Thus, the
second configuration wants to reduce in that slot by R -> L. This is a shift-reduce
conflict, but not because of any problem with the grammar. An SLR parser does not
remember enough left context to decide what should happen when it encounters a = in
the input having seen a string reducible to L. Although the sequence on top of the stack
could be reduced to R, we don't want to choose this reduction because there is no
possible right sentential form that begins R = ... (there is one beginning *R = ... which is
not the same). Thus, the correct choice is to shift.
It's not further lookahead that the SLR tables are missing. We don't need to see
additional symbols beyond the first token in the input; we have already seen the
information that allows us to determine the correct choice. What we need is to retain a
little more of the left context that brought us here. In this example grammar, the only
time we should consider reducing by production R -> L is during a derivation that has
already seen a * or an =. Just using the entire follow set is not discriminating enough as
the guide for when to reduce. The follow set contains symbols that can follow R in any
position within a valid sentence but it does not precisely indicate which symbols follow
R at this particular point in a derivation. So we will augment our states to include
information about what portion of the follow set is appropriate given the path we have
taken to that state.
We can be in state 2 for one of two reasons: we are trying to build from S -> L = R or
from S => R => L. If the upcoming symbol is =, then that rules out the second choice and
we must be building the first, which tells us to shift. The reduction should only be
applied if the next input symbol is $. Even though = is in Follow(R) because of the other
contexts in which an R can appear, in this particular situation it is not appropriate
because when deriving a sentence S => R => L, = cannot follow R.
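The source of the conflict in state 2 can be reproduced by computing the follow sets for the assignment grammar. This is a minimal sketch under stated assumptions: the (head, body) encoding and the name `follow_sets` are illustrative, and the routine assumes no production derives the empty string (true for this grammar).

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]

def follow_sets(grammar):
    """Follow sets by fixed-point iteration (assumes no empty productions)."""
    nonterminals = {head for head, _ in grammar}
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}
    follow[grammar[0][0]].add("$")      # end marker follows the start symbol

    def first_of(sym):
        return first[sym] if sym in nonterminals else {sym}

    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            # First(head) includes First of the body's leading symbol.
            pairs = [(first[head], first_of(body[0]))]
            # Follow(sym) includes First of whatever comes right after it.
            for sym, nxt in zip(body, body[1:]):
                if sym in nonterminals:
                    pairs.append((follow[sym], first_of(nxt)))
            # Follow of the last symbol includes Follow(head).
            if body[-1] in nonterminals:
                pairs.append((follow[body[-1]], follow[head]))
            for dest, src in pairs:
                if not src <= dest:
                    dest |= src
                    changed = True
    return follow

follow = follow_sets(GRAMMAR)
shift_terminals = {"="}          # state 2 contains S -> L . = R
reduce_terminals = follow["R"]   # state 2 contains R -> L .
print(sorted(follow["R"]))                         # ['$', '=']
print(sorted(shift_terminals & reduce_terminals))  # ['=']
```

The non-empty intersection is exactly the Action[2,=] slot that SLR(1) cannot fill unambiguously, even though only the shift is ever correct there.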
Constructing LR(1) parsing tables
LR or canonical LR parsing incorporates the required extra information into the state by
redefining configurations to include a terminal symbol as an added component. LR(1)
configurations have the general form:
    A -> X1...Xi • Xi+1...Xj , a
This means we have states corresponding to X1...Xi on the stack and we are looking to
put states corresponding to Xi+1...Xj on the stack and then reduce, but only if the token
following Xj is the terminal a. a is called the lookahead of the configuration. The
lookahead only comes into play with LR(1) configurations with a dot at the right end:
    A -> X1...Xj •, a
This means we have states corresponding to X1...Xj on the stack but we may only reduce
when the next symbol is a. The symbol a is either a terminal or $ (end of input marker).
With SLR(1) parsing, we would reduce if the next token was any of those in Follow(A).
With LR(1) parsing, we reduce only if the next token is exactly a. We may have more
than one symbol in the lookahead for the configuration; as a convenience, we list those
symbols separated by a forward slash. Thus, the configuration A -> u•, a/b/c says that it
is valid to reduce u to A only if the next token is equal to a, b, or c. The configuration
lookahead will always be a subset of Follow(A).
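Recall the definition of a viable prefix from the previous handout. Viable prefixes are
those prefixes of right sentential forms that can appear on the stack of a shift-reduce
parser. Formally we say that a configuration [A -> u•v, a] is valid for a viable prefix α if
there is a rightmost derivation S =>* βAw =>* βuvw where α = βu and either a is the first
symbol of w or w is ε and a is $.
For example:
S -> ZZ
Z -> xZ | y
The rightmost derivation S =>* xxZxy => xxxZxy shows that [Z -> x•Z, x] is valid for the
viable prefix xxx, taking β = xx, A = Z, u = x, v = Z and w = xy.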
Often we have a number of LR(1) configurations that differ only in their lookahead
components. The addition of a lookahead component to LR(1) configurations allows us
to make parsing decisions beyond the capability of SLR(1) parsers. There is, however, a
big price to be paid. There will be more distinct configurations and thus many more
possible configurating sets. This increases the size of the goto and action tables
considerably. In the past when memory was smaller, it was difficult to find storage
efficient ways of representing these tables, but now this is not as much of an issue. Still,
it's a big job building LR tables for any substantial grammar by hand.
The method for constructing the configurating sets of LR(1) configurations is essentially
the same as for SLR, but there are some changes in the closure and successor operations
because we must respect the configuration lookahead. To compute the closure of an
LR(1) configurating set I:
Repeat the following until no more configurations can be added to state I:
    For each configuration [A -> u•Bv, a] in I, for each production B -> w in G', and for
    each terminal b in First(va) such that [B -> •w, b] is not in I: add [B -> •w, b] to I.
What does this mean? We have a configuration with the dot before the nonterminal B.
In LR(0), we computed the closure by adding all B productions with no indication of
what was expected to follow them. In LR(1), we are a little more precise: we add each B
production but insist that each have a lookahead drawn from First(va), since this is
what follows B in this production. Remember that we can compute first sets not just for
a single nonterminal, but also for a sequence of terminals and nonterminals.
First(va) includes the first set of the first symbol of v and then if that symbol is nullable,
we include the first set of the following symbol, and so on. If the entire sequence v is
nullable, we add the lookahead a already required by this configuration.
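The computation of First(va) described above can be sketched as a small function. The name `first_of_sequence`, its parameters, and the convention that any symbol missing from the `first` table is a terminal are assumptions made for this illustration; the first sets shown are written out by hand for the grammar S -> ZZ, Z -> xZ | y used earlier.

```python
def first_of_sequence(seq, lookahead, first, nullable):
    """First(seq followed by lookahead), as needed for LR(1) closure.

    first: dict mapping each nonterminal to its first set (no epsilon entry).
    nullable: set of nonterminals that can derive the empty string.
    Any symbol that is not a key of `first` is treated as a terminal.
    """
    result = set()
    for sym in seq:
        if sym not in first:       # a terminal starts the string: stop here
            result.add(sym)
            return result
        result |= first[sym]
        if sym not in nullable:    # this symbol cannot vanish: stop here
            return result
    result.add(lookahead)          # all of seq is nullable (or seq is empty)
    return result

FIRST = {"S": {"x", "y"}, "Z": {"x", "y"}}
NULLABLE = set()                   # nothing in this grammar is nullable

# Closing on the first Z of [S -> .ZZ, $]: v = Z, a = $
print(sorted(first_of_sequence(("Z",), "$", FIRST, NULLABLE)))  # ['x', 'y']
# Closing on the second Z of [S -> Z.Z, $]: v is empty, a = $
print(sorted(first_of_sequence((), "$", FIRST, NULLABLE)))      # ['$']
```

When v is empty or entirely nullable, the lookahead a of the enclosing configuration is what gets inherited by the new configurations.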
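The successor function for the configurating set I and symbol X is computed as this:
    successor(I, X) = the closure of the set of all configurations [A -> uX•v, a]
    such that [A -> u•Xv, a] is in I
We take each production in a configurating set, move the dot over a symbol and close on
the resulting production. This is basically the same successor function as defined for
LR(0), but we have to propagate the lookahead when computing the transitions.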
We construct the complete family of all configurating sets F just as we did before. F is
initialized to the set with the closure of [S' -> •S, $]. For each configurating set I and each
grammar symbol X such that successor(I,X) is not empty and not in F, add successor(I,X)
to F until no other configurating set can be added to F.
Let's consider an example. The augmented grammar below recognizes the regular
language a*ba*b (this example is from pp. 231-236 of Aho/Sethi/Ullman).
0) S' -> S
1) S -> XX
2) X -> aX
3) X -> b
Here is the family of LR configuration sets:
I0: S' -> •S, $
    S -> •XX, $
    X -> •aX, a/b
    X -> •b, a/b
I1: S' -> S•, $
I2: S -> X•X, $
    X -> •aX, $
    X -> •b, $
I3: X -> a•X, a/b
    X -> •aX, a/b
    X -> •b, a/b
I4: X -> b•, a/b
I5: S -> XX•, $
I6: X -> a•X, $
    X -> •aX, $
    X -> •b, $
I7: X -> b•, $
I8: X -> aX•, a/b
I9: X -> aX•, $
The above grammar would only have seven SLR states, but has ten in canonical LR. We
end up with additional states because we have split states that have different
lookaheads. For example, states 3 and 6 are the same except for lookahead; state 3
corresponds to the context where we are in the middle of parsing the first X, state 6 is the
second X. Similarly, states 4 and 7 are completing the first and second X respectively. In
SLR, those states are not distinguished, and if we were attempting to parse a single b by
itself, we would allow that to be reduced to X, even though this will not lead to a valid
sentence. The SLR parser will eventually notice the syntax error, too, but the LR parser
figures it out a bit sooner.
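The state counts quoted above can be checked with a short sketch of the LR(1) closure, successor, and family construction. The encoding of a configuration as (production index, dot, lookahead), the function names, and the hand-written first sets are assumptions for illustration; `first_of` relies on the fact that no symbol in this grammar is nullable. Grouping the LR(1) sets by their items with lookaheads stripped gives, for this grammar, the number of states an SLR parser would use.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("X", "X")), ("X", ("a", "X")), ("X", ("b",))]
NONTERMINALS = {head for head, _ in GRAMMAR}
# First sets written out by hand; no symbol in this grammar is nullable.
FIRST = {"S": {"a", "b"}, "X": {"a", "b"}}

def first_of(seq, lookahead):
    """First(seq lookahead) for a grammar with no nullable symbols."""
    if not seq:
        return {lookahead}
    return FIRST.get(seq[0], {seq[0]})   # a terminal is its own first set

def closure(items):
    """LR(1) closure; an item is (production index, dot, lookahead)."""
    items, work = set(items), list(items)
    while work:
        p, dot, a = work.pop()
        body = GRAMMAR[p][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for q, (head, _) in enumerate(GRAMMAR):
                if head != body[dot]:
                    continue
                # New lookaheads come from First(v a), v = symbols after B.
                for b in first_of(body[dot + 1:], a):
                    if (q, 0, b) not in items:
                        items.add((q, 0, b))
                        work.append((q, 0, b))
    return frozenset(items)

def successor(items, X):
    """Move the dot over X, carrying each lookahead along, then close."""
    moved = {(p, d + 1, a) for p, d, a in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == X}
    return closure(moved) if moved else None

def canonical_collection():
    start = closure({(0, 0, "$")})
    family, work = [start], [start]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    while work:
        I = work.pop(0)
        for X in symbols:
            J = successor(I, X)
            if J is not None and J not in family:
                family.append(J)
                work.append(J)
    return family

family = canonical_collection()
# Strip lookaheads: sets that differ only in lookahead share one core.
cores = {frozenset((p, d) for p, d, _ in I) for I in family}
print(len(family), len(cores))  # 10 7
```

The three cores that appear twice correspond to the pairs of states 3 and 6, 4 and 7, and 8 and 9 in the listing above.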
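To fill in the entries in the action and goto tables, we use a similar algorithm as we did
for SLR(1), but instead of assigning reduce actions using the follow set, we use the
specific lookaheads. Here are the steps to build an LR(1) parse table:
1. Construct F = {I0, I1, ... In}, the collection of LR(1) configurating sets for G'.
2. State i is determined from Ii. The parsing actions for the state are determined as
   follows:
   a) If [A -> u•, a] is in Ii then set Action[i,a] to reduce A -> u (A is not S').
   b) If [S' -> S•, $] is in Ii then set Action[i,$] to accept.
   c) If [A -> u•av, b] is in Ii and successor(Ii, a) = Ij, then set Action[i,a] to shift j (a
      must be a terminal).
3. The goto transitions for state i are constructed for all nonterminals A using the
   rule: If successor(Ii, A) = Ij, then Goto[i, A] = j.
4. All entries not defined by rules 2 and 3 are errors.
5. The initial state is the one constructed from the configurating set containing
   [S' -> •S, $].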
Derivation of the input baab: S => XX => bX => baX => baaX => baab
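A grammar is LR(1) if the following two conditions are satisfied for each configurating
set:
1. For any item [A -> u•xv, a] in the set, with terminal x, there is no item in the set of
   the form [B -> w•, x]. In the action table, this translates to no shift-reduce conflict
   for any state.
2. The lookaheads for all complete items within the set must be disjoint, i.e. the set
   cannot have both [A -> u•, a] and [B -> v•, a]. This translates to no reduce-reduce
   conflict on any state.
As long as there is a unique shift or reduce action on each input symbol from each state,
we can parse using an LR(1) algorithm. The above state conditions are similar to what is
required for SLR(1), but rather than the looser constraint about disjoint follow sets and
so on, canonical LR(1) computes a more precise notion of the appropriate lookahead
within a particular context and thus is able to resolve conflicts that SLR(1) would
encounter.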
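To see the LR(1) tables in use, here is a sketch of a table-driven parser for the a*ba*b grammar. The ACTION and GOTO entries were written out by hand from the configurating sets I0 to I9 listed earlier, using reduce only on each complete configuration's own lookahead; the dictionary layout and the function name `parse` are assumptions for this example, not part of the handout.

```python
# Productions numbered as in the handout: 1) S -> XX  2) X -> aX  3) X -> b
# Each entry maps a production number to (head, length of its right side).
PRODUCTIONS = {1: ("S", 2), 2: ("X", 2), 3: ("X", 1)}

# ("s", j) = shift to state j, ("r", n) = reduce by production n.
ACTION = {
    (0, "a"): ("s", 3), (0, "b"): ("s", 4),
    (1, "$"): ("acc",),
    (2, "a"): ("s", 6), (2, "b"): ("s", 7),
    (3, "a"): ("s", 3), (3, "b"): ("s", 4),
    (4, "a"): ("r", 3), (4, "b"): ("r", 3),
    (5, "$"): ("r", 1),
    (6, "a"): ("s", 6), (6, "b"): ("s", 7),
    (7, "$"): ("r", 3),
    (8, "a"): ("r", 2), (8, "b"): ("r", 2),
    (9, "$"): ("r", 2),
}
GOTO = {(0, "S"): 1, (0, "X"): 2, (2, "X"): 5, (3, "X"): 8, (6, "X"): 9}

def parse(text):
    """Return the reductions applied, in order, or None on a syntax error."""
    tokens = list(text) + ["$"]
    stack, pos, reductions = [0], 0, []
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:                 # blank table entry: error
            return None
        if act[0] == "s":               # shift: push state, consume token
            stack.append(act[1])
            pos += 1
        elif act[0] == "r":             # reduce: pop the handle, then goto
            head, length = PRODUCTIONS[act[1]]
            del stack[-length:]
            stack.append(GOTO[stack[-1], head])
            reductions.append(act[1])
        else:                           # accept
            return reductions

print(parse("baab"))  # [3, 3, 2, 2, 1]
print(parse("b"))     # None
```

On the input b alone, state 4 has no entry for $, so the error is reported before any reduction happens, whereas an SLR table would first reduce b to X because $ is in Follow(X).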
Bibliography
A. Aho, R. Sethi, J. Ullman, Compilers: Principles, Techniques, and Tools. Reading, MA:
Addison-Wesley, 1986.
J.P. Bennett, Introduction to Compiling Techniques. Berkshire, England: McGraw-Hill, 1990.
K. Loudon, Compiler Construction. Boston, MA: PWS, 1997.
A. Pyster, Compiler Design and Construction. New York, NY: Van Nostrand Reinhold, 1988.