Compiler Design Notes

Uploaded by

Raghujeet Singh
Copyright
© All Rights Reserved
UNIT-I: Introduction to Compiler

Syllabus: Introduction to compiler, phases and passes, bootstrapping, finite state machines and regular expressions and their application to lexical analysis, optimization of DFA-based pattern matchers, implementation of lexical analyzers, lexical analyzer generator, LEX compiler, formal grammars and their application to syntax analysis, BNF notation, ambiguity, YACC. The syntactic specification of programming languages: CFG, derivation and parse trees, capabilities of CFG.

Translator: A translator is a program that takes as input a program written in one programming language (the source language) and produces as output a program in another language (the target language).

    Source language -> [Translator] -> Target language

Compiler: A compiler is a program that takes as input a program written in a high-level language and converts it into an equivalent program in a low-level language, such as machine language or assembly language, as output.

    Source program (high level) -> [Compiler] -> Target program
                                       |
                                 (error messages)

If errors occur during the conversion, the compiler displays error messages.

Properties: A compiler should be
- bug free
- portable
- able to perform optimization

Phases of compiler:

    Source program -> [Compiler] -> Machine instructions

Compilation is a complex process, so we divide the whole process into a series of subprocesses called the phases of the compiler.

    Source Program
      -> Lexical Analysis (scanning; produces a stream of tokens)
      -> Syntax Analysis
      -> Semantic Analysis
      -> ...
      -> Target Program
    (the symbol table is shared by the phases)

1) Lexical Analysis: It is also called scanning. In this first phase of compilation the complete source program is scanned, and the scanner separates the characters of the source language into groups called tokens.
The output of this phase is a stream of tokens, which is passed to the next phase.

2) Syntax Analysis: It is also called the parsing phase. In this phase the tokens generated by the lexical analyzer are grouped together to form a syntactic structure. This syntactic structure is called a parse tree or syntax tree.

3) Semantic Analysis: Semantic analysis checks, for example, the matching of parentheses in an expression, the matching of if-else, etc.

4) Intermediate Code Generator: The intermediate code generator takes the syntax tree as input from the semantic phase and generates three-address code (it consists of instructions, each of which has at most three operands).

5) Code Optimization: This is an optional phase. In this phase the code is improved so that it takes less space and runs faster.

6) Code Generation: In this phase the target code is generated. The intermediate code instructions are translated into a sequence of machine instructions.

Symbol table: The data structure used to record the essential information about identifiers. It contains a record for each identifier that allows us to find the details of that identifier quickly and to store or retrieve data from the record quickly.

Error handler: The error handler reports when an error has occurred in any part of the source program.

Example: Given the statement

    position := initial + rate * 60

with position, initial and rate entered in the symbol table as id1, id2 and id3, the generated code is:

    MOVF id3, R2
    MULF #60.0, R2
    MOVF id2, R1
    ADDF R2, R1
    MOVF R1, id1

Passes of compiler: Portions of one or more phases are combined into a module called a pass. There are two types of compiler passes:

1) Single-pass compiler: It scans the source program only once. This means a single-pass compiler contains all the phases of the compiler in one pass.

2) Multi-pass compiler: It processes the source program in several passes, each pass producing output for the next. Such a compiler is called a multi-pass compiler.

Bootstrapping: Bootstrapping is a technique of writing a compiler in the language that it is intended to compile. The languages BASIC, C, Java and Python have been bootstrapped.
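Returning to the statement position := initial + rate * 60 from the phases example: the work of the lexical analysis phase and the role of the symbol table can be sketched in a few lines of Python. This is only an illustrative sketch. The token names (NUMBER, ID, ASSIGN, OP), the regular expressions and the function name tokenize are assumptions made for this example and do not come from the notes.

```python
import re

# Hypothetical token classes for the example statement (assumed names).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),   # integer or decimal constant
    ("ID",     r"[A-Za-z_]\w*"),    # identifier
    ("ASSIGN", r":="),              # assignment operator
    ("OP",     r"[+\-*/]"),         # arithmetic operator
    ("SKIP",   r"\s+"),             # white space, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for one source statement."""
    symbol_table = {}   # identifier name -> index (1 means id1, 2 means id2, ...)
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        if kind == "ID":
            # Enter the identifier in the symbol table on first sight.
            index = symbol_table.setdefault(lexeme, len(symbol_table) + 1)
            tokens.append(("id", index))
        else:
            tokens.append((kind, lexeme))
    return tokens, symbol_table


tokens, table = tokenize("position := initial + rate * 60")
print(tokens)
print(table)
```

The identifiers come out as id1, id2 and id3, which is the numbering used by the generated code in the example.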
A compiler is characterized by three languages: the source language, the target language and the implementation language.

T diagram: source language S -> target language T, implemented in language I.

Cross compiler: A cross compiler is a compiler that is capable of creating executable code for a platform other than the one on which the compiler runs. For example, a compiler running on Windows that produces code which runs on Android.

Token: The smallest individual unit of a program is called a token. In the C programming language, keywords, identifiers, operators and punctuators are tokens.

Ex: Calculate the total number of tokens in the following:

    int main()
    {
        // printf() sends the string inside quotation to
        // the standard output (the display)
        printf("Welcome to Gren");
        return 0;
    }

Total no. of tokens = 14 (the comment is not a token).

Ex: A second program, ending in the statement printf("%d ...", a, b); has total no. of tokens = 28.

Lexeme: It is a sequence of characters in the source code that is matched by a given predefined pattern, so that the lexeme can be classified as a valid token.
Ex: main is a lexeme of type identifier (token); ( ) { } ; are lexemes of type punctuation (token).

Pattern: It specifies the set of rules that a scanner follows to create a token.

Finite Automata: A finite automaton is a finite state machine. It is an abstract machine that reads input from an input tape and, after reading it, reports whether the input is accepted or not.

Finite automata are of two kinds:
- Deterministic finite automata (DFA)
- Non-deterministic finite automata (NFA)

(1) Deterministic Finite Automata (DFA): A DFA is deterministic in that there is one path for a specific input. A DFA is one in which each move is uniquely determined by the current configuration.

Definition: A deterministic finite automaton is defined by a 5-tuple M = (Q, Σ, δ, q0, F) where
Q = a finite non-empty set of states
Σ = a finite non-empty set of input symbols
q0 = the initial state, q0 ∈ Q
F = the set of final (accepting) states, F ⊆ Q
δ = the transition function, δ: Q × Σ -> Q

Ques: Construct a DFA which accepts strings starting with a over Σ = {a, b}.
Q = {q0, q1, q2}, Σ = {a, b}, initial state q0, F = {q1}, δ: Q × Σ -> Q

Transition table (q2 is the dead state):

    State | a  | b
    ->q0  | q1 | q2
     *q1  | q1 | q1
      q2  | q2 | q2

Ques: Construct a DFA which accepts all strings
(i) containing a: L = {a, aa, aaa, ba, ab, abab, ...}
(ii) ending with a: L = {a, aa, aaa, ba, baba, aba, abaaba, ...}

Ques: Design a DFA which accepts all strings in which the second symbol is ... and the 4th symbol is ...

Ques: DFA over Σ = {a, b} for strings ... and ending with ...

Ques: DFA over Σ = {a, b} for strings ending with abb.

Ques: DFA in which the 3rd symbol from the left hand side is always ...

(2) Non-deterministic Finite Automata (NFA): An NFA differs from a DFA in that the range of δ in an NFA is the power set 2^Q. In an NFA the state to which the machine moves cannot be determined uniquely; hence it is called a non-deterministic finite automaton.

Definition: An NFA is defined by a 5-tuple M = (Q, Σ, q0, δ, F) where
Q = a finite non-empty set of states
Σ = a finite set of input symbols
q0 = the initial state, q0 ∈ Q
F = the set of final states
δ: Q × Σ -> 2^Q

Ques: Construct an NFA that recognizes the language (a+b)*abb over Σ = {a, b}.
Ques: Construct an NFA in which a given symbol from the right side (for example the 2nd or the 3rd) is a.
Ques: Design an NFA for strings which start with aa or bb.
Ques: Design an NFA for strings which end with aa or bb.
Ques: Design an NFA for L = (ab ∪ aba)*.
Ques: NFA of all binary strings in which the 2nd last bit is 1, i.e. for the regular expression (0+1)*1(0+1).

Difference between DFA and NFA:

1. DFA: multiple choices are not available to change the state for a particular input.
   NFA: multiple choices are available to change the state for a particular input.
2. DFA: epsilon is not allowed; a DFA does not accept epsilon moves.
   NFA: epsilon is allowed; an NFA accepts epsilon moves.
3. Designing
and understanding: difficult for a DFA, easier for an NFA.

Conversion of NFA to DFA:

Ques: Convert the following NFA into a DFA.
Soln: To convert an NFA into a DFA follow the given steps:
(i) Make the transition table of the given NFA.
(ii) Construct the transition table for the DFA corresponding to the NFA transition table. Each DFA state is a set of NFA states, such as {q0} or {q0, q2}.
(iii) Construct the DFA corresponding to the DFA transition table.

NFA with epsilon (ε-NFA): An ε-NFA can change its state without being provided any input; such moves are the ε-moves. ε means "no input". In diagrams it is shown by ε.

Ques: Design an FA which accepts all the strings of a, b which either start with ab or end with ba, using ε-transitions.

Removing ε-moves (NFA without ε): From an ε-NFA follow the steps given below. For an ε-move from q1 to q2:
- Find all transitions leaving q2 and add the same transitions from q1.
- If q2 is final, make q1 final.
- If q1 is initial, make q2 initial.

Minimization of DFA:

Ques: Minimize the DFA constructed by the given transition table.
The states are partitioned step by step. Start with π0 = (final states, non-final states), here of the form π0 = ({q3}, {remaining states}). Each partition π1, π2, π3, ... is obtained by splitting the blocks of the previous one whenever two states of a block move to different blocks on the same input. When two successive partitions are the same, stop the process. Each block of the last partition becomes one state of the minimized DFA.

Regular Expression: Regular expressions are built up much like arithmetic expressions. They provide a useful and convenient notation for tokens.

(i) Any terminal symbol (input symbol) a, ε and φ are regular expressions:
    L(ε) = {ε}, L(a) = {a}, L(φ) = {}
(ii) If R1 and R2 are two regular expressions, then the union and the concatenation of both are also regular expressions:
    L(R1 + R2) = L(R1) ∪ L(R2)
    L(R1 R2) = L(R1) L(R2)
(iii) If R is a regular expression, then the iteration R* is also a regular expression:
    L(R*) = (L(R))*

Ques:
(i) RE for strings that end with 0, Σ = {0, 1}: (0+1)*0
(ii) RE for strings that start and end with b: b(a+b)*b + b (the second term covers the one-symbol string b)
(iii) RE for strings whose second symbol is 0 and 4th symbol is 1: (0+1)0(0+1)1(0+1)*
(iv) RE for strings that have exactly 4 ones: 0*10*10*10*10*

Formal Grammars and their application to Syntax Analysis

Grammar: Any grammar G is defined by a 4-tuple G = (V, T, P, S) where
V = set of variables or non-terminals
T = set of terminals or input symbols
S = the start variable (the terminal strings of the language are always derived by starting from the start variable)
P = set of production rules (a production always has a head and a body; the body is a sequence of terminals and non-terminals, and at least one non-terminal must be present in the head)

Ex: G = ({S, A, B}, {a, b}, P, S) with productions
    S -> aAb
    A -> aA | a
    B -> bB | b

Context Free Grammar (CFG): It is defined by a 4-tuple G = (V, T, P, S) where
V = set of variables or non-terminals (upper case letters; these symbols do not appear in the strings of the language)
T = set of terminals or input symbols (lower case letters; the strings of the language are sequences of terminals)
P = set of production rules (the production rules are always of the form A -> α where A ∈ V and α ∈ (V ∪ T)*)
S = the starting symbol or start variable (only when we start the derivation from the starting symbol do we get the strings of our language)

Derivation tree or parse tree: A derivation tree or parse tree for a CFG G = (V, T, P, S) is a tree satisfying the following:
-> Every node has a label which is a variable, a terminal or ε.
-> The root has the label S.
-> The label of an internal node is a variable.
-> If the nodes with labels X1, X2, ..., Xk are the children of a node with label A, then a production A -> X1 X2 ... Xk should exist in P.
-> A node whose label is ε is the only child of its parent.
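The parse-tree conditions listed above can be checked mechanically. The following Python sketch does so, assuming the expression grammar E -> E+E | E*E | id that the notes use in their next example. The tree encoding (a pair of label and list of children) and the function names is_parse_tree and tree_yield are assumptions made for this illustration only.

```python
EPS = "ε"

# G = (V, T, P, S) for the assumed grammar E -> E+E | E*E | id
V = {"E"}
T = {"id", "+", "*"}
P = {
    ("E", ("E", "+", "E")),
    ("E", ("E", "*", "E")),
    ("E", ("id",)),
}
S = "E"


def is_parse_tree(node, is_root=True):
    """Check the listed conditions for a complete parse tree.

    A node is a pair (label, children); a leaf has an empty children list.
    """
    label, children = node
    if is_root and label != S:
        return False                      # the root must be labelled S
    if not children:
        # In a complete tree every leaf is a terminal or ε.
        return label in T or label == EPS
    if label not in V:
        return False                      # internal nodes are variables
    child_labels = tuple(child[0] for child in children)
    if EPS in child_labels and len(children) != 1:
        return False                      # an ε node must be an only child
    if (label, child_labels) not in P:
        return False                      # A -> X1 X2 ... Xk must be in P
    return all(is_parse_tree(child, is_root=False) for child in children)


def tree_yield(node):
    """Read the leaves from left to right to get the derived string."""
    label, children = node
    if not children:
        return [] if label == EPS else [label]
    result = []
    for child in children:
        result.extend(tree_yield(child))
    return result


def leaf(symbol):
    return (symbol, [])


# Tree for id + id * id with + at the root.
tree = ("E", [
    ("E", [leaf("id")]),
    leaf("+"),
    ("E", [("E", [leaf("id")]), leaf("*"), ("E", [leaf("id")])]),
])

print(is_parse_tree(tree), " ".join(tree_yield(tree)))
```

A tree such as ("E", [leaf("id"), leaf("id")]) is rejected, because E -> id id is not a production in P.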
Ex: Draw a parse tree for id + id * id using E -> E+E | E*E | id.

            E
          / | \
         E  +  E
         |   / | \
        id  E  *  E
            |     |
           id    id

    result = id + id * id

(1) Leftmost derivation: A derivation A =>* w is called a leftmost derivation if we apply a production only to the leftmost variable at every step.
Ex: E -> E+E | E*E | id
    E => E+E => id+E => id+E*E => id+id*E => id+id*id
    result = id+id*id

(2) Rightmost derivation: A derivation A =>* w is called a rightmost derivation if we apply a production only to the rightmost variable at every step.
Ex: E -> E+E | E*E | id, result = id*id+id
    E => E*E => E*E+E => E*E+id => E*id+id => id*id+id

Ambiguity: A terminal string w ∈ L(G) is ambiguous if there exist two or more derivation trees for it, or equivalently two or more leftmost derivations of the string w. A CFG G is ambiguous if there exists some w ∈ L(G) which is ambiguous.
Ex: E -> E+E | E*E | id, result = id+id*id. Two different parse trees exist for this string (one with + at the root, one with * at the root), thus the given grammar is ambiguous.

Capabilities of CFG:
1) -> A CFG is useful to describe most programming languages.
   -> A CFG is capable of describing nested structures like balanced parentheses, matching begin-end, if-else and so on.
2) If the grammar is properly designed, then an efficient parser can be constructed automatically.

Question: Explain formal grammar and its application to the syntax analyzer.
-> The formal grammar represents the specification of a programming language with the use of production rules.
-> A syntax analyzer takes the tokens from the lexical analyzer and groups them in such a way that some programming structure can be recognized.
-> After grouping the tokens, if any syntax cannot be recognized then a syntactic error will be generated.
-> This overall process is called syntax checking of the language.
-> This syntax can be checked in the compiler by writing the specifications.
-> The specifications tell the compiler how the syntax of the programming language should be.

BNF Notation: Developed by John Backus and Peter Naur.
BNF stands for Backus-Naur Form. BNF is a notation technique for context free grammars. This notation is useful for specifying the syntax of a language. A BNF specification has the form:

    <symbol> ::= expression1 | expression2 | ...

where <symbol> is a non-terminal and each expression is a sequence of terminals and non-terminals. The sign ::= means that the left side gets replaced with the right side.

Ex:
    <address>  ::= <fullname> "," <street> "," <zip>
    <fullname> ::= ... <lastname>
    <street>   ::= <streetname> "," <city>
    <number>   ::= 0 | 1 | 2 | 3 | ... | 9

Implementation of Lexical Analyzer: A lexical analyzer can be implemented using the following steps:
(i) The lexical analyzer accepts the source program as input. It scans the source program to generate tokens.
(ii) Regular expressions are used to represent the input patterns (sets of rules).
(iii) The input patterns are converted into NFAs using the conversion from regular expressions to finite automata.
(iv) The NFAs are then converted into DFAs, and the DFAs are minimized using one of the minimization methods.
(v) The minimized DFAs are used to recognize the patterns and break the input into lexemes.
(vi) Each minimized DFA is associated with a phase in a programming language which will evaluate the lexemes that match the regular expression.
(vii) Then a state table is constructed for the appropriate finite state machine, and program code is created which contains the table, the evaluation phases and a routine that uses them.

    Source program -> [Lexical analyzer] -> tokenized output file

Lexical Analyzer Generator: For the efficient design of a compiler, tools are used to automate the phases of the compiler.
-> LEX is a Unix utility which generates a lexical analyzer (finite automaton) automatically with the help of regular expressions.

Automatic generation of lexical analyzer:
-> Automatic generation of a lexical analyzer is done using the LEX programming language.
-> The Lex specification file is denoted by the extension .l.

    Lex source program lex.l -> [Lex compiler] -> lex.yy.c
    lex.yy.c                 -> [C compiler]   -> a.out
    Input stream             -> [a.out]        -> sequence of tokens

-> The above diagram shows how a lexical analyzer is created with Lex.
-> An input file, which we call lex.l, is written in the Lex language and describes the lexical analyzer to be generated.
-> The Lex compiler transforms lex.l into a C program, in a file that is always named lex.yy.c.
-> After that, the file is compiled by the C compiler and converted into a file called a.out, as always.
-> When some input stream is given to a.out, a sequence of tokens gets generated. This means a.out is working as a lexical analyzer.

Structure of a Lex program: A Lex program consists of the following three parts:

    Part 1 (declarations)
    %%
    Part 2 (translation rules: definition of tokens)
    %%
    Part 3 (auxiliary procedures)

Explanation of the above:

1) Declaration section: In the declaration section, declarations of variables and constants can be made. Some regular expressions can also be written in this section.

2) Translation rules section: The rule section consists of regular expressions with the required actions. Each translation rule has the form:

    Pattern {Action}    or    R.E. {Action}
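The "pattern {action}" idea behind the translation rules can be imitated in plain Python to see how such a rule table drives a scanner. This is only a sketch of the idea, not Lex itself and not the code that Lex generates. The names DIGIT, LETTER, RULES and scan, and the chosen token classes, are assumptions made for this illustration. Like Lex, the sketch prefers the longest match and, on a tie, the rule listed first.

```python
import re

# Part 1, declarations: named regular definitions (assumed names).
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z_]"

# Part 2, translation rules: each entry is (pattern, action).
# An action of None means "match and discard", as for white space.
RULES = [
    (re.compile(r"\s+"), None),
    (re.compile(r"if|else|while|return"), lambda s: ("KEYWORD", s)),
    (re.compile(rf"{LETTER}(?:{LETTER}|{DIGIT})*"), lambda s: ("ID", s)),
    (re.compile(rf"{DIGIT}+"), lambda s: ("NUMBER", int(s))),
    (re.compile(r"[-+*/=<>]"), lambda s: ("OP", s)),
    (re.compile(r"[(){};,]"), lambda s: ("PUNCT", s)),
]


# Part 3, auxiliary procedure: the driver that applies the rules.
def scan(text):
    pos, tokens = 0, []
    while pos < len(text):
        best = None
        for pattern, action in RULES:
            match = pattern.match(text, pos)
            # Keep the longest match; a strict ">" keeps the earlier rule on a tie.
            if match and (best is None or match.end() > best[0].end()):
                best = (match, action)
        if best is None:
            raise SyntaxError(f"no rule matches at position {pos}")
        match, action = best
        pos = match.end()
        if action is not None:
            tokens.append(action(match.group()))
    return tokens


print(scan("if (x1 < 10) return iffy;"))
```

The word iffy is scanned as an identifier because the identifier rule gives a longer match than the keyword rule, while if on its own ties between the two rules and is resolved in favour of the keyword rule, which is listed first.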
