
A Comparative Report Between EPIC and VLIW Architecture

INTRODUCTION :
Very Long Instruction Word (VLIW) processors [2, 3] are examples of architectures for which the program provides explicit information regarding parallelism. The compiler identifies the parallelism in the program and communicates it to the hardware by specifying which operations are independent of one another. This information is of direct value to the hardware, since it knows with no further checking which operations it can start executing in the same cycle.

In this report, we introduce the Explicitly Parallel Instruction Computing (EPIC) style of architecture, an evolution of VLIW which has absorbed many of the best ideas of superscalar processors, albeit in a form adapted to the EPIC philosophy. EPIC is not so much an architecture as it is a philosophy of how to build ILP processors, along with a set of architectural features that support this philosophy. In this sense EPIC is like RISC: it denotes a class of architectures, all of which subscribe to a common architectural philosophy. Just as there are many distinct RISC architectures (Hewlett-Packard's PA-RISC, Silicon Graphics' MIPS and Sun's SPARC), there can be more than one instruction set architecture (ISA) within the EPIC fold.

To increase throughput, a designer could increase the clock rate of the architecture, or increase the average instruction-level parallelism (ILP) of the architecture. Modern processor design has focused on executing more instructions in a given number of clock cycles, that is, increasing ILP. A number of techniques may be used. One technique, pipelining, is particularly popular because it is relatively simple, and can be used in conjunction with superscalar and VLIW techniques. All modern CPU architectures are pipelined.

B. Pipelining
All instructions are executed in multiple stages. For example, a simple processor may have five stages: first the instruction must be fetched from cache, then it must be decoded, the instruction must be executed, and any memory referenced by the instruction must be loaded or stored.

A. Instruction-level parallelism
A common design goal for general-purpose processors is to maximize throughput, which may be defined broadly as the amount of work performed in a given time. Average processor throughput is a function of two variables: the average number of clock cycles required to execute an instruction, and the frequency of clock cycles. To increase throughput, then, a designer must improve at least one of these two variables.
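The relationship between these two variables can be sketched as a one-line calculation. The function name `throughput_ips` and the example figures (1 GHz, 2 cycles per instruction) are illustrative assumptions, not measurements of any real processor.

```python
def throughput_ips(clock_hz: float, cycles_per_instruction: float) -> float:
    """Average instructions completed per second.

    Throughput rises if the clock frequency rises or if the average
    number of cycles per instruction (CPI) falls.
    """
    if clock_hz <= 0 or cycles_per_instruction <= 0:
        raise ValueError("clock rate and CPI must be positive")
    return clock_hz / cycles_per_instruction


# Hypothetical baseline: 1 GHz clock, 2 cycles per instruction on average.
baseline = throughput_ips(1.0e9, 2.0)
# Option 1: double the clock rate, keep CPI unchanged.
faster_clock = throughput_ips(2.0e9, 2.0)
# Option 2: keep the clock, halve CPI by exploiting more ILP.
more_ilp = throughput_ips(1.0e9, 1.0)
```

Under these assumed numbers both options double throughput, which is why a designer can treat clock rate and ILP as two separate levers.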

Finally the result of the instruction is stored in registers. The output from one stage serves as the input to the next stage, forming a pipeline of instruction implementation. These stages are frequently independent of each other, so, if separate hardware is used to perform each stage, multiple instructions may be "in flight" at once, with each instruction at a different stage in the pipeline. Ignoring potential problems, the theoretical increase in speed is proportional to the length of the pipeline: longer pipelines mean more simultaneous in-flight instructions and therefore fewer average cycles per instruction. The major potential problem with pipelining is the potential for hazards. A hazard occurs when an instruction in the pipeline cannot be executed. Hennessy and Patterson identify three types of hazards:

structural hazards, where there simply isn't sufficient hardware to execute all parallelizable instructions at once; data hazards, where an instruction depends on the result of a previous instruction; and control hazards, which arise from instructions which change the program counter (i.e., branch instructions). Various techniques exist for managing hazards. The simplest of these is simply to stall the pipeline until the instruction causing the hazard has completed.
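The cost of stalling on a data hazard can be made concrete with a toy model. The sketch below is not a description of any real processor: it assumes one instruction issues per cycle, that only data hazards occur, and that a result becomes readable a fixed, hypothetical `result_latency` cycles after its producer issues. The names `Instr`, `ideal_cycles` and `cycles_with_stalls` are invented for illustration.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    dest: Optional[str]          # register written, or None
    srcs: Tuple[str, ...] = ()   # registers read


def ideal_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles for a hazard-free pipeline: fill once, then one result per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def cycles_with_stalls(program: List[Instr], n_stages: int = 5,
                       result_latency: int = 3) -> int:
    """Cycles when the pipeline stalls until every source register is ready."""
    if not program:
        return 0
    ready: Dict[str, int] = {}   # register -> first cycle a consumer may issue
    issue = -1                   # issue cycle of the previous instruction
    for instr in program:
        earliest = issue + 1     # in-order issue, at most one per cycle
        for src in instr.srcs:
            earliest = max(earliest, ready.get(src, 0))
        issue = earliest         # any gap here is a stall
        if instr.dest is not None:
            ready[instr.dest] = issue + result_latency
    return issue + n_stages      # the last instruction still drains the pipe


# The second instruction reads r1 straight after it is written: a data hazard.
dependent = [Instr("r1"), Instr("r2", ("r1", "r1")), Instr("r3", ("r4",))]
# Same shape, but no instruction reads a freshly written register.
independent = [Instr("r1"), Instr("r2", ("r5",)), Instr("r3", ("r4",))]
```

With the assumed latency of 3, the dependent sequence needs two stall cycles beyond the hazard-free count, while the independent sequence matches `ideal_cycles(3, 5)`.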

VLIW and superscalar approach the ILP problem differently. The key difference between the two is where instruction scheduling is performed: in a superscalar architecture, scheduling is performed in hardware (and is called dynamic scheduling, because the schedule of a given piece of code may differ depending on the code path followed), whereas in a VLIW scheduling is performed in software (static scheduling, because the schedule is "built in to the binary" by the compiler or assembly language programmer).
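To illustrate what "built in to the binary" means, here is a minimal sketch of a static scheduler of the kind a VLIW compiler might run. It is a simplified greedy packer, not the algorithm of any particular compiler, and it assumes every operation has unit latency, every register is written only once, and any operation may use any slot. The function name `schedule_vliw` and the operation names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Op = Tuple[str, str, Tuple[str, ...]]  # (name, destination register, source registers)


def schedule_vliw(ops: Sequence[Op], width: int) -> List[List[str]]:
    """Pack operations into fixed-width instruction words at compile time.

    An operation is placed in the earliest word that comes after the words
    producing its sources and that still has a free slot. Unused slots are
    filled with explicit "nop" entries, as a VLIW encoding requires.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    words: List[List[str]] = []
    produced_in: Dict[str, int] = {}   # register -> index of the word that writes it
    for name, dest, srcs in ops:
        earliest = 0
        for src in srcs:
            if src in produced_in:
                earliest = max(earliest, produced_in[src] + 1)
        w = earliest
        while w < len(words) and len(words[w]) >= width:
            w += 1                      # word is full, try the next one
        while len(words) <= w:
            words.append([])
        words[w].append(name)
        produced_in[dest] = w
    return [word + ["nop"] * (width - len(word)) for word in words]


ops = [
    ("a", "r1", ()),
    ("b", "r2", ()),
    ("c", "r3", ("r1", "r2")),   # needs a and b
    ("d", "r4", ()),
    ("e", "r5", ("r3", "r4")),   # needs c and d
]
two_wide = schedule_vliw(ops, width=2)
one_wide = schedule_vliw(ops, width=1)
```

The schedule is fixed once `schedule_vliw` returns; a superscalar processor would instead make the equivalent decisions in hardware every time the code runs.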

VLIW :

B. Superscalar
Usually, the execution phase of the pipeline takes the longest. On modern hardware, the execution of the instruction may be performed by one of a number of functional units. For example, integer instructions may be executed by the ALU, whereas floating-point operations are performed by the FPU. On a traditional, scalar pipelined architecture, either one or the other of these units will always be idle, depending on the instruction being executed. On a superscalar architecture, instructions may be executed in parallel on multiple functional units. The pipeline is essentially split after instruction issue.

[Figure: VLIW architecture]

C. Interlocking
Another architecture feature present in some RISC and VLIW architectures but never in superscalars is lack of interlocks. In a pipelined processor, it is important to ensure that a stall somewhere in the pipeline won't result in the machine performing incorrectly. This could happen if later stages of the pipeline do not detect the stall, and thus proceed as if the stalled stage had completed. To prevent this, most architectures incorporate interlocks on the pipeline stages. Removing interlocks from the architecture is beneficial, because they complicate the design and can take time to set up, lowering the overall clock rate. However, doing so means that the compiler (or assembly-language programmer) must know details about the timing of pipeline stages for each instruction in the processor, and insert NOPs into the code to ensure correctness. This makes code incredibly hardware-specific. Both the architectures studied in detail below are fully interlocked, though Sun's ill-fated MAJC architecture was not, and relied on fast, universal JIT compilation to solve the hardware problems.
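A small sketch shows why NOP insertion ties code to one implementation. It assumes a single-issue machine without interlocks and two hypothetical latency tables, `machine_a` and `machine_b`; the function `insert_nops` and the operation names are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Op = Tuple[str, str, Tuple[str, ...]]  # (opcode, destination register, source registers)


def insert_nops(program: Sequence[Op], latency: Dict[str, int]) -> List[str]:
    """Pad a program with NOPs so that no operation reads a register too early.

    With no hardware interlocks, the compiler itself must guarantee that a
    consumer is issued at least latency[producer] slots after its producer.
    """
    out: List[str] = []
    ready: Dict[str, int] = {}   # register -> first slot at which it may be read
    for opcode, dest, srcs in program:
        need = max((ready.get(src, 0) for src in srcs), default=0)
        while len(out) < need:
            out.append("nop")    # software stand-in for a hardware stall
        ready[dest] = len(out) + latency[opcode]
        out.append(opcode)
    return out


program = [
    ("load", "r1", ()),
    ("add", "r2", ("r1",)),
    ("mul", "r3", ("r2",)),
]
machine_a = {"load": 3, "add": 1, "mul": 2}   # assumed timings, machine A
machine_b = {"load": 2, "add": 1, "mul": 2}   # assumed timings, machine B
code_a = insert_nops(program, machine_a)
code_b = insert_nops(program, machine_b)
```

The two outputs differ in the number of NOPs after the load, so a binary padded for one machine would be either incorrect or needlessly slow on the other.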

All this additional hardware is complex, and contributes to the transistor count of the processor. All other things being equal, more transistors equals more power consumption, more heat, and less on-die space for cache. Thus it seems beneficial to expose more of the architecture's parallelism to the programmer. This way, not only is the architecture simplified, but programmers have more control over the hardware, and can take better advantage of it.

VLIW is an architecture designed to help software designers extract more parallelism from their software than would be possible using a traditional RISC design. It is an alternative to better-known superscalar architectures. VLIW is a lot simpler than superscalar designs, but has not so far been commercially successful. The figure shows a typical VLIW architecture. Note the simplified instruction decode and dispatch.

A. ILP in VLIW

Intel Itanium (EPIC):

Itanium is based around the Explicitly Parallel Instruction Computing (EPIC) architecture, a fairly recent architecture that emerged, circa 1997, from Hewlett-Packard's PlayDoh research architecture. The EPIC architecture is based on VLIW, but was designed to overcome the key limitations of VLIW (in particular, hardware dependence) while simultaneously giving more flexibility to compiler writers. So far the only implementation of this architecture is as part of the IA-64 processor architecture in the Itanium family of processors.

A. Instruction bundles
The major problem addressed by EPIC is hardware dependence. VLIW is designed around the concept that the limits of a processor's parallelism are expressed by a single instruction word. Thus, processors capable of a greater degree of parallelism require a different instruction set. EPIC's solution to this problem is to define several reasonably abstract categories of mini-instructions, such as ALU operations, floating-point operations, and branches. Mini-instructions are combined in groups of three into a bundle. In addition to three 41-bit mini-instructions, bundles contain a 5-bit template type, for a total bundle size of 128 bits.
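The bundle arithmetic (3 × 41 + 5 = 128 bits) can be checked with a short packing sketch. The field sizes come from the text above; the bit positions used here (template in the five lowest bits, followed by the three slots in order) are an assumption made for illustration, and `pack_bundle` and `unpack_bundle` are hypothetical helper names.

```python
from typing import Sequence, Tuple

SLOT_BITS = 41        # size of one mini-instruction
TEMPLATE_BITS = 5     # size of the template field
SLOTS_PER_BUNDLE = 3
BUNDLE_BITS = TEMPLATE_BITS + SLOTS_PER_BUNDLE * SLOT_BITS   # 128


def pack_bundle(template: int, slots: Sequence[int]) -> int:
    """Combine a template and three mini-instructions into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly three mini-instructions")
    value = template
    for index, slot in enumerate(slots):
        if not 0 <= slot < (1 << SLOT_BITS):
            raise ValueError("each mini-instruction must fit in 41 bits")
        value |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return value


def unpack_bundle(bundle: int) -> Tuple[int, Tuple[int, ...]]:
    """Split a 128-bit bundle back into its template and three mini-instructions."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slot_mask = (1 << SLOT_BITS) - 1
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & slot_mask
        for index in range(SLOTS_PER_BUNDLE)
    )
    return template, slots


# Arbitrary placeholder encodings, not real IA-64 instructions.
example = pack_bundle(1, [0x123, 0x456, 0x789])
largest = pack_bundle(31, [(1 << SLOT_BITS) - 1] * 3)
```

Because every bundle is exactly 16 bytes, the fetch hardware can locate bundle boundaries without decoding anything first.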

The first four templates all contain memory (M) and integer ALU (I) mini-instructions; higher-numbered templates contain different types of mini-instructions. Template 0 contains one memory and two integer instructions and no stops, meaning that a sequence of template-0 bundles will be executed in parallel as much as the hardware is capable, whereas template 1 contains a stop after the second integer instruction but is otherwise identical. The hardware ensures that all instructions before the stop have been retired before executing instructions after the stop. To put it another way, compilers simply target a theoretical processor with support for an infinite amount of parallelism (or, at least, a register-limited amount of parallelism), and the implementation performs as much as it can.

For example, all current Itaniums issue two bundles at a time through a technique known as dispersal. The first part of a bundle is issued, and then the bundle is logically shifted so that the next part of the bundle is available for execution. If the mini-instruction cannot be executed, split issue occurs. The bundle continues to occupy its bundle slot, and another bundle is loaded to occupy the next slot. Since some instructions from the bundle have been executed, leaving them in the bundle slot reduces parallelism. EPIC trades this performance decrease against the relatively simple hardware required to implement dispersal.
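The effect of stops can be sketched by splitting a stream of bundles into groups of instructions that are allowed to execute in parallel. Templates 0 and 1 below follow the description above; the stop positions given for templates 2 and 3 (a stop after the first integer instruction) reflect my understanding of the IA-64 encoding and should be treated as an assumption. The table `TEMPLATES`, the function `instruction_groups` and the instruction names are illustrative.

```python
from typing import Dict, List, Sequence, Tuple

# template number -> (slot types, slot indices that are followed by a stop)
TEMPLATES: Dict[int, Tuple[Tuple[str, str, str], Tuple[int, ...]]] = {
    0: (("M", "I", "I"), ()),       # no stop
    1: (("M", "I", "I"), (2,)),     # stop after the second integer instruction
    2: (("M", "I", "I"), (1,)),     # assumed: stop after the first integer instruction
    3: (("M", "I", "I"), (1, 2)),   # assumed: stops after both integer instructions
}


def instruction_groups(
    bundles: Sequence[Tuple[int, Sequence[str]]]
) -> List[List[str]]:
    """Split a bundle stream into groups separated by stops.

    Instructions within one group may run in parallel; a group must be
    complete before the next group starts. Groups can span bundles.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    for template, ops in bundles:
        _, stops = TEMPLATES[template]
        for slot, op in enumerate(ops):
            current.append(op)
            if slot in stops:
                groups.append(current)
                current = []
    if current:
        groups.append(current)
    return groups


stream = [
    (0, ["ld1", "add1", "add2"]),
    (1, ["ld2", "add3", "add4"]),
    (2, ["ld3", "add5", "add6"]),
]
groups = instruction_groups(stream)
```

The first group spans two whole bundles, which illustrates that the unit of parallelism is the region between stops, not the bundle itself.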

C. Problems with EPIC


[Figure: general format of an EPIC bundle]

Despite the advantages of EPIC over VLIW, IA-64 does not solve all of VLIW's problems. The foremost problem is program size: it is not always possible to completely fill all slots in a bundle, and empty slots are filled with NOPs (IA-64 does not perform compression further than that offered by bundle templates). As discussed above, increases in code size negatively impact cache performance and result in more bus traffic. Itanium compensates for this by using large, fast caches on-die. Cache is relatively easy to add to Itanium, because the lack of hardware dedicated to speculation, dynamic scheduling and the like results in a small core size. However, cache increases die size and power consumption, though cache consumes far less power than core logic.

Another problem common to VLIWs in general is the importance of compiler optimization. Poor compiler support can significantly impact the performance of EPIC code. Historically this has been a problem for Itanium, but should improve in the future as compiler support improves.
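The code-size cost of NOP padding can be estimated with a short calculation. This is a sketch under the bundle size stated earlier (128 bits, three slots); the function `padding_stats` and the sample bundles are made up for illustration and do not come from measured Itanium code.

```python
from typing import List, Sequence, Tuple

BUNDLE_BYTES = 16        # 128 bits per bundle
SLOTS_PER_BUNDLE = 3


def padding_stats(bundles: Sequence[Sequence[str]]) -> Tuple[float, float]:
    """Return (fraction of slots holding NOPs, bytes of code per useful instruction)."""
    for bundle in bundles:
        if len(bundle) != SLOTS_PER_BUNDLE:
            raise ValueError("every bundle must have exactly three slots")
    total_slots = len(bundles) * SLOTS_PER_BUNDLE
    if total_slots == 0:
        return 0.0, 0.0
    nops = sum(op == "nop" for bundle in bundles for op in bundle)
    useful = total_slots - nops
    code_bytes = len(bundles) * BUNDLE_BYTES
    bytes_per_useful = code_bytes / useful if useful else float("inf")
    return nops / total_slots, bytes_per_useful


# Hypothetical compiler output in which half of the slots could not be filled.
sample: List[List[str]] = [
    ["ld", "add", "nop"],
    ["ld", "nop", "nop"],
]
nop_fraction, bytes_per_instruction = padding_stats(sample)
```

Fully packed bundles cost 16/3 bytes per instruction, so in this made-up sample the padding doubles the bytes fetched per useful instruction, which is the pressure on caches and bus traffic that the text describes.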

B. ILP in EPIC
Crucially, the length of a bundle does not define the limits of parallelism; the bundle template type indicates, by the presence or absence of a stop, whether instructions following the bundle can execute in parallel with instructions in the bundle. The claim of EPIC is that as the processor family evolves, processors with greater support for parallelism will simply issue more bundles simultaneously.

[Table: The first four EPIC bundle templates. Stops are indicated by asterisks.]

The table illustrates the first four EPIC bundle templates, of the 32 available.

Conclusions:

The EPIC architecture improves on VLIW: it is based on VLIW, but was designed to overcome VLIW's key limitation, its hardware dependence. So far the only implementation of this architecture is as part of the IA-64 processor architecture in the Itanium family of processors. VLIW has yet to see significant commercial success in general-purpose computers. One reason for this is backwards-compatibility issues, which newer architectures, such as EPIC, are starting to address. Another potential problem facing VLIWs is the widening gap between CPU performance and memory bandwidth. Perhaps updates to the EPIC architecture, or some future VLIW-based architecture, will see optional support for some form of instruction compression, to reduce Itanium's reliance on caching for performance.

References:
http://en.wikipedia.org/wiki/Explicitly_parallel_instruction_computing_EPIC
http://en.wikipedia.org/wiki/Very_long_instruction_word_VLIW
http://www.cse.unsw.edu.a/_daisy/
www.google.com
IA-64 Application Developer's Architecture Guide (Intel Corporation, 1999)
