454 Chapter 6 Enhancing Performance with Pipelining

[...] the system will become the bottleneck. That bottleneck [...] the instruction level is trying multiprocessors [...] exploit parallelism at much coarser levels. Parallel processing is the topic of Chapter 9, which appears on [...].

Historical Perspective and Further Reading

[...] the earliest superscalar, the development of out-of-order and speculative [...]

6.14 Exercises

6.1 [5] [...] If the time for an ALU operation can be shortened by 2[...] (compared to the description in Figure 6.2 on page 373):
a. Will it affect the speedup obtained from pipelining? [...] how much? Otherwise, why?
b. What if the ALU operation now takes 25% more time?

6.2 [10] <§6.1> A computer architect needs to design the pipeline of a new microprocessor [...] Assume it is perfectly pipelined. How much speedup will it achieve compared [...]

6.3 [...] Using a drawing [...]

[...] <§§6.4, 6.5> [...] Practice: Forwarding in Mem[...]

[...] paths required and hazard cases that must be detected, [...] of the two operands. The number of cases should equal [...] the hazard if no forwarding existed.

6.36 [...] <§6.6> We have a program core consisting of [...] conditional branches. The program core will be executed thousands of times.
Below are the outcomes of each branch for one execution of the program core (T for taken, N for not taken).

Branch 1: [...]
Branch 2: N-N-N-N
Branch 3: T-N-T-N-T-N
Branch 4: T-T-T-N-[...]
Branch 5: T-T-N-[...]-T-N[...]

Assume the behavior of each branch remains the same for each program core execution. For dynamic schemes, assume each branch has its own prediction buffer and each buffer is initialized to the same state before each execution. [...] the predictions for the following branch prediction schemes:
[...]
[...] predictor, initialized to predict taken
[...] predictor, initialized to weakly predict taken
What are the prediction accuracies?

6.37 [10] <§§6.4-6.6> Sketch all the forwarding paths for [...] when they must be enabled (as we did on [...]).

6.38 [...] <§§[...]-6.6> Write the logic to [...] hazards on [...] source [...] (as we did on page 410).

6.39 [10] <§§6.[...]> The example [...] shows how [...] performance on our pipelined datapath with forwarding and stalls on [...]. Rewrite the following code to [...]ize performance on the [...]

6.40 [20] <§6.6> Consider the pipelined datapath in Figure 6.54 on page 4[...]. [...] attempt to flush and an attempt to stall occur simultaneously? If so, do [...] actions? If there are any cooperating [...]? Is there a simple change you can make to the datapath to ensure [...] priority? You may want to consider the following code sequence to help [...]

[...] for implementing forwarding in Figure 6.7-5 did not consider forwarding of a result as the value to be stored by [...] instruction. Add this to the Verilog code.

[...] not consider forwarding of a [...]. Make this simple addition to the Verilog code.

6.43 [15] <§§6.6, 6.7> The Verilog code for implementing branch hazard detection and stalls in Figure 6.7.3 on [...] does not detect the possibility of [...] all data hazards for branch opera[...] the forwarding and stall logic needed for completin[...]

6.44 [10] <§§6.6, 6.7> Rewrite the Verilog code in 6.7.3 on page 6.[...] to implement a delayed branch st[...]

6.45 [20] <§§6.6, 6.7> Rewrite the Verilog code in [...] on page 6.7-6[...] to implement a branch target buffer.
Assume the buffer is implemented with a module with the following definition:

[...] edict [...] rrentPc [...] date desti[...]

Make sure you accommodate all three possibilities: a correct prediction, a miss in the buffer (that is, miss = true), and an incorrect prediction. In the last two cases, you must also update the prediction.

6.46 [[...] month] <§§[...]4, 6.3-6.8> If you have [...] a simulation system such as Verilog or ViewLogic, first design [...] datapath [...] Chapter 5. Then evolve this design into a pipelined organization, [...] sure to run MIPS [...] each st[...] continues to operate correctly.

6.47 [...] <§6.9> The following code has been unrolled [...] not yet scheduled. [...] is a multiple of two (i.e., [...] a multiple of eight):

[...] $30, L[...]

Schedule this code for fast execution on the standard MIPS [...] (assume it supports [...] instruction). Assume initial[...] branches are resolved in the MEM stage. How does the schedule[...] against the original unscheduled code?

6.48 [...] <§6.9> This exercise is similar to Exercise 6.4[...] unrolled twice (creating th[...]). However, it is not known that the loop index is a multiple of three, and [...] need [...] invent a means of ensuring that the [...] execu[...] adding some code to the beginning or end of the loop that tak[...] not handled by the loop.

6.49 [2[...]] <[...]> Using the code in Exercise 6.47, unroll the code [...] schedule it for the static multiple-issue version of the MIPS processor described on pages 436-439. You may assume that the [...]

6.50 [10] [...] technology le[...] feature size [...] become rel[...]wer (as compared to the logic). As logic be[...]ster [...] the shrinking feature size and [...] increase [...] delays consum[...] cycles. That is why the Pentium [...] pipeline stages dedicated to transferring data along wires from one part of the pipeline to another. What are the drawbacks to having to add pipe stages for wir[...]

6.51 [30] <§6.10> [...] processors are intro[...] [...]sions of textbooks. To keep your textbook current, [...] developments in this area and write a one-page elab[...] 6.10. Use the World Wide Web to exp[...]

Answers to Check Yourself

[...]: 1. Stall on the LW result. 2. Bypass the ADD result. 3. N[...]

[...] page 6.7-[...]: Statements 1 [...] both tr[...] only statemen[...] completel[...] 2 [...] partly accurat[...]

[...] predication: software; [...] prediction: b[...] software; superscalar: hardware; EPIC: both, since there is substant[...] supports multiple [...]; [...]namic scheduling: hard[...]
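
The dynamic prediction schemes in Exercise 6.36 can be explored with a small simulation. The sketch below is illustrative only: it assumes the two predictors in the exercise are the usual 1-bit (last-outcome) predictor and the 2-bit saturating counter, and the function names and the 0-3 state encoding (2 meaning "weakly taken") are choices made here, not taken from the text. It uses only the two outcome patterns that are fully legible above (Branch 2 and Branch 3) and measures accuracy over one execution of the program core, starting from the stated initial state.

```python
def one_bit_accuracy(outcomes, predict_taken=True):
    """Fraction of correct predictions for a 1-bit (last-outcome) predictor.

    outcomes: sequence of "T"/"N" strings for one branch.
    predict_taken: initial prediction (assumed: True = predict taken).
    """
    state = predict_taken
    correct = 0
    for outcome in outcomes:
        taken = outcome == "T"
        if state == taken:
            correct += 1
        state = taken  # a 1-bit predictor simply remembers the last outcome
    return correct / len(outcomes)


def two_bit_accuracy(outcomes, state=2):
    """Fraction of correct predictions for a 2-bit saturating counter.

    Assumed encoding: 0 = strongly not taken, 1 = weakly not taken,
    2 = weakly taken, 3 = strongly taken. Default start: weakly taken.
    """
    correct = 0
    for outcome in outcomes:
        taken = outcome == "T"
        if (state >= 2) == taken:
            correct += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct / len(outcomes)


if __name__ == "__main__":
    # Only the patterns that are fully legible in the exercise text.
    branches = {
        "Branch 2": "N-N-N-N".split("-"),
        "Branch 3": "T-N-T-N-T-N".split("-"),
    }
    for name, outcomes in branches.items():
        print(name,
              "1-bit:", round(one_bit_accuracy(outcomes), 3),
              "2-bit:", round(two_bit_accuracy(outcomes), 3))
```

Under these assumptions, an alternating branch such as Branch 3 shows why the extra bit helps: the 1-bit predictor flips after every outcome and keeps mispredicting, while the 2-bit counter needs two consecutive misses before it changes its prediction.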
