[15/15/15/15/25/10/15] Use the following code fragment:
Loop: LD    R1,0(R2)    ;load R1 from address 0+R2
      DADDI R1,R1,#1    ;R1=R1+1
      SD    R1,0(R2)    ;store R1 at address 0+R2
      DADDI R2,R2,#4    ;R2=R2+4
      DSUB  R4,R3,R2    ;R4=R3-R2
      BNEZ  R4,Loop     ;branch to Loop if R4!=0
Assume that the initial value of R3 is R2 + 396.
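As a quick check on the loop bound, the following Python sketch mimics only the register arithmetic that controls the loop: R2 advances by 4, R4 is R3 minus R2, and the branch is taken while R4 is nonzero. It assumes the fragment increments R2 by 4 per iteration, and the starting address 1000 is an arbitrary, hypothetical value; the count does not depend on it.

```python
def count_iterations(r2_start: int = 1000, offset: int = 396, step: int = 4) -> int:
    """Count how many times the loop body runs.

    r2_start is a hypothetical starting address chosen for illustration.
    """
    r2 = r2_start
    r3 = r2 + offset      # initial condition: R3 = R2 + 396
    iterations = 0
    while True:
        iterations += 1   # one pass through LD / DADDI / SD
        r2 += step        # DADDI R2,R2,#4
        r4 = r3 - r2      # DSUB  R4,R3,R2
        if r4 == 0:       # BNEZ falls through once R4 reaches 0
            break
    return iterations


print(count_iterations())
```

The body executes once for every 4-byte step between the initial R2 and R3.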
a. [15] Data hazards are caused by data dependences in the code.
Whether a dependency causes a hazard depends on the machine implementation
(i.e., number of pipeline stages). List all of the data dependences in the
code above. Record the register, source instruction, and destination
instruction; for example, there is a data dependency for register R1 from
the LD to the DADDI.
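One way to organize the search is to track, for each register, the most recent instruction that wrote it. The Python sketch below does this for a single pass through the loop body and reports read-after-write dependences only. It assumes the fragment reads LD R1,0(R2); DADDI R1,R1,#1; SD R1,0(R2); DADDI R2,R2,#4; DSUB R4,R3,R2; BNEZ R4,Loop. The labels DADDI#1 and DADDI#2 are hypothetical names used to tell the two DADDI instructions apart. Dependences carried from one iteration to the next (for example, on R2) are not reported and still need to be considered by hand.

```python
# Each entry: (label, destination register or None, source registers).
PROGRAM = [
    ("LD",      "R1", ["R2"]),
    ("DADDI#1", "R1", ["R1"]),
    ("SD",      None, ["R1", "R2"]),
    ("DADDI#2", "R2", ["R2"]),
    ("DSUB",    "R4", ["R3", "R2"]),
    ("BNEZ",    None, ["R4"]),
]


def raw_dependences(program):
    """Return (register, producer, consumer) triples for one pass of the code."""
    last_writer = {}  # register -> label of the latest instruction writing it
    deps = []
    for label, dest, sources in program:
        # Sources are checked before the destination is recorded, so an
        # instruction such as DADDI R1,R1,#1 depends on the earlier writer.
        for reg in sources:
            if reg in last_writer:
                deps.append((reg, last_writer[reg], label))
        if dest is not None:
            last_writer[dest] = label
    return deps


for dep in raw_dependences(PROGRAM):
    print(dep)
```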
b. [15] <C.2> Show the timing of this instruction sequence for the 5-stage
RISC pipeline without any forwarding or bypassing hardware but assuming that
a register read and a write in the same clock cycle "forwards" through the
register file, as shown in Figure C.6. Use a pipeline timing chart like that
in Figure C.5. Assume that the branch is handled by flushing the pipeline. If
all memory references take 1 cycle, how many cycles does this loop take to
execute?
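For readers who prefer to draw timing charts programmatically, the following Python sketch lays out one row per instruction and one column per clock cycle. The helper name render_chart, the instruction labels I1 and I2, and the placement of the stall are all hypothetical and purely illustrative; they are not the chart for the loop above.

```python
def render_chart(rows):
    """Build a text timing chart.

    rows: list of (label, first_cycle, cells), where cells holds the stage
    name (or "stall") occupied in each consecutive cycle from first_cycle.
    """
    last_cycle = max(first + len(cells) - 1 for _, first, cells in rows)
    width = max(len(label) for label, _, _ in rows)
    header = " " * width + "".join(f"{c:>6}" for c in range(1, last_cycle + 1))
    lines = [header]
    for label, first, cells in rows:
        padded = [""] * (first - 1) + list(cells)     # idle before fetch
        padded += [""] * (last_cycle - len(padded))   # idle after write-back
        lines.append(f"{label:<{width}}" + "".join(f"{cell:>6}" for cell in padded))
    return "\n".join(lines)


# Hypothetical two-instruction example with a single stall cycle.
example = [
    ("I1", 1, ["IF", "ID", "EX", "MEM", "WB"]),
    ("I2", 2, ["IF", "ID", "stall", "EX", "MEM", "WB"]),
]
print(render_chart(example))
```

The total cycle count for a sequence is the column in which the last instruction's final stage appears.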
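c. [15] Show the timing of this instruction sequence for the 5-stage RISC
pipeline with full forwarding and bypassing hardware. Use a pipeline timing
chart like that shown in Figure C.5. Assume that the branch is handled by
predicting it as not taken. If all memory references take 1 cycle, how many
cycles does this loop take to execute?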
d. [15] Show the timing of this instruction sequence for the 5-stage RISC
pipeline with full forwarding and bypassing hardware. Use a pipeline timing
chart like that shown in Figure C.5. Assume that the branch is handled by
predicting it as taken. If all memory references take 1 cycle, how many cycles
does this loop take to execute?
e. [25] High-performance processors have very deep pipelines (more
than 15 stages). Imagine that you have a 10-stage pipeline in which every stage
of the 5-stage pipeline has been split in two. The only catch is that, for data
forwarding, data are forwarded from the end of a pair of stages to the
beginning of the two stages where they are needed. For example, data are forwarded
from the output of the second execute stage to the input of the first execute
stage, still causing a 1-cycle delay. Show the timing of this instruction
sequence for the 10-stage RISC pipeline with full forwarding and bypassing
hardware. Use a pipeline timing chart like that shown in Figure C.5. Assume
that the branch is handled by predicting it as taken. If all memory references
take 1 cycle, how many cycles does this loop take to execute?
f. [10] Assume that in the 5-stage pipeline the longest stage requires
0.8 ns, and the pipeline register delay is 0.1 ns. What is the clock cycle time of
the 5-stage pipeline? If the 10-stage pipeline splits all stages in half, what is
the cycle time of the 10-stage machine?
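The relationship behind this part can be sketched in Python, assuming the clock period equals the logic delay of the slowest stage plus the pipeline register delay, and that splitting a stage divides its logic delay evenly while leaving the register delay unchanged. The function name and the 1.0 ns stage delay in the example are placeholders for illustration, not the exercise's figures; only the 0.1 ns register delay is taken from the problem statement.

```python
def cycle_time_ns(longest_stage_ns: float, register_delay_ns: float,
                  split: int = 1) -> float:
    """Clock period when every stage is divided into `split` equal pieces.

    Assumes the register overhead is paid once per (sub)stage.
    """
    return longest_stage_ns / split + register_delay_ns


# Placeholder stage delay of 1.0 ns (hypothetical), register delay 0.1 ns.
base = cycle_time_ns(1.0, 0.1)
deep = cycle_time_ns(1.0, 0.1, split=2)
print(f"unsplit: {base:.2f} ns, split in two: {deep:.2f} ns")
```

Because the register delay does not shrink, doubling the number of stages yields a clock period that is more than half of the original.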
g. [15] Using your answers from parts (d) and (e), determine the cycles
per instruction (CPI) for the loop on a 5-stage pipeline and a 10-stage
pipeline. Make sure you count only from when the first instruction reaches the
write-back stage to the end. Do not count the start-up of the first instruction.
Using the clock cycle time calculated in part (f), calculate the average
instruction execute time for each machine.
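The bookkeeping for this part can be sketched as follows. The sketch assumes one reading of the counting rule: the cycles before the first instruction reaches write-back (pipeline depth minus one) are excluded, and the write-back cycle itself is counted. The function names and every number in the example are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def steady_state_cpi(total_cycles: int, instructions: int,
                     pipeline_depth: int) -> float:
    """CPI counted from the first instruction's write-back cycle to the end."""
    startup_cycles = pipeline_depth - 1   # cycles before the first write-back
    return (total_cycles - startup_cycles) / instructions


def average_instruction_time_ns(cpi: float, cycle_time_ns: float) -> float:
    """Average time per instruction = CPI x clock cycle time."""
    return cpi * cycle_time_ns


# Hypothetical figures: 124 cycles, 60 instructions, 5 stages, 1.1 ns clock.
cpi = steady_state_cpi(total_cycles=124, instructions=60, pipeline_depth=5)
print(cpi, round(average_instruction_time_ns(cpi, 1.1), 2))
```

A deeper pipeline typically shows a higher CPI but a shorter clock period, so the two machines have to be compared on the product of the two.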
[15/15] Suppose the branch frequencies (as percentages of all instructions)
are as follows:
    Conditional branches          13%
    Jumps and calls               1%
    Taken conditional branches    60% are taken
a. [15] We are examining a four-deep pipeline where the branch is
resolved at the end of the second cycle for unconditional branches and at the
end of the third cycle for conditional branches. Assuming that only the first
pipe stage can always be done independent of whether the branch goes and
ignoring other pipeline stalls, how much faster would the machine be without
any branch hazards?
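The comparison asked for here reduces to a ratio of CPIs. The Python sketch below assumes an ideal CPI of 1 and that branch stalls are the only source of lost cycles, so the speedup of the hazard-free machine is 1 plus the frequency-weighted stall cycles. The function name and the frequencies and stall counts in the example are hypothetical; the stall count for each branch class in the exercise has to be derived from the cycle in which that class is resolved.

```python
def speedup_without_branch_hazards(branch_classes):
    """Ratio of CPI with branch stalls to the ideal CPI of 1.

    branch_classes: iterable of (fraction_of_all_instructions, stall_cycles).
    """
    ideal_cpi = 1.0
    stall_cpi = sum(fraction * stalls for fraction, stalls in branch_classes)
    return (ideal_cpi + stall_cpi) / ideal_cpi


# Hypothetical mix: 10% of instructions stall 1 cycle, 5% stall 3 cycles.
print(round(speedup_without_branch_hazards([(0.10, 1), (0.05, 3)]), 2))
```

Taken and not-taken conditional branches can be entered as separate classes when they incur different stall counts.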
b. [15] Now assume a high-performance processor in which we have a
15-deep pipeline where the branch is resolved at the end of the fifth cycle for
unconditional branches and at the end of the tenth cycle for conditional
branches. Assuming that only the first pipe stage can always be done
independent of whether the branch goes and ignoring other pipeline stalls, how
much faster would the machine be without any branch hazards?