
HW #5 – Due 13/10

1) A computer architect needs to design the pipeline for a new microprocessor.


They have an example workload program with 106 instructions. Each
instruction takes 140 ps from start to finish.
a) Execution time = # instructions * cycle time
10,000,000 instructions * 140 ps per instruction = 1.4E-3 sec
b) Perfect pipelining assumes we can issue one instruction immediately after
another. After 10,000,000 cycles of issuing instructions, we have to wait 13
cycles before the last instruction exits the pipeline completely (drain time).
Perfect pipelining also assumes we take the 140 ps required per instruction
earlier and split it up into 14 even stages. Thus, the cycle time is now 10 ps.
Execution time = ((CPI * # of instructions) + drain time) * cycle time

((1.0 * 10,000,000) + 13) * 10 ps ≈ 1.0E-4 sec


Speedup = execution time A / execution time B = 1.4E-3 sec / 1.0E-4 sec ≈ 14x faster
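The arithmetic in parts a) and b) can be checked with a short Python sketch. It uses the instruction count from the solution (10,000,000) and the ideal assumptions stated above (14 equal stages, CPI of 1.0, 13 drain cycles). The function and variable names are illustrative only.

```python
# Values taken from the solution text; names are illustrative assumptions.
INSTRUCTIONS = 10_000_000
LATENCY_PS = 140   # time for one instruction without pipelining
STAGES = 14        # ideal pipeline depth
CPI = 1.0          # ideal cycles per instruction


def unpipelined_time_ps(n, latency_ps):
    """Execution time = # instructions * time per instruction."""
    return n * latency_ps


def pipelined_time_ps(n, latency_ps, stages, cpi=1.0):
    """Execution time = ((CPI * # instructions) + drain time) * cycle time."""
    cycle_ps = latency_ps / stages   # perfect split into equal stages
    drain_cycles = stages - 1        # cycles for the last instruction to exit
    return (cpi * n + drain_cycles) * cycle_ps


time_a = unpipelined_time_ps(INSTRUCTIONS, LATENCY_PS)
time_b = pipelined_time_ps(INSTRUCTIONS, LATENCY_PS, STAGES, CPI)
speedup = time_a / time_b            # unpipelined time over pipelined time

print(f"A: {time_a * 1e-12:.3e} s")
print(f"B: {time_b * 1e-12:.3e} s")
print(f"speedup: {speedup:.4f}")
```

The speedup comes out slightly under 14 because of the 13 drain cycles, which are negligible next to 10,000,000 instructions.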
c) Instruction latency might be affected since it is unlikely we will be able to
split the 140 ps original cycle time into 14 equal size chunks. The cycle
time will probably be greater than 10 ps.
Instruction throughput might be affected since a CPI of 1.0 is very
optimistic. It is likely that some no-ops will need to be inserted because the
compiler will not be able to fill all delay slots and maintain correct behavior.
2) Identify all of the data dependencies in the following code. Which
dependencies are data hazards that will be resolved via forwarding? Which
dependencies are data hazards that will cause a stall?
add $3, $3, $2
sub $5, $3, $1
lw $6, 200($3)
add $7, $3, $6
ADD $3, $4, $2
SUB $5, $3, $1
LW $6, 200($3)
ADD $7, $3, $6

A) The SUB requires $3 from the first ADD – Read after Write
B) The LW requires $3 from the first ADD – Read after Write
C) The second ADD requires $3 from the first ADD – Read after Write
D) The second ADD also requires $6 from the LW – Read after Write
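
The four dependencies listed above can be found mechanically by remembering the most recent writer of each register. The following is a minimal sketch, not part of the original solution. The tuple layout (name, destination, sources) and the function name are assumptions made for illustration, and the operands follow the restated code in the solution.

```python
# Each instruction: (name, destination register, source registers).
# This encoding is an assumption made for illustration.
PROGRAM = [
    ("ADD", "$3", ["$4", "$2"]),
    ("SUB", "$5", ["$3", "$1"]),
    ("LW",  "$6", ["$3"]),
    ("ADD", "$7", ["$3", "$6"]),
]


def raw_dependencies(program):
    """Return (producer_index, consumer_index, register) for each RAW dependency."""
    deps = []
    last_writer = {}  # register -> index of the most recent instruction writing it
    for i, (_, dest, sources) in enumerate(program):
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        last_writer[dest] = i  # record the write only after checking the reads
    return deps


for producer, consumer, reg in raw_dependencies(PROGRAM):
    print(f"{PROGRAM[consumer][0]} (instr {consumer}) reads {reg} "
          f"written by {PROGRAM[producer][0]} (instr {producer})")
```

The sketch reports exactly the dependencies A) through D): three reads of $3 produced by the first ADD and one read of $6 produced by the LW.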

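Cycle:  1    2    3    4    5    6    7    8    9
ADD     IF   DEC  EX   MEM  WB
SUB          IF   DEC  EX   MEM  WB
LW                IF   DEC  EX   MEM  WB
NOOP                   IF   DEC  EX   MEM  WB
ADD                         IF   DEC  EX   MEM  WB

(Arrows in the original figure: $3 is forwarded from the first ADD to the SUB and to the LW; $6 is forwarded from the LW to the second ADD.)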

A) Can be handled by forwarding the result from the ALU of the EXEC stage of
the ADD instruction to the end of the DEC stage of the SUB instruction.
B) Can be handled by forwarding the result of the ALU passed on to the MEM
stage of the ADD instruction to the end of the DEC stage of the LW instruction.
C) This does not need to be forwarded. Even if D did not require the insertion of a
NOOP, our careful clocking of the register file allows us to write and read back
a value during the same clock cycle. Thus, we don’t need to worry about
dependencies that are more than 2 instructions later.
D) This dependency requires us to insert a NOOP. The grey arrow shows that
since we do not have the value from memory until the end of the MEM stage of
the LW, we do not get the value early enough to forward into the DEC stage of
the following instruction.
Even adding forwarding hardware into the EX stage would not help us since we
do not know the result until the end of the MEM stage. Adding forwarding
capabilities to the EX stage would require a longer cycle time to allow us
enough time to get the forwarded value and then perform the addition.
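The reasoning in A) through D) can be summarized as a small timing check. This is a sketch under the assumptions stated in the answers: values are forwarded to the end of the consumer's DEC stage, an ALU result exists at the end of EX, a load result exists at the end of MEM, and the register file can be written and read in the same cycle. The function name and return format are hypothetical.

```python
# Stage offsets relative to the cycle in which an instruction is fetched.
DEC, EX, MEM, WB = 1, 2, 3, 4


def classify(producer_is_load, distance):
    """Classify a RAW dependency between instructions `distance` slots apart.

    Returns (stall_cycles, mechanism). Cycles are counted relative to the
    producer's fetch cycle, and the consumer needs its operand by the end
    of its DEC stage (the model assumed in the answers above).
    """
    consumer_dec = distance + DEC
    if consumer_dec >= WB:
        # The producer writes the register file in this cycle or earlier.
        return 0, "register file"
    ready = MEM if producer_is_load else EX  # stage at whose end the value exists
    if consumer_dec >= ready:
        return 0, "forwarding"
    return ready - consumer_dec, "stall, then forwarding"


print("A:", classify(producer_is_load=False, distance=1))  # ADD -> SUB
print("B:", classify(producer_is_load=False, distance=2))  # ADD -> LW
print("C:", classify(producer_is_load=False, distance=3))  # ADD -> second ADD
print("D:", classify(producer_is_load=True, distance=1))   # LW -> second ADD
```

Under these assumptions only D) needs a stall, and the one cycle it reports corresponds to the single NOOP.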
3) The following code contains a “read after write” data hazard that is resolved by
forwarding:
add $2, $3, $4
add $5, $2, $6
Consider the following code where a memory read occurs after a memory write:
sw $7, 100($2)
lw $8, 100($2)
Does the code work correctly on the processor in class? Why/why not? Will
the forwarding unit need to be altered to handle this code?
This works with the processor designed in class because both the SW and the LW
access memory in the MEM stage, their fourth cycle in the pipeline. Since the LW
is fetched one cycle after the SW, the LW reads the location one cycle after the
SW has written it. Assuming the memory can be written in one cycle and there is
no write buffering, this works correctly, so the forwarding unit does not need to
be altered.
4) Consider executing the following code on the pipelined datapath from class:
add $2, $3, $1
sub $4, $3, $5
add $5, $3, $7
add $7, $6, $1
add $8, $2, $6
At the end of the fifth cycle of execution, which registers are being read and
which registers will be written?

Cycle:           1    2    3    4    5    6    7    8    9
ADD $2, $3, $1   IF   DEC  EX   MEM  WB
SUB $4, $3, $5        IF   DEC  EX   MEM  WB
ADD $5, $3, $7             IF   DEC  EX   MEM  WB
ADD $7, $6, $1                  IF   DEC  EX   MEM  WB
ADD $8, $2, $6                       IF   DEC  EX   MEM  WB

At the end of the fifth cycle:


A) The first ADD is completing its write to $2
B) The SUB is in the MEM stage. It is not currently accessing any registers – if
anything it would be looking up something in memory.
C) The second ADD is in the EXEC stage. It is currently not accessing any
registers – it is currently trying to compute the add.
D) The third ADD is in the DEC stage so it is fetching the operands from $6
and $1.
E) The fourth ADD is in the IF stage. Since it has only just been fetched, we do
not yet know what kind of instruction it is, so we do not even know whether it
uses any registers at all.
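
The snapshot above can be reproduced with a few lines of Python. This is a minimal sketch that assumes one instruction is fetched per cycle with no stalls, that registers are read in DEC and written in WB, and that the instruction tuples (name, destination, sources) are an illustrative encoding rather than anything defined in class.

```python
STAGES = ["IF", "DEC", "EX", "MEM", "WB"]

# (name, destination register, source registers); encoding is an assumption.
PROGRAM = [
    ("add", "$2", ["$3", "$1"]),
    ("sub", "$4", ["$3", "$5"]),
    ("add", "$5", ["$3", "$7"]),
    ("add", "$7", ["$6", "$1"]),
    ("add", "$8", ["$2", "$6"]),
]


def snapshot(program, cycle):
    """Map each stage name to the index of the instruction in it during `cycle` (1-based)."""
    occupancy = {}
    for i in range(len(program)):
        stage_index = cycle - 1 - i  # instruction i is fetched in cycle i + 1
        if 0 <= stage_index < len(STAGES):
            occupancy[STAGES[stage_index]] = i
    return occupancy


state = snapshot(PROGRAM, 5)
registers_read = PROGRAM[state["DEC"]][2]    # operands are fetched in DEC
register_written = PROGRAM[state["WB"]][1]   # the result is written in WB

print("stage occupancy:", state)
print("read:", registers_read, "written:", register_written)
```

For cycle 5 this agrees with the answers above: the instruction in DEC reads $6 and $1, and the instruction in WB writes $2.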
