The penalty cannot be completely eliminated via forwarding. One way to eliminate it is to insert an independent instruction between the LW and the XOR.
Penalty w/o Forwarding: 3 cycles
Cycle Reduction: 2 cycles
[Figure: pipeline diagram with stages IF, ID, RD, ALU, MEM, WB]
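The penalty figures above can be reproduced with a small calculation. This is only a sketch under stated assumptions: the stage order IF, ID, RD, ALU, MEM, WB is taken from the diagram; a register written in WB is assumed not to be readable in RD during the same cycle (this is what gives the 3-cycle figure); and the forwarding path is assumed to run from the output of MEM to the input of ALU. The function name stall_cycles and its parameters are hypothetical.

```python
# Stage order assumed from the pipeline diagram.
STAGES = ["IF", "ID", "RD", "ALU", "MEM", "WB"]


def stall_cycles(ready_stage, needed_stage, distance=1):
    """Stall cycles for a consumer issued `distance` slots after the producer.

    The value is assumed to be available at the END of `ready_stage` and
    required at the START of `needed_stage`.
    """
    ready = STAGES.index(ready_stage)
    needed = STAGES.index(needed_stage)
    return max(0, ready + 1 - needed - distance)


# LW followed immediately by a dependent XOR.
no_forwarding = stall_cycles("WB", "RD")        # value read from the register file
with_forwarding = stall_cycles("MEM", "ALU")    # value forwarded from MEM to ALU
# One independent instruction placed between the LW and the XOR.
scheduled = stall_cycles("MEM", "ALU", distance=2)

print(no_forwarding, with_forwarding, no_forwarding - with_forwarding, scheduled)
# prints: 3 1 2 0
```

Under these assumptions forwarding leaves one stall cycle, and moving an independent instruction between the load and its consumer removes it.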
2.
ADD r5, r4, r3
SW r3, 0(r2)
SUB r6, r4, 0(r2)
LW r1, 0(r2)
There is one hazard here, between the SW and the SUB. It can be eliminated with forwarding; the critical forwarding path is from the output of the RD stage to the input of the MD stage, as shown at the right. The addition of this new stage increases the amount of external fragmentation, since not every instruction will use the new stage.
3. The actual performance is lower, since this equation is an oversimplification. Some reasons for this are as follows. The load cannot be perfectly balanced across all stages. Latch overhead varies due to fan-in and fan-out constraints when connected to the combinational logic in the various stages.
4.
Nonpipelined Latency = Cycle Time = 31 ns
Pipelined Cycle Time = 9 ns + 0.5 ns + 1 ns = 10.5 ns
Pipelined Latency = 10.5 ns * 5 = 52.5 ns
Potential Speedup = 31 / 10.5 = 2.952
Internal Fragmentation = 5 * 10.5 ns - 31 ns = 21.5 ns
External fragmentation exists, but is minimized. For example, a jump instruction may not require all five stages, a store instruction will not require the last (write back) stage, and an instruction that does not access memory will not require the fourth stage.
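The arithmetic in this answer can be checked with a short script. This is a minimal sketch: the variable names are hypothetical, the 9 ns term is assumed to be the slowest stage, and the 0.5 ns and 1 ns terms are treated as unlabeled per-cycle overheads, since the answer does not say what each one is.

```python
# Values taken from the answer; names are illustrative only.
nonpipelined_latency_ns = 31.0
longest_stage_ns = 9.0        # assumed to be the slowest stage
overheads_ns = [0.5, 1.0]     # per-cycle overhead terms (not labeled in the answer)
num_stages = 5

# The clock must accommodate the slowest stage plus the overheads.
cycle_time_ns = longest_stage_ns + sum(overheads_ns)

# One instruction passes through every stage, one cycle per stage.
pipelined_latency_ns = cycle_time_ns * num_stages

# Best case: one instruction completes per cycle.
potential_speedup = nonpipelined_latency_ns / cycle_time_ns

# Time added to a single instruction by padding every stage to the cycle time.
internal_fragmentation_ns = pipelined_latency_ns - nonpipelined_latency_ns

print(f"cycle time             = {cycle_time_ns} ns")
print(f"pipelined latency      = {pipelined_latency_ns} ns")
print(f"potential speedup      = {potential_speedup:.3f}")
print(f"internal fragmentation = {internal_fragmentation_ns} ns")
```

The speedup stays well below the stage count of 5 because the cycle time is set by the slowest stage plus overhead, not by the average stage latency.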
[Figure: pipeline datapath. Stage #1 Latency = 6 ns; Stage #3 Latency = 5 ns; Stage #4 Latency = 6 ns; Stage #5 Latency = 5 ns. Labeled components: Register File (4 ns), MUX (1 ns), ALU (4 ns), latches.]