
Lecture 19

An Overview of Pipelining

Motivating Example
Do Laundry: Wash, Dry, Fold, Put Away

Total time: 2 hours

Assumption: we can break up Do Laundry into four steps, all of which take 30 minutes to complete.
Wash: 30 mins
Dry: 30 mins
Fold: 30 mins
Put Away: 30 mins

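[Figure: the four steps overlapped across successive loads. While one load is in Dry, the next is in Wash; while one is in Fold, the next is in Dry; while one is in Put Away, the next is in Fold.]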

[Figure: two laundry timelines for loads A, B, C, D in task order, with a time axis running from 6 PM to 2 AM.]

Note: total time for the four loads is now 3h 30mins.
Note: it still takes me 2 hours to complete one load.
Note: after the 2nd hour, I finish one load every 30 mins.
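The notes above can be checked with a little arithmetic. This is a minimal sketch, assuming four loads (A, B, C, D) and four 30-minute steps as in the example; the function names are made up for illustration.

```python
def sequential_minutes(loads, stages=4, stage_minutes=30):
    # Each load runs all of its steps before the next load starts.
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages=4, stage_minutes=30):
    # The first load takes `stages` steps to finish; after that,
    # one more load finishes at the end of every step.
    return (stages + loads - 1) * stage_minutes


if __name__ == "__main__":
    print(sequential_minutes(4) / 60)  # hours without pipelining
    print(pipelined_minutes(4) / 60)   # hours with pipelining
    print(pipelined_minutes(1) / 60)   # one load alone still takes 2 hours
```

With four loads this gives 8 hours without pipelining (6 PM to 2 AM) and 3.5 hours with it, while a single load still takes 2 hours either way.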

The performance gain with pipelining


[Figure: total time versus number of instructions (0 to 6000), comparing non-pipelined and pipelined execution.]

The gains with pipelining come from increased throughput, not from reducing the execution time of an individual instruction.

MIPS instruction execution sequence


We can break down the execution of each instruction into a sequence of five different steps:
1. Fetch instruction from memory.
2. Read registers and decode instruction.
3. Execute operation or calculate an address.
4. Access an operand in data memory.
5. Write the result into a register.

Timing for each instruction execution step


Class      Fetch  Reg Read  ALU Op  Data Access  Reg Write  Total Time
lw         2      1         2       2            1          8
sw         2      1         2       2            -          7
R-format   2      1         2       -            1          6
Branch     2      1         2       -            -          5

Question: If we want to build a pipelined CPU for MIPS, how will we deal with the fact that the total execution time varies according to the kind of instruction?
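One common way to approach this question, sketched here as an assumption rather than as the answer given in the lecture, is to let every instruction pass through all five stages and to set the clock period to the slowest stage. The times below come from the table above (the slides do not state the unit); the dictionary layout and function names are hypothetical.

```python
# Step times per instruction class, copied from the table above.
# A step an instruction class does not use is simply left out.
STAGE_TIMES = {
    "lw":       {"fetch": 2, "reg_read": 1, "alu": 2, "data_access": 2, "reg_write": 1},
    "sw":       {"fetch": 2, "reg_read": 1, "alu": 2, "data_access": 2},
    "R-format": {"fetch": 2, "reg_read": 1, "alu": 2, "reg_write": 1},
    "Branch":   {"fetch": 2, "reg_read": 1, "alu": 2},
}
NUM_STAGES = 5


def instruction_time(cls):
    # Total time of one instruction when its steps run back to back.
    return sum(STAGE_TIMES[cls].values())


# Assumption: a single-cycle design must fit the slowest instruction in one clock.
SINGLE_CYCLE_CLOCK = max(instruction_time(c) for c in STAGE_TIMES)

# Assumption: a pipelined design must fit the slowest stage in one clock.
PIPELINED_CLOCK = max(t for steps in STAGE_TIMES.values() for t in steps.values())


def total_time_single_cycle(n):
    return n * SINGLE_CYCLE_CLOCK


def total_time_pipelined(n):
    # Fill the pipeline once, then complete one instruction per clock.
    return (NUM_STAGES + n - 1) * PIPELINED_CLOCK


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, total_time_single_cycle(n), total_time_pipelined(n))
```

Under these assumptions a single instruction is actually slower in the pipeline (5 stages of 2 time units each versus 8), but for long instruction sequences the total time approaches one quarter of the single-cycle total.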

What you could have guessed by now: MIPS was designed for pipelining
All instructions are the same length.
Operands are always aligned in memory.
Few instruction formats.
Few addressing modes (memory operands appear only in load and store instructions).

Adapting the Single-cycle Datapath for Pipelining


[Figure: the single-cycle datapath divided into five stages: IF (instruction fetch), ID (instruction decode/register file read), EX (execute/address calculation), MEM (memory access), WB (write back).]

Pipeline registers: intermediate storage.
Question: How does one determine the width of these registers?

Pipeline Hazards
What if the next instruction cannot execute in the following clock cycle?
Structural hazard: Insufficient hardware support. We can't execute a certain combination of instructions in the same clock cycle. Example: memory. Solutions: stall, design.

Data hazard: An instruction depends on the result of a previous instruction that is still in the pipeline. Example:
add $s0, $t0, $t1
sub $t2, $s0, $t3
Solutions: stall, reorder, forwarding (or bypassing).

Control hazard: The result of one instruction determines what happens to other instructions. Example: branch. Solutions: stall, reorder, predict.
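The data hazard in the add/sub example can be spotted mechanically: the second instruction reads a register that the first one writes. The sketch below is only an illustration, assuming R-format instructions written as text with the destination register first; parse and raw_hazard are hypothetical helper names, not part of any real tool.

```python
def parse(instr):
    # Split "add $s0, $t0, $t1" into opcode, destination, and source registers.
    # Assumption: R-format, so the first operand is the destination.
    op, operands = instr.split(None, 1)
    regs = [r.strip() for r in operands.split(",")]
    return op, regs[0], regs[1:]


def raw_hazard(first, second):
    # True if `second` reads the register that `first` writes
    # (a read-after-write dependence).
    _, dest, _ = parse(first)
    _, _, sources = parse(second)
    return dest in sources


if __name__ == "__main__":
    print(raw_hazard("add $s0, $t0, $t1", "sub $t2, $s0, $t3"))
    print(raw_hazard("add $s0, $t0, $t1", "sub $t2, $t4, $t3"))
```

The first call reports a dependence on $s0; the second pair shares no register between the write and the reads, so there is nothing to stall or forward.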

Changing the Pipeline to Enable Forwarding


[Figure: pipeline diagrams with a time axis (2, 4, 6, 8, 10) and program execution order (in instructions). add $s0, $t0, $t1 passes through IF, ID, EX, MEM, WB, followed by sub $t2, $s0, $t3 passing through IF, ID, EX, MEM, WB.]
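To see what forwarding buys for the add/sub pair, the following sketch counts stall cycles in a five-stage pipeline. It rests on two assumptions that the slides do not spell out: with forwarding, the ALU result can be passed straight from the end of add's EX stage to the start of sub's EX stage; without forwarding, the register file is written in the first half of a clock cycle and read in the second half, so sub can read $s0 in the same cycle in which add is in WB. All names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stalls_needed(forwarding):
    # add occupies cycles 1..5, one stage per cycle.
    add_ex, add_wb = 3, 5
    if forwarding:
        # Result is available after add's EX; sub's EX is normally cycle 4.
        earliest_sub_ex = add_ex + 1
        return max(0, earliest_sub_ex - 4)
    # Without forwarding, sub's ID (normally cycle 3) must wait for add's WB.
    earliest_sub_id = add_wb
    return max(0, earliest_sub_id - 3)


def schedule(stall_cycles):
    # Map each instruction to the clock cycle in which it is in each stage.
    add = {stage: i + 1 for i, stage in enumerate(STAGES)}
    sub = {"IF": 2}  # fetched one cycle after add
    for i, stage in enumerate(STAGES[1:], start=1):
        sub[stage] = 2 + i + stall_cycles  # bubbles delay ID and later stages
    return {"add": add, "sub": sub}


if __name__ == "__main__":
    for fwd in (False, True):
        print("forwarding" if fwd else "no forwarding", schedule(stalls_needed(fwd)))
```

Under these assumptions, sub needs two bubbles without forwarding and none with it.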
