Pipelining
The Basic Concept
Typical Non-Pipelined Execution
[Figure: timing diagram of Stage 0 to Stage 3 against time. Instructions I0, I1, I2, I3 run one after another, each passing through all four stages before the next begins.]
Idealized Pipelined Execution
[Figure: timing diagram of Stage 0 to Stage 3 against time. Instructions I0 through I12 enter Stage 0 in successive time slots and advance one stage per slot, so all four stages are busy at once.]
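The difference between the two timing diagrams above can be put into numbers. The sketch below is only an illustration under simplifying assumptions that are not stated on the slides: every stage takes exactly one cycle, and the pipeline never stalls. The function names are hypothetical.

```python
def non_pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction finishes all stages before the next starts."""
    return num_instructions * num_stages


def ideal_pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when a new instruction enters the pipeline every cycle.

    The first instruction takes num_stages cycles to drain through the
    pipeline; each later instruction completes one cycle after its predecessor.
    """
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


if __name__ == "__main__":
    stages = 4  # Stage 0 to Stage 3, as in the diagrams
    for n in (1, 4, 13, 1000):
        slow = non_pipelined_cycles(n, stages)
        fast = ideal_pipelined_cycles(n, stages)
        print(f"{n:5d} instructions: {slow:5d} vs {fast:5d} cycles, "
              f"speedup {slow / fast:.2f}")
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows, which is why the idealized case is the upper bound that hazards then erode.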
Pipeline Turbulence
[Figure: timing diagram of Stage 0 to Stage 3 against time. Instructions I0 to I4 are followed by a branch Ibr; the instruction stream then continues at Ik, Ik+1, Ik+2, Ik+3.]
\[ \text{Speedup from pipelining} = \frac{1}{1 + \text{Pipeline stall cycles per instruction}} \times \text{Pipeline depth} \]
Branch instructions introduce control hazards into the pipeline, which reduce performance. Hazards fall into three classes: structural, data, and control.
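To make the speedup relation concrete, here is a small sketch that reads it as pipeline depth divided by (1 + pipeline stall cycles per instruction). The branch fraction and branch penalty used in the example are made-up illustrative values, not figures from the slides, and both function names are hypothetical.

```python
def pipeline_speedup(pipeline_depth: int, stall_cycles_per_instruction: float) -> float:
    """Speedup = depth / (1 + stall cycles per instruction)."""
    if stall_cycles_per_instruction < 0:
        raise ValueError("stall cycles per instruction cannot be negative")
    return pipeline_depth / (1.0 + stall_cycles_per_instruction)


def branch_stall_cycles(branch_fraction: float, penalty_cycles: float) -> float:
    """Average stall cycles per instruction caused by branches alone."""
    return branch_fraction * penalty_cycles


if __name__ == "__main__":
    depth = 4  # four stages, as in the diagrams

    # Ideal case: no stalls, so the speedup equals the pipeline depth.
    print(pipeline_speedup(depth, 0.0))

    # Hypothetical workload: 20% branches, each costing 3 stall cycles.
    stalls = branch_stall_cycles(branch_fraction=0.2, penalty_cycles=3)
    print(pipeline_speedup(depth, stalls))
```

With no stalls the speedup equals the depth; every stall cycle per instruction enlarges the denominator, so even a modest branch penalty removes a large share of the ideal gain.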
Pipeline Hazards
Structural Hazards
Data Hazards
Control Hazards
Branch Prediction
1. The instructions following the branch do not cause changes to the visible machine state, or the changes made can easily be undone.
2. There are numerous mechanisms to implement the branch-prediction buffer. Simple ones are 1-bit or 2-bit predictors; more complex techniques are also possible.
An example 2-bit (dynamic) predictor
[Figure: state diagram of a 2-bit predictor with four states. States 11 and 10 predict taken; states 01 and 00 predict not taken. Transitions between states are labeled Taken and Not taken.]
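A 2-bit predictor of this kind can be sketched in a few lines. The version below is the common saturating-counter variant, which is an assumption: the exact transitions of the diagram are not recoverable from the text here, only the four states and their predictions. The class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Saturating 2-bit counter (assumed variant).

    States 3 (binary 11) and 2 (10) predict taken;
    states 1 (01) and 0 (00) predict not taken.
    """

    def __init__(self, initial_state: int = 0) -> None:
        if not 0 <= initial_state <= 3:
            raise ValueError("state must be in the range 0..3")
        self.state = initial_state

    def predict(self) -> bool:
        """Return True if the branch is predicted taken."""
        return self.state >= 2

    def update(self, taken: bool) -> None:
        """Move one step toward 3 on a taken branch, toward 0 otherwise."""
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor: TwoBitPredictor) -> int:
    """Run the predictor over a sequence of branch outcomes and count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


if __name__ == "__main__":
    # A loop branch: taken 9 times, then not taken at loop exit, run 3 times.
    outcomes = ([True] * 9 + [False]) * 3
    print(count_mispredictions(outcomes, TwoBitPredictor(initial_state=3)))
```

In this variant a single not-taken outcome at loop exit moves the state from 11 to 10 but does not flip the prediction, so the predictor misses only once per loop execution rather than also missing on re-entry.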
Effectiveness of dynamic prediction (SPEC89 benchmarks)

Benchmark    4096 entries,       Unlimited entries,
             2 bits per entry    2 bits per entry
nasa7         1%                  0%
matrix300     0%                  0%
tomcatv       1%                  0%
doduc         5%                  5%
spice         9%                  9%
fpppp         9%                  9%
gcc          12%                 11%
espresso      5%                  5%
eqntott      18%                 18%
li           10%                 10%
Precise Exceptions
Multicycle Operation
[Figure: pipeline in which the IF and ID stages feed four parallel EX units (integer unit, FP/integer multiply, FP adder, FP/integer divider), followed by the MEM and WB stages.]
Implementation Support for Multicycle Operation
[Figure: pipeline with IF and ID, then one of four execution paths, then MEM and WB.]
Integer unit:         EX
FP/integer multiply:  M1 M2 M3 M4 M5 M6 M7
FP adder:             A1 A2 A3 A4
FP/integer divider:   DIV
Multicycle Operation
Out-of-Order Execution
Dynamically Scheduled Pipelines