Discussion - 03
An Introductory Analysis of Pipelines
Consider a 5-stage instruction pipeline as shown below:
IF ID EX M WB
A time-space diagram is used to describe the progress of instructions through the pipeline.
Stages
WB                      I1   I2   I3   I4   I5   I6
M                  I1   I2   I3   I4   I5   I6   I7
EX            I1   I2   I3   I4   I5   I6   I7   I8
ID       I1   I2   I3   I4   I5   I6   I7   I8   I9
IF  I1   I2   I3   I4   I5   I6   I7   I8   I9   I10
    1    2    3    4    5    6    7    8    9    10
                    Clock Cycles
                (Pipelined Execution)
We’ve assumed that every stage takes one clock cycle and there are no hazards in the instruction stream.
Instruction Latency (the time it takes to complete an instruction) = 5 cycles
Instruction Throughput = 6 instructions / 10 cycles = 0.6 IPC (instructions per cycle)
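The figures above can be checked with a minimal sketch that rebuilds the time-space diagram, assuming (as stated) one clock cycle per stage and no hazards. The name `time_space` and the 10-cycle window are illustrative choices, not part of the original notes.

```python
STAGES = ["IF", "ID", "EX", "M", "WB"]

def time_space(cycles):
    """Map each stage name to the instruction it holds in cycles 1..cycles ('' if idle)."""
    table = {}
    for s, name in enumerate(STAGES):  # s = 0 for IF, ..., 4 for WB
        row = []
        for c in range(1, cycles + 1):
            i = c - s  # instruction i enters IF in cycle i, so it is in stage s at cycle i + s
            row.append(f"I{i}" if i >= 1 else "")
        table[name] = row
    return table

cycles = 10
table = time_space(cycles)

# Print the diagram with WB on top, as in the figure.
for name in reversed(STAGES):
    print(f"{name:<3}" + " ".join(f"{cell:<4}" for cell in table[name]))

# An instruction completes in the cycle it occupies WB.
completed = sum(1 for cell in table["WB"] if cell)
print("completed:", completed)             # 6
print("throughput:", completed / cycles)   # 0.6 IPC
```

Under these assumptions the first instruction reaches WB in cycle 5 (the 5-cycle latency), and one more instruction completes in every cycle after that.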
To better appreciate pipelined execution, we draw the time-space diagram for non-pipelined execution of the same instruction stream, as shown below:
Stages
WB                      I1                       I2
M                  I1                       I2
EX            I1                       I2
ID       I1                       I2
IF  I1                       I2
    1    2    3    4    5    6    7    8    9    10
                    Clock Cycles
              (Non-Pipelined Execution)
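The two diagrams can be compared numerically. The sketch below counts how many instructions complete within a window of clock cycles in each mode, assuming one cycle per stage and no hazards; the function names are hypothetical and chosen only for this illustration.

```python
def completed_non_pipelined(cycles, stages=5):
    # Each instruction uses all stages in sequence before the next one starts,
    # so one instruction completes every `stages` cycles.
    return cycles // stages

def completed_pipelined(cycles, stages=5):
    # The first instruction completes after `stages` cycles (pipeline fill);
    # after that, one instruction completes in every cycle.
    return max(0, cycles - stages + 1)

window = 10
print(completed_non_pipelined(window))  # 2 (I1 and I2)
print(completed_pipelined(window))      # 6 (I1 through I6)
```

In the same 10 cycles, the pipelined machine completes three times as many instructions as the non-pipelined one.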
Parallel Processing
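The speedup S of a k-stage pipeline executing n instructions is the ratio of the execution time before the enhancement (non-pipelined, $t_{np}$) to the execution time after it (pipelined, $t_p$):
$$S = \frac{\text{time before enhancement}}{\text{time after enhancement}} = \frac{t_{np}}{t_p} = \frac{nk}{k - 1 + n} \qquad (3)$$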
Clearly, for a given pipeline, speedup increases as more and more instructions are executed. We can compute the upper bound on speedup as follows:
$$S_{ideal} = \lim_{n \to \infty} \frac{k}{\frac{k - 1}{n} + 1} = k$$
We regard it as ideal speedup because its derivation is based on the assumption of no pipeline hazards. As can be
seen, even ideal speedup cannot go beyond pipeline depth (i.e. number of pipeline stages).
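As a numerical illustration of this bound, the sketch below evaluates the speedup expression nk / (k - 1 + n) for a growing number of instructions. The function name `speedup` and the sample values of n are assumptions made for this example.

```python
def speedup(n, k):
    """Ideal speedup of a k-stage pipeline over non-pipelined execution for n instructions."""
    return (n * k) / (k - 1 + n)

k = 5  # pipeline depth used in this discussion
for n in (1, 6, 100, 10_000):
    print(n, round(speedup(n, k), 3))
```

For a single instruction there is no gain (S = 1), for the six instructions of the earlier diagram S = 3, and for large n the value approaches, but never reaches, the pipeline depth k = 5.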
Instruction Throughput
Instruction throughput ω is defined as the number of instructions executed per unit time. With time measured in clock cycles, this is calculated as:
$$\omega = \frac{n}{k - 1 + n} \qquad (4)$$
Multiplying numerator and denominator of (4) by k, we can express ω in terms of speedup S as:
$$\omega = \frac{S}{k}$$
The upper bound on ω is similarly found:
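$$\omega_{ideal} = \lim_{n \to \infty} \frac{1}{\frac{k - 1}{n} + 1} = 1$$
That is, at most one instruction completes per clock cycle; in absolute time this is the reciprocal of the clock period.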
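The relation between throughput and speedup can be checked with a short sketch. It assumes throughput is measured in instructions per clock cycle, and the names `throughput` and `speedup` are illustrative.

```python
import math

def speedup(n, k):
    return (n * k) / (k - 1 + n)

def throughput(n, k):
    """Instructions completed per clock cycle for n instructions on a k-stage pipeline."""
    return n / (k - 1 + n)

k = 5
for n in (1, 6, 100, 10_000):
    w = throughput(n, k)
    # Throughput equals speedup divided by pipeline depth.
    assert math.isclose(w, speedup(n, k) / k)
    print(n, round(w, 4))
```

For n = 6 this gives 0.6 instructions per cycle, matching the value read off the time-space diagram, and the value tends toward 1 as n grows.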
CPI
Cycles per instruction (CPI) of pipelined execution can be found as:
$$CPI = \frac{k - 1 + n}{n} = \frac{k - 1}{n} + 1$$
The lower bound on CPI is
$$CPI_{ideal} = \lim_{n \to \infty} \left( \frac{k - 1}{n} + 1 \right) = 1$$
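A small sketch can show how CPI falls toward this lower bound as the instruction count grows. The function name `cpi` and the sample values of n are assumptions for illustration; hazards are ignored, as in the derivation.

```python
def cpi(n, k):
    """Average clock cycles per instruction for n instructions on a k-stage pipeline."""
    return (k - 1 + n) / n

k = 5
for n in (1, 6, 100, 10_000):
    print(n, round(cpi(n, k), 4))
```

A single instruction costs the full 5 cycles, while for long instruction streams the k - 1 fill cycles are amortized and CPI approaches 1.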