
Parallel Processing

Discussion - 03
An Introductory Analysis of Pipelines
Consider a 5-stage instruction pipeline as shown below:

IF ID EX M WB

A time-space diagram is used to describe the progress of instructions through the pipeline.

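Stage |   1   2   3   4   5   6   7   8   9  10
WB    |   -   -   -   -  I1  I2  I3  I4  I5  I6
M     |   -   -   -  I1  I2  I3  I4  I5  I6  I7
EX    |   -   -  I1  I2  I3  I4  I5  I6  I7  I8
ID    |   -  I1  I2  I3  I4  I5  I6  I7  I8  I9
IF    |  I1  I2  I3  I4  I5  I6  I7  I8  I9 I10

Time-space diagram (pipelined execution): pipeline stages on the vertical axis, clock cycles on the horizontal axis.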

We’ve assumed that every stage takes one clock cycle and there are no hazards in the instruction stream.
Instruction Latency (the time it takes to complete an instruction) = 5 cycles
Instruction Throughput = 6/10 IPC = 0.6 IPC (instructions per cycle; six instructions complete within the 10 cycles shown)
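The diagram and the 0.6 IPC figure can be reproduced with a short Python sketch. It assumes, as the text does, one cycle per stage and no hazards. The names `STAGES`, `time_space_rows` and `completed` are illustrative and not part of the original notes.

```python
STAGES = ["IF", "ID", "EX", "M", "WB"]

def time_space_rows(num_cycles, stages=STAGES):
    """Map each stage to the instruction number it holds in cycles 1..num_cycles."""
    rows = {}
    for s, stage in enumerate(stages):
        # Instruction i occupies stage number s (0-based) during cycle i + s.
        rows[stage] = [c - s if c - s >= 1 else None
                       for c in range(1, num_cycles + 1)]
    return rows

def completed(num_cycles, stages=STAGES):
    """Instructions that have passed the last stage within num_cycles cycles."""
    return max(0, num_cycles - len(stages) + 1)

rows = time_space_rows(10)
for stage in reversed(STAGES):
    labels = [f"I{i}" if i else "-" for i in rows[stage]]
    print(f"{stage:<5}|" + "".join(f"{label:>4}" for label in labels))
print("throughput:", completed(10) / 10, "IPC")
```

The last line divides the number of instructions that leave WB by the number of cycles observed, which is how the throughput figure above is obtained.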
To better appreciate pipelined execution, we draw the time-space diagram for non-pipelined
execution, as shown below:
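Stage |   1   2   3   4   5   6   7   8   9  10
WB    |   -   -   -   -  I1   -   -   -   -  I2
M     |   -   -   -  I1   -   -   -   -  I2   -
EX    |   -   -  I1   -   -   -   -  I2   -   -
ID    |   -  I1   -   -   -   -  I2   -   -   -
IF    |  I1   -   -   -   -  I2   -   -   -   -

Time-space diagram (non-pipelined execution): pipeline stages on the vertical axis, clock cycles on the horizontal axis.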

Instruction Latency = 5 cycles


Instruction Throughput = 2/10 IPC = 0.2 IPC (instructions per cycle)
Thus pipelined execution improves instruction throughput. However, it does not improve instruction latency. In
practice, pipelining increases instruction latency because of the delay added by the pipeline registers between stages.
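The throughput comparison can be checked numerically. The following is a minimal sketch under the same assumptions (one cycle per stage, no hazards, and a non-pipelined machine that starts an instruction only after the previous one finishes). The function names are hypothetical.

```python
def completed_pipelined(cycles, k):
    """Instructions finished after `cycles` cycles on an ideal k-stage pipeline."""
    return max(0, cycles - (k - 1))  # the first k - 1 cycles only fill the pipeline

def completed_non_pipelined(cycles, k):
    """Instructions finished when each one occupies the machine for k cycles."""
    return cycles // k

k = 5
for window in (10, 1000):
    ipc_pipe = completed_pipelined(window, k) / window
    ipc_seq = completed_non_pipelined(window, k) / window
    print(f"{window:>5} cycles: pipelined {ipc_pipe:.3f} IPC, "
          f"non-pipelined {ipc_seq:.3f} IPC")
```

In both models each instruction still takes k cycles from IF to WB, so only the throughput differs.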
Speedup
Suppose that a k-stage instruction pipeline executes a program containing n instructions. Let τ be the cycle time.
Execution time on non-pipelined computer is given as
$$t_{np} = nk\tau \qquad (1)$$
Execution time on pipelined computer is given as
$$t_p = (k - 1 + n)\tau \qquad (2)$$
where (k – 1) cycles are required to fill the pipeline (also called the pipeline setup time). By definition, the speedup S
of pipelined execution over non-pipelined execution is given as

$$S = \frac{\text{time before enhancement}}{\text{time after enhancement}} = \frac{t_{np}}{t_p} = \frac{nk\tau}{(k - 1 + n)\tau} = \frac{nk}{k - 1 + n} \qquad (3)$$
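As a worked example of equations (1) to (3), the sketch below evaluates both execution times and the speedup for one case. The values n = 100, k = 5 and a cycle time of 2 ns are assumptions chosen for illustration, and the function names are not from the original notes.

```python
def t_non_pipelined(n, k, tau):
    """Equation (1): every instruction takes k cycles of length tau."""
    return n * k * tau

def t_pipelined(n, k, tau):
    """Equation (2): k - 1 cycles to fill the pipeline, then one instruction per cycle."""
    return (k - 1 + n) * tau

def speedup(n, k):
    """Equation (3): the cycle time cancels out of the ratio."""
    return n * k / (k - 1 + n)

n, k, tau_ns = 100, 5, 2  # hypothetical program size, pipeline depth, cycle time in ns
print("t_np =", t_non_pipelined(n, k, tau_ns), "ns")
print("t_p  =", t_pipelined(n, k, tau_ns), "ns")
print(f"S    = {speedup(n, k):.4f}")
```

With these inputs the non-pipelined time is 1000 ns and the pipelined time is 208 ns, so the speedup is 1000/208, a little under 5.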
Clearly, for a given pipeline, the speedup grows as more instructions are executed. We can
compute the upper bound on speedup as follows:
$$S_{ideal} = \lim_{n \to \infty} \frac{k}{\frac{k - 1}{n} + 1} = k$$
We regard it as ideal speedup because its derivation is based on the assumption of no pipeline hazards. As can be
seen, even ideal speedup cannot go beyond pipeline depth (i.e. number of pipeline stages).
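The approach to this bound can be observed numerically. The sketch below sweeps the instruction count for a 5-stage pipeline. The chosen values of n are arbitrary, and the hazard-free assumption still applies.

```python
def speedup(n, k):
    """Speedup of an ideal k-stage pipeline over non-pipelined execution."""
    return n * k / (k - 1 + n)

k = 5
for n in (1, 10, 100, 1_000, 1_000_000):
    print(f"n = {n:>9,}  S = {speedup(n, k):.4f}")
```

For n = 1 the speedup is exactly 1, since a single instruction gains nothing from overlap. As n grows, the values rise toward k = 5 without reaching it.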
Instruction Throughput
Instruction throughput ω is defined as the number of instructions executed per unit time. This is calculated as:

$$\omega = \frac{n}{(k - 1 + n)\tau} \qquad (4)$$

Multiplying numerator and denominator of (4) by k, we can express ω in terms of speedup S as:

$$\omega = \frac{S}{k\tau}$$
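The two expressions for ω can be checked against each other. This is a minimal sketch: the cycle time of 2 ns and the function names are assumptions for illustration only.

```python
import math

def throughput(n, k, tau):
    """Equation (4): instructions completed per unit time."""
    return n / ((k - 1 + n) * tau)

def throughput_via_speedup(n, k, tau):
    """The same quantity written as S / (k * tau)."""
    s = n * k / (k - 1 + n)
    return s / (k * tau)

tau = 2e-9  # hypothetical cycle time of 2 ns, in seconds
n, k = 100, 5
w = throughput(n, k, tau)
assert math.isclose(w, throughput_via_speedup(n, k, tau))
print(f"{w:.3e} instructions per second")
```

Both forms agree up to floating-point rounding, since the second is the first with numerator and denominator multiplied by k.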
The upper bound on ω is similarly found:
$$\omega_{ideal} = \lim_{n \to \infty} \frac{1}{\left(\frac{k - 1}{n} + 1\right)\tau} = \frac{1}{\tau}$$
CPI
Cycles per instruction (CPI) of pipelined execution can be found as:
$$CPI = \frac{k - 1 + n}{n} = \frac{k - 1}{n} + 1$$
The lower bound on CPI is
$$CPI_{ideal} = \lim_{n \to \infty} \left(\frac{k - 1}{n} + 1\right) = 1$$