Lect3 Pipeline
• Instruction execution is extremely complex and
involves several operations which are executed
successively (see slide 2). This implies a large
amount of hardware, but only one part of this
hardware works at a given moment.
• Pipelining is an implementation technique whereby
multiple instructions are overlapped in execution.
This is achieved without additional hardware, only
by letting different parts of the hardware work for
different instructions at the same time.
• The pipeline organization of a CPU is similar to an
assembly line: the work to be done in an instruction
is broken into smaller steps (pieces), each of which
takes a fraction of the time needed to complete the
entire instruction. Each of these steps is a pipe
stage (or a pipe segment).
• Pipe stages are connected to form a pipe:
Stage 1 → Stage 2 → ... → Stage n
• The time required to execute a stage and move to
the next is called a machine cycle (this is one or
several clock cycles). The execution of one
instruction takes several machine cycles as it
passes through the pipeline.

Two stage pipeline:
FI: fetch instruction
EI: execute instruction

(machine) cycle   1   2   3   4   5   6   7   8
Instr. i          FI  EI
Instr. i+1            FI  EI
Instr. i+2                FI  EI
Instr. i+3                    FI  EI
Instr. i+4                        FI  EI
Instr. i+5                            FI  EI
Instr. i+6                                FI  EI

We consider that each instruction takes execution time Tex.
Execution time for the 7 instructions, without pipelining: 7 ✕ Tex
Execution time for the 7 instructions, with pipelining:
(Tex/2) ✕ 8 = 4 ✕ Tex
• Acceleration: (7 ✕ Tex) / (4 ✕ Tex) = 7/4
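The overlap of stages described above can be reproduced with a short script. This is only an illustrative sketch: the function names `pipeline_schedule` and `format_schedule` are hypothetical, and it assumes an ideal pipeline in which every stage takes exactly one machine cycle and no instruction ever stalls.

```python
def pipeline_schedule(stages, n_instructions):
    """Return one row per instruction.

    row[c] holds the name of the stage the instruction occupies in
    machine cycle c + 1, or "" if it is not in the pipeline then.
    Assumption: ideal pipeline, one cycle per stage, no stalls.
    """
    total_cycles = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            # Instruction i enters stage s in cycle i + s + 1 (1-based).
            row[i + s] = name
        rows.append(row)
    return rows


def format_schedule(rows):
    """Render the schedule as an aligned plain-text table."""
    n_cycles = len(rows[0])
    header = "cycle".ljust(12) + "".join(
        str(c).ljust(4) for c in range(1, n_cycles + 1)
    )
    lines = [header.rstrip()]
    for i, row in enumerate(rows):
        label = "Instr. i" if i == 0 else f"Instr. i+{i}"
        cells = "".join(cell.ljust(4) for cell in row)
        lines.append((label.ljust(12) + cells).rstrip())
    return "\n".join(lines)


# Two-stage pipeline, 7 instructions: the table spans 8 machine cycles.
two_stage = pipeline_schedule(["FI", "EI"], 7)
print(format_schedule(two_stage))
```

Passing a longer stage list, for example `["FI", "DI", "CO", "FO", "EI", "WO"]`, yields the corresponding diagram for a deeper pipeline under the same assumptions.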
Acceleration by Pipelining (cont’d)
Six stage pipeline (see also slide 2):
FI: fetch instruction            FO: fetch operand
DI: decode instruction           EI: execute instruction
CO: calculate operand address    WO: write operand

cycle             1   2   3   4   5   6   7   8   9   10  11  12
Instr. i          FI  DI  CO  FO  EI  WO
Instr. i+1            FI  DI  CO  FO  EI  WO
Instr. i+2                FI  DI  CO  FO  EI  WO
Instr. i+3                    FI  DI  CO  FO  EI  WO
Instr. i+4                        FI  DI  CO  FO  EI  WO
Instr. i+5                            FI  DI  CO  FO  EI  WO
Instr. i+6                                FI  DI  CO  FO  EI  WO

• τ: duration of one machine cycle
• n: number of instructions to execute
• k: number of pipeline stages
• Tk,n: total time to execute n instructions on a
pipeline with k stages
• Sk,n: (theoretical) speedup produced by a pipeline
with k stages when executing n instructions

Tk,n = [k + (n - 1)] ✕ τ
- The first instruction takes k ✕ τ to finish.
- The following n - 1 instructions produce one
result per cycle.

On a non-pipelined processor each instruction
takes k ✕ τ, and n instructions take: Tn = n ✕ k ✕ τ
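The two execution times can be compared numerically. The sketch below writes tau for the duration of one machine cycle and takes the speedup to be the ratio of non-pipelined to pipelined time, Tn / Tk,n = n ✕ k / (k + n - 1). That ratio is not written out in the text above and is an assumption derived from the definitions; the function names are hypothetical.

```python
from fractions import Fraction


def pipelined_cycles(k, n):
    """Machine cycles needed for n instructions on a k-stage pipeline.

    The first instruction needs k cycles; each of the remaining
    n - 1 instructions completes one cycle after its predecessor.
    """
    return k + n - 1


def non_pipelined_cycles(k, n):
    """Machine cycles without overlap: every instruction takes k cycles."""
    return n * k


def speedup(k, n):
    """Theoretical speedup as an exact fraction (tau cancels out).

    Assumption: speedup = non-pipelined time / pipelined time.
    """
    return Fraction(non_pipelined_cycles(k, n), pipelined_cycles(k, n))


print(pipelined_cycles(2, 7), speedup(2, 7))   # 8 7/4
print(pipelined_cycles(6, 7), speedup(6, 7))   # 12 7/2

# For a long instruction stream the speedup approaches k.
print(float(speedup(6, 1_000_000)))
```

With k = 2 and n = 7 this reproduces the 8 cycles and the 7/4 acceleration of the two-stage example; with k = 6 the same 7 instructions finish in the 12 cycles shown in the six-stage diagram.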
Pipeline Hazards
Penalty: 1 cycle
• Certain resources are duplicated in order to avoid
structural hazards. Functional units (ALU, FP unit)
can be pipelined themselves in order to support
several instructions at a time. A classical way to
avoid hazards at memory access is by providing
separate data and instruction caches.

Before executing its FO stage, the ADD instruction is
stalled until the MUL instruction has written the result
into R2.
Penalty: 2 cycles
After the EI stage of the MUL instruction the result is
available by forwarding. The penalty is reduced to one cycle.

Control Hazards
Penalty: 3 cycles
Control Hazards (cont’d)
Penalty: 3 cycles
Penalty: 2 cycles
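The data-hazard penalties quoted above for the MUL/ADD pair (2 cycles when ADD waits for the write into R2, 1 cycle with forwarding) can be checked with a small model of the six-stage pipeline. This is a sketch under stated assumptions, not a description of any particular processor: it assumes ADD is fetched one cycle after MUL, that the result becomes usable only in the cycle after the stage that produces it (WO without forwarding, EI with forwarding), and the function name `stall_penalty` is hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def stall_penalty(distance=1, forwarding=False):
    """Stall cycles for a consumer fetched `distance` cycles after its producer.

    Cycles are 1-based and the producer is fetched in cycle 1.
    Assumption: the consumer's FO stage may start only in the cycle
    after the producer's result is ready.
    """
    # Stage of the producer after which its result can be read.
    ready_stage = "EI" if forwarding else "WO"
    result_ready = STAGES.index(ready_stage) + 1

    # Cycle in which the consumer would reach FO if nothing stalled it.
    nominal_fo = 1 + distance + STAGES.index("FO")

    # FO is delayed until the result is available.
    earliest_fo = max(nominal_fo, result_ready + 1)
    return earliest_fo - nominal_fo


print(stall_penalty(distance=1, forwarding=False))  # 2
print(stall_penalty(distance=1, forwarding=True))   # 1
```

Under the same model, a consumer fetched three or more cycles after its producer needs no stall even without forwarding, because the result has already been written by the time FO is reached.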
Summary