Lecture 13 Notes on RISC-Pipelining.

RISC (Recap)
- Reduced Instruction Set Computer
- Key features
  - Large number of general-purpose registers, or use of compiler technology to optimize register use
  - Limited and simple instruction set
  - Emphasis on optimizing the instruction pipeline

RISC Characteristics (Recap)
- One instruction per cycle
- Register-to-register operations
- Few, simple addressing modes
- Few, simple instruction formats
- Hardwired design (no microcode)
- Fixed instruction format
- More compile time/effort

RISC Characteristics
- One instruction per cycle
  - One machine instruction per machine cycle
  - A machine cycle is the time it takes to fetch two operands from registers, perform an ALU operation, and store the result in a register
- Register-to-register operations
  - Most operations should be register-to-register, with only simple LOAD and STORE operations accessing memory
  - This design feature simplifies the instruction set and the control unit

RISC Characteristics
- Few, simple addressing modes
  - Almost all instructions use register addressing
  - Several additional modes may be included: displacement, PC-relative
- Few, simple instruction formats
  - Only one or a few formats are used
  - Instruction length is fixed and aligned on word boundaries

RISC Pipelining (Recap)
- Most instructions are register-to-register
- Two phases of execution
  - I: Instruction fetch
  - E: Execute (ALU operation with register input and output)
- For load and store
  - I: Instruction fetch
  - E: Execute (calculate memory address)
  - D: Memory (register-to-memory or memory-to-register operation)
- If an instruction needs an operand that is altered by the preceding instruction, a delay is required
- This delay can be accomplished by a NOOP
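
The NOOP rule for dependent instructions can be sketched as a small compiler pass. This is only an illustration under stated assumptions: the `Instr` tuple, the register names, and the `insert_noops` helper are hypothetical and not part of the lecture, and the pass only looks at the immediately preceding instruction, as the slide describes.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    """Hypothetical instruction record: opcode, destination register, source operands."""
    op: str
    dest: Optional[str] = None
    srcs: tuple = ()


NOOP = Instr("NOOP")


def insert_noops(program):
    """Insert a NOOP wherever an instruction reads the register written
    by the instruction immediately before it."""
    out = []
    for instr in program:
        prev = out[-1] if out else None
        # A NOOP has no destination, so it never triggers another NOOP.
        if prev is not None and prev.dest is not None and prev.dest in instr.srcs:
            out.append(NOOP)
        out.append(instr)
    return out


program = [
    Instr("LOAD", "rA", ("X",)),        # rA <- memory[X]
    Instr("ADD", "rB", ("rA", "rC")),   # needs rA, written by the LOAD just before
    Instr("SUB", "rD", ("rE", "rF")),   # independent of the ADD
]
for instr in insert_noops(program):
    print(instr.op, instr.dest, instr.srcs)
```

A smarter scheduler would try to move an independent instruction (here the SUB) into the gap instead of wasting the slot on a NOOP.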

Sequential Operation vs Two-Way Pipelines
- Sequential operation is inefficient: both stages of an instruction finish before the next instruction is fetched
- Two-way pipelined
  - I and E stages of two different instructions can be performed simultaneously
  - Yields up to twice the execution rate of sequential operation
- Problems
  - Memory accesses cause wait states
  - A branch disrupts the flow (a NOOP instruction can be inserted by the assembler or compiler)
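
The overlap of I and E stages can be visualised with a short scheduling sketch. It assumes register-to-register instructions only, one clock cycle per stage, and no memory wait states or branches; the function names are illustrative and not taken from the lecture.

```python
def stage_schedule(n, pipelined):
    """Return (i_cycle, e_cycle) for each of n instructions.

    Sequential: an instruction is fetched only after the previous one has executed.
    Two-way pipelined: an instruction is fetched while the previous one executes.
    """
    step = 1 if pipelined else 2
    return [(i * step, i * step + 1) for i in range(n)]


def total_cycles(schedule):
    """Cycles elapsed until the last E stage completes."""
    return schedule[-1][1] + 1


def render(schedule):
    """Draw one row per instruction; each column is one clock cycle."""
    return "\n".join("." * i_cycle + "IE" for i_cycle, _ in schedule)


for pipelined in (False, True):
    sched = stage_schedule(4, pipelined)
    print("pipelined" if pipelined else "sequential", total_cycles(sched), "cycles")
    print(render(sched))
```

For four instructions this simplified model gives 8 cycles sequentially and 5 cycles pipelined; the ratio approaches 2 as the instruction count grows, which is the "up to twice" figure on the slide.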

Three-Way Pipelined vs Four-Way Pipelined
- Three-way pipelined
  - Permitting two memory accesses at one time (dual-port RAM) allows for fully pipelined operation
- Four-way pipelined
  - Since E is usually longer, break E into two parts
    - E1: Register file read
    - E2: ALU operation and register write
  - Because of the simplicity of RISC design, this split is not difficult to do
  - Up to four instructions can be under way at one time (potential speedup of 4)
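
The "potential speedup of 4" is an upper bound that is approached only for long instruction streams. A minimal sketch of the usual idealised model follows, assuming k equal-length stages, one cycle per stage, and no stalls from memory or branches; these assumptions and the function names are not from the lecture.

```python
def sequential_cycles(n, k):
    """Each of n instructions passes through all k stages before the next starts."""
    return n * k


def pipelined_cycles(n, k):
    """The first instruction needs k cycles; each later one completes one cycle after its predecessor."""
    return k + (n - 1)


def speedup(n, k):
    """Ideal speedup of a k-stage pipeline over sequential execution."""
    return sequential_cycles(n, k) / pipelined_cycles(n, k)


for n in (1, 10, 1000):
    print(n, round(speedup(n, 4), 2))
```

With a single instruction there is no gain at all, while for 1000 instructions the four-stage figure is just under 4.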

Optimization of Pipelining
- Data and branch dependencies reduce the overall execution rate
- Delayed branch
  - Does not take effect until after execution of the following instruction
  - This following instruction is the delay slot

Delayed Branches
- Traditional pipelining disposes of the instruction loaded in the pipe after a branch
- Delayed branching executes the instruction loaded in the pipe after a branch
- As a result, no special circuitry is needed to clear the pipe
- A NOOP can be used if no instruction can be found to execute after the JUMP
- It is left to the compiler to rearrange instructions or add NOOPs
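
The compiler's job of filling the delay slot can be sketched as follows. This is a simplified illustration, not the lecture's algorithm: it handles only unconditional `JUMP` instructions, treats jump targets as symbolic labels (`L1` is made up), and assumes the instruction before the jump is not itself a branch target.

```python
NOOP = "NOOP"


def is_jump(instr):
    """True for an unconditional jump such as 'JUMP L1'."""
    return instr.split()[0] == "JUMP"


def fill_delay_slots(program):
    """Move the instruction preceding each JUMP into the delay slot,
    or insert a NOOP when no such instruction is available."""
    out = []
    can_move_last = False  # may out[-1] be moved behind a following JUMP?
    for instr in program:
        if is_jump(instr):
            if can_move_last:
                moved = out.pop()
                out += [instr, moved]   # the jump first, then the delay slot
            else:
                out += [instr, NOOP]    # nothing useful found: waste the slot
            can_move_last = False       # a filled delay slot must stay in place
        else:
            out.append(instr)
            can_move_last = True
    return out


program = ["LOAD X, rA", "ADD 1, rA", "JUMP L1", "ADD rA, rB", "STORE rA, Z"]
print(fill_delay_slots(program))
```

Here `ADD 1, rA` ends up after the jump, so it still executes before control transfers and no NOOP is needed.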

Delayed Branches
- The interchange of instructions will work successfully for unconditional branches, calls, and returns
- It cannot be blindly applied to conditional branches
  - If the condition that is tested for the branch can be altered by the immediately preceding instruction, the compiler must refrain from doing the interchange and instead insert a NOOP
- Delayed load can be used on LOAD instructions
  - On the LOAD instruction, the register that is to be the target of the load is locked by the processor
  - The processor continues execution of the instruction stream until it reaches an instruction requiring that register
  - At that point, it idles until the load is complete
- The scheduling of instructions for the pipeline and the dynamic allocation of registers should be considered together to achieve the greatest efficiency
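
The delayed-load behaviour can be modelled with a small cycle counter. This is a rough sketch under assumptions that do not come from the lecture: one instruction issues per cycle, a loaded value becomes usable `load_latency` cycles after the LOAD issues, and only source operands count as "requiring" a locked register.

```python
def run_with_delayed_load(program, load_latency=2):
    """Return (total_cycles, stall_cycles) for a list of (op, dest, srcs) tuples."""
    ready_at = {}   # register -> first cycle in which its loaded value can be read
    cycle = 0
    stalls = 0
    for op, dest, srcs in program:
        # Idle while any source register is still locked by an earlier LOAD.
        wait_until = max((ready_at.get(r, 0) for r in srcs), default=0)
        if wait_until > cycle:
            stalls += wait_until - cycle
            cycle = wait_until
        if op == "LOAD":
            ready_at[dest] = cycle + load_latency  # lock the target register
        cycle += 1
    return cycle, stalls


load = ("LOAD", "r1", ("A",))
add = ("ADD", "r2", ("r1", "r3"))   # needs the loaded value
sub = ("SUB", "r4", ("r5", "r6"))   # independent of the load

print(run_with_delayed_load([load, add, sub]))  # ADD directly after LOAD: one idle cycle
print(run_with_delayed_load([load, sub, add]))  # SUB scheduled in between: no idle cycle
```

The second ordering shows why instruction scheduling matters: placing independent work after the LOAD hides the load delay entirely in this model.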

Instruction Pipeline
- Two classes of processors have evolved to offer execution of multiple instructions per clock cycle
- Superscalar architecture
  - Replicates each of the pipeline stages so that two or more instructions at the same stage of the pipeline can be processed simultaneously
- Superpipelined architecture
  - Makes use of more fine-grained pipeline stages
  - With more stages, more instructions can be in the pipeline at the same time, increasing parallelism
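
The two approaches can be compared with an idealised timing model. The formulas below are a textbook-style simplification assumed for illustration, not something stated in the lecture: `n` independent instructions, a base pipeline of `k` one-cycle stages, and a degree `m` that means either `m` parallel pipelines (superscalar) or each stage split into `m` sub-stages (superpipelined). Dependencies and stage-transfer overhead are ignored.

```python
import math


def superscalar_cycles(n, k, m):
    """m instructions enter the pipeline together in each base cycle."""
    return k + math.ceil(n / m) - 1


def superpipelined_cycles(n, k, m):
    """One instruction enters every 1/m of a base cycle; result is in base cycles."""
    return k + (n - 1) / m


n, k = 8, 4
print("base pipeline :", superscalar_cycles(n, k, 1))
print("superscalar   :", superscalar_cycles(n, k, 2))
print("superpipelined:", superpipelined_cycles(n, k, 2))
```

With degree 1 both functions reduce to the ordinary pipeline time of `k + n - 1` cycles, which is a quick sanity check on the model.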

Instruction Pipeline
- Both approaches have limitations
- With superscalar architecture
  - Dependencies between instructions in different pipelines can slow down the system
  - Overhead logic is required to coordinate these dependencies
- With superpipelining
  - There is overhead associated with transferring instructions from one stage to the next