CH7-Parallel and Pipelined Processing

Tikrit University, academic year 2019-2020
College of Petroleum Process Eng.
Petroleum and Control Eng. Dept.
Course: Computer Architecture 1
Chapter 7: Parallel and Pipelined Processing

Basic Ideas
• Parallel processing
       time →
  P1   a1  a2  a3  a4
  P2   b1  b2  b3  b4
  P3   c1  c2  c3  c4
  P4   d1  d2  d3  d4
  – Less inter-processor communication
  – Complicated processor hardware

• Pipelined processing
       time →
  P1   a1  b1  c1  d1
  P2   a2  b2  c2  d2
  P3   a3  b3  c3  d3
  P4   a4  b4  c4  d4
  – More inter-processor communication
  – Simpler processor hardware

a, b, c, d: different data streams processed
1, 2, 3, 4: different types of operations performed
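The two assignments above can be generated mechanically. The following is a minimal sketch, assuming four data streams (a-d) and four operation types (1-4) as in the slide; the variable names are illustrative only, and the time stagger of the pipelined case is ignored.

```python
# Sketch: which work items each processor handles under the two schemes.
streams = ["a", "b", "c", "d"]   # independent data streams
ops = [1, 2, 3, 4]               # operation types applied to every stream

# Parallel processing: processor i owns one stream and performs every operation on it.
parallel = [[f"{s}{op}" for op in ops] for s in streams]

# Pipelined processing: processor i owns one operation and applies it to every stream.
pipelined = [[f"{s}{op}" for s in streams] for op in ops]

for i, (par_row, pipe_row) in enumerate(zip(parallel, pipelined), start=1):
    print(f"P{i}  parallel: {' '.join(par_row)}   pipelined: {' '.join(pipe_row)}")
```

In the pipelined assignment each processor only ever performs one operation type, which is why its hardware can be simpler, while every data item has to be handed from one processor to the next.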
Data Dependence
• Parallel processing requires NO data dependence between processors
• Pipelined processing will involve inter-processor communication
[Figure: activity of P1-P4 plotted against time for each case]
Basic Pipeline
Five stage “RISC” load-store architecture
1. Instruction fetch (IF)

• get instruction from memory, increment PC
2. Instruction Decode (ID)
• translate opcode into control signals and read registers
3. Execute (EX)
• perform ALU operation, compute jump/branch targets
4. Memory (MEM)
• access memory if needed
5. Writeback (WB)
• update register file
Time Graphs
Clock cycle
         1    2    3    4    5    6    7    8    9
add      IF   ID   EX   MEM  WB
lw            IF   ID   EX   MEM  WB
instr 3            IF   ID   EX   MEM  WB
instr 4                 IF   ID   EX   MEM  WB
instr 5                      IF   ID   EX   MEM  WB
Latency: 5 cycles
Throughput: 1 instr/cycle
Concurrency: 5
CPI = 1
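The time graph can be reproduced with a short script. This is a minimal sketch assuming an ideal pipeline with no stalls; the function name `pipeline_diagram` and the labels `i3`-`i5` for the unnamed instructions are hypothetical.

```python
# Sketch: build the staggered timing diagram of an ideal 5-stage pipeline.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(instructions, stages=STAGES):
    """Return (total_cycles, rows); row i is shifted right by i cycles."""
    total_cycles = len(stages) + len(instructions) - 1
    rows = []
    for i, name in enumerate(instructions):
        # i empty cycles before the instruction enters the pipeline,
        # then one cycle per stage, then empty cycles until the end.
        cells = [""] * i + list(stages) + [""] * (total_cycles - i - len(stages))
        rows.append((name, cells))
    return total_cycles, rows

total, rows = pipeline_diagram(["add", "lw", "i3", "i4", "i5"])
print("cycle " + " ".join(f"{c:>4}" for c in range(1, total + 1)))
for name, cells in rows:
    print(f"{name:<6}" + " ".join(f"{c:>4}" for c in cells))
print("total cycles:", total)
```

With five instructions and five stages the last instruction leaves the pipeline in cycle 9, matching the nine clock cycles in the graph: each instruction still takes 5 cycles (latency), but one instruction completes per cycle once the pipeline is full.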
Cycles Per Instruction (CPI)
• Instruction mix for some program P, assume:

• 25% load/store ( 3 cycles / instruction)
• 60% arithmetic ( 2 cycles / instruction)
• 15% branches ( 1 cycle / instruction)
• Multi-Cycle performance for program P:

• 3 * .25 + 2 * .60 + 1 * .15 = 2.1
• average cycles per instruction (CPI) = 2.1
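The weighted average above can be checked with a few lines of code. This is a minimal sketch using the instruction mix from the slide; the function name `average_cpi` is illustrative.

```python
# Sketch: average CPI as a weighted sum over the instruction mix.
# Each entry maps an instruction class to (fraction of instructions, cycles each).
mix = {
    "load/store": (0.25, 3),
    "arithmetic": (0.60, 2),
    "branch":     (0.15, 1),
}

def average_cpi(mix):
    """Return sum(fraction * cycles); the fractions must add up to 1."""
    total_fraction = sum(fraction for fraction, _ in mix.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(fraction * cycles for fraction, cycles in mix.values())

cpi = average_cpi(mix)
print(f"average CPI = {cpi:.2f}")
```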
Six Stages of Instruction Pipelining
• Fetch Instruction (FI)
  Read the next expected instruction into a buffer.
• Decode Instruction (DI)
  Determine the opcode and the operand specifiers.
• Calculate Operands (CO)
  Calculate the effective address of each source operand.
• Fetch Operands (FO)
  Fetch each operand from memory. Operands in registers need not be fetched.
• Execute Instruction (EI)
  Perform the indicated operation and store the result.
• Write Operand (WO)
  Store the result in memory.
Timing diagram for instruction pipeline operation
[Figure: six-stage CPU instruction pipeline]
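As a rough check on such a timing diagram, the ideal execution time can be computed directly. This is a minimal sketch under assumptions not stated on the slide: every stage takes one time unit, every instruction passes through all six stages, and there are no stalls or branches. The function names are hypothetical.

```python
# Sketch: ideal timing of a k-stage pipeline executing n instructions.
def ideal_time_units(n, k):
    """The first instruction needs k time units; each later one finishes 1 unit after the previous."""
    return k + (n - 1)

def ideal_speedup(n, k):
    """Time without pipelining (n * k) divided by the ideal pipelined time."""
    return (n * k) / ideal_time_units(n, k)

n, k = 9, 6  # example values: nine instructions, six stages
print("pipelined time units:", ideal_time_units(n, k))
print("unpipelined time units:", n * k)
print(f"ideal speedup: {ideal_speedup(n, k):.2f}")
```

As n grows, the ideal speedup approaches the number of stages k, which is the upper bound for this simplified model.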
Pipeline Performance: Clock & Timing
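[Figure: stage $S_i$ followed by stage $S_{i+1}$, with stage delay $\tau_m$ and latch delay $d$]
Clock cycle of the pipeline: $\tau$
Latch delay: $d$
$\tau = \max\{\tau_m\} + d$
Pipeline frequency: $f$
$f = 1/\tau$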
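The relations above can be tried on concrete numbers. This is a minimal sketch; the stage delays and latch delay are made-up example values in nanoseconds, and the function name `pipeline_clock` is illustrative.

```python
# Sketch: clock cycle and frequency of a pipeline from its stage delays.
def pipeline_clock(stage_delays_ns, latch_delay_ns):
    """Return (tau in ns, frequency in MHz), using tau = max(stage delay) + latch delay."""
    tau_ns = max(stage_delays_ns) + latch_delay_ns
    f_mhz = 1000.0 / tau_ns  # 1 / (1 ns) = 1000 MHz
    return tau_ns, f_mhz

# Hypothetical five-stage pipeline: the slowest stage sets the clock.
tau, f = pipeline_clock([10, 8, 10, 10, 7], latch_delay_ns=1)
print(f"clock cycle = {tau} ns, frequency = {f:.1f} MHz")
```

Because the slowest stage determines the cycle time, balancing the stage delays matters more than shortening any single fast stage.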
Advantages
• Pipelining makes efficient use of resources.

• Faster execution of a large number of instructions.
• The parallelism is invisible to the programmer.
Speed Up Equation for Pipelining
For simple RISC pipeline, CPI = 1:
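The formula itself is not reproduced in the text here. As a hedged reconstruction, assuming the standard textbook form of the pipelining speedup equation (the term names below are that form's, not taken from this slide), the general expression and its special case for an ideal CPI of 1 would be:

```latex
\text{Speedup} =
  \frac{\text{Ideal CPI} \times \text{Pipeline depth}}
       {\text{Ideal CPI} + \text{Pipeline stall CPI}}
  \times
  \frac{\text{Cycle time}_{\text{unpipelined}}}{\text{Cycle time}_{\text{pipelined}}}

% Special case, Ideal CPI = 1:
\text{Speedup} =
  \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}}
  \times
  \frac{\text{Cycle time}_{\text{unpipelined}}}{\text{Cycle time}_{\text{pipelined}}}
```

With no stalls and equal cycle times, the speedup equals the pipeline depth.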

Reduced Instruction Set Computers (RISC) Pipelining
• Key Features of RISC
– Limited and simple instruction set
– Memory access instructions limited to memory <-> registers
– Operations are register to register
– Large number of general purpose registers
(and use of compiler technology to optimize register use)
– Emphasis on optimising the instruction pipeline
(& memory management)
– Hardwired for speed (no microcode)
Memory to Memory vs Register to Memory Operations
• (RISC uses only register-to-memory operations)

RISC Pipelining Basics
• Define two phases of execution for register-based instructions
  – I: Instruction fetch
  – E: Execute
    • ALU operation with register input and output
• For load and store instructions there are three phases
  – I: Instruction fetch
  – E: Execute
    • Calculate memory address
  – D: Memory
    • Register-to-memory or memory-to-register operation
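The phase counts above give the execution time of a program when nothing is overlapped. This is a minimal sketch, assuming each phase takes one time unit; the instruction sequence and the name `sequential_time` are hypothetical.

```python
# Sketch: time for a RISC instruction sequence executed WITHOUT pipelining.
# Register-based instructions use phases I and E; loads and stores add D.
PHASES = {
    "register": ["I", "E"],
    "load":     ["I", "E", "D"],
    "store":    ["I", "E", "D"],
}

def sequential_time(kinds):
    """Sum the phases of each instruction, one time unit per phase."""
    return sum(len(PHASES[kind]) for kind in kinds)

# Hypothetical program: two loads, one register-to-register add, one store.
program = ["load", "load", "register", "store"]
print("sequential time units:", sequential_time(program))
```

Pipelining overlaps the I phase of one instruction with the E or D phase of the previous one, so the pipelined time is lower than this sequential figure.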
Effects of RISC Pipelining
[Figure: (b) three-stage pipelined timing]

Optimization of RISC Pipelining
• Delayed branch
– Uses a branch that does not take effect until after the following instruction has executed
– The instruction that follows the branch occupies the delay slot
Normal vs Delayed Branch
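The difference can be illustrated with a toy interpreter. This is a minimal sketch: the two programs, the instruction names, and the function `trace` are hypothetical, and real hardware details such as stage timing are ignored.

```python
# Sketch: execution order under a normal branch and under a delayed branch.
# Each instruction is (name, jump_target); jump_target is None for non-branches.
def trace(program, delayed_branch):
    """Return the names of the instructions in the order they execute."""
    executed = []
    pc = 0
    pending_target = None  # branch target waiting for the delay slot to finish
    while pc < len(program):
        name, target = program[pc]
        executed.append(name)
        if pending_target is not None:
            # The delay-slot instruction has just run; the branch now takes effect.
            pc, pending_target = pending_target, None
        elif target is not None and delayed_branch:
            pending_target = target  # branch is deferred by one instruction
            pc += 1
        elif target is not None:
            pc = target              # normal branch takes effect immediately
        else:
            pc += 1
    return executed

# Normal branch: ADD must come before the JUMP.
normal = [("LOAD", None), ("ADD", None), ("JUMP", 4), ("SKIPPED", None), ("STORE", None)]
# Delayed branch: ADD is moved into the delay slot after the JUMP.
delayed = [("LOAD", None), ("JUMP", 4), ("ADD", None), ("SKIPPED", None), ("STORE", None)]

print(trace(normal, delayed_branch=False))
print(trace(delayed, delayed_branch=True))
```

Both versions execute the same useful instructions, but in the delayed version the ADD fills the slot that would otherwise be wasted while the branch resolves.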
