Professional Documents
Culture Documents
Chapter4 Processor
Chapter4 Processor
Instruction [5–0]
Add Add
Data
Register #
PC Address Instruction Registers ALU Address
Register #
Data
Instruction
memory
memory Register #
Data
Data
Register #
PC Address Instruction Registers ALU Address
Register #
Data
Instruction
memory
memory Register #
Data
M
u
x
Add Add M
u
x
ALU operation
Data
MemWrite
Register #
PC Address Instruction Registers ALU Address
Register # M Zero
u Data
Instruction
x memory
memory Register # RegWrite
Data
MemRead
Control
D Q Write
Clock cycle
op rs rt constant or address
6 bits 5 bits 5 bits 16 bits
op address
6 bits 26 bits
▪ Note:
▪ All MIPS instructions are 32-bit wise
▪ All MIPS instructions contain 6-bit OP
(most significant)
8/19/2021 Faculty of Computer Science and Engineering 17
Instruction Fetch
Add
Read Increment by
PC
address 4 for next
instruction
Instruction
Add
4
add $t1, $s0, $t0
lbu $t2, 0($t1) Increment by
PC add $t3, $s0, $a0
sb $t2, 0($t3)
4 for next
beq $t2, $zero, exit instruction
addi $s0, $s0, 1
j loop
...
a. Registers b. ALU
Read
Address
data
16 32
Sign-
Data extend
Write memory
data
MemRead
a. Data memory unit b. Sign extension unit
8/19/2021 Faculty of Computer Science and Engineering 23
Branch Instructions
▪ Read register operands
▪ Compare operands
▪ Use ALU, subtract and check Zero output
▪ Calculate target address
▪ Sign-extend displacement
▪ Shift left 2 places (word displacement)
▪ Add to PC + 4
▪ Already calculated by instruction fetch
16 Sign- 32
extend
Sign-bit wire
replicated
8/19/2021 Faculty of Computer Science and Engineering 25
Composing the Elements
▪ First-cut data path does an instruction in
one clock cycle
▪ Each datapath element can only do one
function at a time
▪ Hence, we need separate instruction and data
memories
▪ Use multiplexers where alternate data
sources are used for different instructions
8/19/2021 Faculty of Computer Science and Engineering 26
R-Type/Load/Store Datapath
▪ add $S0, $a0, $t0
Read ALU operation
register 1 4
Read MemWrite
data 1
Read MemtoReg
register 2 Zero
Instruction ALUSrc
Registers Read ALU ALU Read
Write 0 Address 1
data 2 result data M
register M
u u
x x
Write 1 0
data Data
Write memory
RegWrite data
16 32 MemRead
Sign-
extend
16 32 MemRead
Sign-
extend
16 32 MemRead
Sign-
extend
M
Add u
x
ALU
4 Add result
Shift
left 2
Read
PC
Read register 1
ALUSrc 4 ALU operation
address Read MemWrite
Read data 1
Zero MemtoReg
register 2
Instruction Registers Read ALU ALU Read
Write Address
Instruction data 2 M result data M
register u
memory u
x x
Write
data
Write Data
RegWrite data memory
16 32 MemRead
Sign-
extend
0 0 01 OR 1
Result
0 0 10 ADD b
0 + 2
0 1 10 SUB 1
0 1 11 SLT CarryOut
1 1 00 NOR
Without SLT implementation
0 0 01 OR 1
Result
0 0 10 ADD b
0 + 2
0 1 10 SUB 1
Less 3
0 1 11 SLT
Set
1 1 00 NOR Overflow
detection Overflow
16 32
Instruction [15–0] Sign ALU
extend control
16
Instruction [15–0] Sign 32
extend ALU
control
16
Instruction [15–0] Sign 32
extend ALU
control
16
Instruction [15–0] Sign 32
extend ALU
control
Instruction [5–0]
8/19/2021 42
Performance Issues
▪ Longest delay determines clock period
▪ Critical path: load instruction
▪ Instruction memory → register file → ALU → data
memory → MUX.
▪ Not feasible to vary period for different
instructions
▪ Violates design principle
▪ Making the common case fast
▪ We will improve performance by pipelining
8/19/2021 Faculty of Computer Science and Engineering 43
Your turn
▪ What is critical path of following
instructions:
bne, sw
▪ Non-stop:
▪ Speedup
= 2n/(0.5n + 1.5) ≈ 4
= number of stages
8/19/2021 Faculty of Computer Science and Engineering 45
MIPS Pipeline
▪ Five stages, one step per stage
▪ IF: Instruction fetch from memory
▪ ID: Instruction decode & register read
▪ EX: Execute operation or calculate address
▪ MEM: Access memory operand
▪ WB: Write result back to register
Instruction Data
lw $2, 200($0) 800 ps fetch
Reg ALU
access
Reg
800 ps
Instruction Data
lw $2, 200($0) 200 ps Reg ALU Reg
fetch access
DM
Instruction 2 IM REG ALU REG
Instruction Data
beq $1, $2, 40 fetch
Reg ALU
access
Reg
200 ps
Program
execution
200 400 600 800 1000 1200 1400
order Time
(in instructions )
Instruction Data
add $4, $5, $6 fetch
Reg ALU
access
Reg
Prediction Instruction Data
beq $1, $2, 40 Reg ALU Reg
incorrect 200 ps fetch access
Lw IF ID E M WB
Sw IF ID E M WB
add IF ID M E WB
Single clock cycle: 3 cycles, cycle time = 5 secs
Lw IF ID E M WB
Sw IF ID E M
add IF ID E WB
Multi clock cycle: 5 + 4 + 4 = 13 cycles, cycle time = 1 secs
Lw IF ID E M WB
Sw IF ID E M WB
add IF ID E M WB
8/19/2021 Pipeline : 7 cycles, cycles time = 1 secs 68
Multiple clock cycle
Instruction #cycles
Load 5 IF ID EXE MEM WB
Store 4 IF ID EXE MEM
Branch 3 IF ID EXE
Arithmetic/logical 4 IF ID EXE WB
Jump 2 IF ID
Add
4 Add
ADD
result
Shift
left 2
0
Read Read
M register 1
Address data 1
u PC Zero
x Read ALU ALU
1 register 2 Address
Instruction result Read
Registers 0 1
data
MEM Write Read
M
u
Data M
Instruction register data 2 memory u
memory x x
Write 1
data 0
Write
Right-to-left data
flow leads to WB 16 32
Sign-
hazards extend
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Instruction fetch
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Instruction decode
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Execution
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Memory
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Write back
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
Wrong extend
register
number
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Execution
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Memory
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Write back
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
lw $13, 24($1)
IM REG ALU DM REG
Add
4 Add Add
result
Shift
left 2
Instruction
0
M
u PC Address Read
x register 1 Read
data 1
1
Read Zero
register 2 ALU ALU
Instruction Read
memory Read Address 0
Write data 2 0 result data
M
register M
Data u
Registers u
Write memory x
data x 1
1
Write
data
16 Sign- 32
extend
Add
Add
4 Shift result Branch
Left 2
Instruction
0 RegWrite
M
u
PC Address Read Read
x register 1 MemWrite
1 data 1 MemtoReg
Read ALUSrc Zero
register 2 Add ALU Read
Instruction Read 0 result Address data
1
Write data 2 M M
memory register u Data u
x
Write
data
Registers x
1 memory 0
Write
Instruction data
RegDst
Instruction
Control M WB
EX M WB
ID/EX
WB
EX/MEM
Control M WB
MEM/WB
EX M WB
IF/ID
Add Add
4 Add
Instruction
RegWrite
Shift result Branch
Left 2
ALUSrc
MemtoReg
MemWrite
0
M
u PC Read
x
Address register 1 Read
1
data 1
Read Zero
register 2 Add ALU Read 1
Instruction Write
Read
data 2
0
M
result Address data M
u
memory register u Data x
Write Registers x
data 1 memory 0
Write
Instruction data
[15–0] 16 Sign- 32 6 ALU
extend control
Instruction MemRead
[20–16] ALUOp
0
Instruction M
u
[15–11] x
1
RegDst
Registers
ALU
Data
memory M
u
x
a. No forwarding
M
u
x
Registers ForwardA
ALU
M Data
u M
x memory
u
x
ForwardB
Rs
Rt
Rt EX/MEM.RegisterRd
Rd
M
u
x
Forwarding
MEM/WB.RegisterRd
unit
b. With forwarding
8/19/2021 Faculty of Computer Science and Engineering 92
Forwarding Conditions
▪ EX hazard
▪ if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
ForwardA = 10
▪ if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
ForwardB = 10
IF/ID EX M WB
M
U
Instruction
X
Registers M
Add
Instruction U
PC
memory M X
Data
U
memory
X
IF/ID.RegisterRs Rs
IF/ID.RegisterRt Rt
IF/ID.RegisterRt Rt EX/MEM.RegisterRd
M
IF/ID.RegisterRd Rd
U
X
Forwarding MEM/WB.RegisterRd
unit
lw $2, 20($1)
IM REG ALU DM REG Need to stall
for one cycle
IM REG ALU DM REG
and $4, $2, $5
IF/ID.Write
ID/EXE
PCWrite
WB EXE/MEM
M
Control U M WB MEM/WB
IF/ID 0
X M WB
EX
M
U
Instruction
X
Registers
M
Instruction Add
U
PC M
memory X
U Data
X memory
IF/ID.RegisterRs
IF/ID.RegisterRt
IF/ID.RegisterRt Rt
M
IF/ID.RegisterRd Rd
U
ID/EXE.RegisterRt X
Rs Forwarding
Rt
unit
PC
Hazard
detection
unit
ID/EXE
WB EXE/MEM
M
Control MEM/WB
28 U M WB
IF/ID
+ 72 X
EX M WB
44
48
+ Shift M
$4
4 Left 2 $1 U
X
M
Registers
= M
Instruction $3 ALU U
U M $8
72 PC 44 memory Data
X U X
7 memory
X
Sign-
extend
10
Forwarding
unit
Hazard
detection
unit
ID/EXE
WB EXE/MEM
M
Control MEM/WB
U M WB
IF/ID
+ X
EX M WB
72
76
+ Shift M
$1
4 Left 2 U
X
M
Registers
= M
Instruction ALU U
U M $3
76 PC 72 memory Data
X U X
memory
X
Sign-
extend
Forwarding
unit
… IF ID EX MEM WB
beq stalled IF ID
beq stalled IF ID
beq stalled ID
Not taken
Predict taken Predict taken
Taken
Not taken
Predict not taken Predict not taken
Taken
Not taken
8/19/2021 Faculty of Computer Science and Engineering 113
Calculating the Branch Target
▪ Even with predictor, still need to calculate
the target address
▪ 1-cycle penalty for a taken branch
▪ Branch target buffer
▪ Cache of target addresses
▪ Indexed by PC when instruction fetched
▪ If hit and instruction is branch predicted taken,
can fetch target immediately
ID.Flush
Hazard
IF.Flush
detection
unit
M
ID/EXE U
X EXE/MEM
WB 0
M M
Control U M MEM/WB
Cause U WB
IF/ID +
X X 1
0 EX EPC 0 M WB
+ Shift
M
4 Left 2 U
X
Registers = ALU M
M U
Instruction
U PC M X
80000180 memory Data
X U
memory
X
Sign-
extend
M
U
X
Forwarding
unit
12
Registers = ALU M
M $7 U
Instruction
U PC M X
80000180 54 memory $1 Data
X U
80000180 memory
X
Sign-
extend
13 12
M
15 $1 U
X
Forwarding
unit
13
Registers = ALU M
M U
Instruction
U PC M X
80000180 72 memory Data
X U
80000184 memory
X
Sign-
extend
13
M
U
X
Forwarding
unit
M
M Registers u
80000180 u Instruction x
PC
x memory
Write
data
Sign- ALU
Data
extend Sign-
memory
extend
Address
Hold pending
Reservation Reservation Reservation Reservation
station station ... station station operands