Single-Cycle Datapath
Review
• Construction of the Datapath
– Instruction-specific building blocks (R, I, J formats)
– Modular design
• ALU, Register File, Data Memory
• ALU or adder for computing branch target address (BTA)
– Instruction-specific connection of datapath components
• Instruction Formats and the Datapath
– R: ALU operation
– I: Load/store - Data I/O from register file/memory
– I: Conditional branch – Eval. condition, Compute BTA
– J: Jump (unconditional branch) – Compute JTA
Overview of Today’s Lecture
• Can we make a datapath operate in one cycle?
– All instructions execute with CPI = 1
– Increases efficiency of software
• Composition of simple datapath components
• Build up the datapath iteratively
– R-format instruction
– I-format
– J-format
• Problems with the single-cycle assumption
Processor Performance
CPU time = IC * CPI * Cycle time
Program
Compiler
ISA
Microarchitecture
Hardware
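The performance equation can be checked with a few lines of Python. This is a minimal sketch: the function name `cpu_time` and every number below are hypothetical, chosen only to show how the three factors multiply.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """CPU time = IC * CPI * cycle time, returned in seconds."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical machine A: CPI = 1 with a long clock cycle.
time_a = cpu_time(1_000_000, 1.0, 8e-9)

# Hypothetical machine B: higher CPI with a shorter clock cycle.
time_b = cpu_time(1_000_000, 4.0, 2e-9)

print(time_a, time_b)
```

A low CPI alone does not fix CPU time; the cycle time factor matters just as much.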
Implementation Review
[Figure: implementation review block diagram showing PC, +4 adder, instruction memory (rs, rd, imm fields), controller (opcode, funct inputs), register file (read/write ports), ALU, and data memory (address, data, result)]
Component: Load/Store Datapath
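[Figure: load/store datapath component with register file output RD2, sign extend (16 to 32 bits), ALUSrc mux, data memory (ADDR, WD, RD), and MemtoReg mux; control lines RegWrite, ALUSrc, MemWrite, MemRead, MemtoReg]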
Animating the Datapath:
Load Instruction
Instruction: lw rt,offset(rs)
[Figure: load/store datapath animated for lw: register file (RN1, RN2, WN, RD1, RD2, WD), sign extend (16 to 32 bits), ALU (3-bit Operation, Zero), data memory (ADDR, WD, RD), and the ALUSrc and MemtoReg muxes]
Animating the Datapath:
Store Instruction
Instruction: sw rt,offset(rs)
[Figure: load/store datapath animated for sw: register file (RN1, RN2, WN, RD1, RD2, WD), sign extend (16 to 32 bits), ALU (3-bit Operation, Zero), data memory (ADDR, WD, RD), and the ALUSrc and MemtoReg muxes]
MIPS Datapath II: Single-Cycle
• Separate adder, as ALU operations and PC increment occur in the same clock cycle
• Separate instruction memory, as instruction and data reads occur in the same clock cycle
[Figure: adding instruction fetch: PC, adder, and instruction memory connected to the register file, sign extend (16 to 32 bits), ALU, and data memory]
Branch Datapath Actions
Instruction: beq $t1, $t2, offset
1. Fetch instruction and increment PC
2. Read registers (e.g., $t1 and $t2) from the
register file
3. ALU subtracts $t1 - $t2. Adder sums PC + 4
and the sign-extended lower 16 bits of the instruction
(the offset) shifted left two bits => branch target address
4. ALU’s Zero output directs PC+4 or BTA to be
written as new PC
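The four steps can be sketched in Python. This is a minimal model under stated assumptions: the helper names (`sign_extend16`, `branch_target`, `next_pc_beq`) are hypothetical, registers are passed in as already-read values, and all arithmetic wraps to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, imm16):
    """BTA = (PC + 4) + (sign-extended offset shifted left two bits)."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & MASK32


def next_pc_beq(pc, rs_val, rt_val, imm16):
    """The ALU subtracts; its Zero output picks PC + 4 or the BTA."""
    zero = ((rs_val - rt_val) & MASK32) == 0
    return branch_target(pc, imm16) if zero else (pc + 4) & MASK32


# Equal operands take the branch; unequal operands fall through.
print(hex(next_pc_beq(0x1000, 5, 5, 3)))
print(hex(next_pc_beq(0x1000, 5, 6, 3)))
```

Both candidate addresses are computed on every beq; the Zero output only selects between them, which matches the mux in the datapath.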
R-format + Load/Store + Branch DP
• Extra adder needed, as both adders operate in each cycle
• Instruction address is either PC+4 or branch target address
[Figure: combined R-format, load/store, and branch datapath with the PC+4 adder, shift left 2, branch target adder, and the mux that selects the next PC]
[Figure: single-cycle datapath with PC, PCSrc mux, instruction memory, register file, sign extend, ALU, and data memory]
[Figure: the same datapath executing lw rt,offset(rs)]
Datapath Executing sw
[Figure: datapath executing sw rt,offset(rs)]
Datapath Executing beq
[Figure: datapath executing beq r1,r2,offset]
Control Overview
• Single-cycle implementation
– Datapath: combinational logic, I-mem, regs, D-mem, PC
• Last three written at end of cycle
– Need control – just combinational logic!
– Inputs:
• Instruction (I-mem out)
• Zero (for beq)
– Outputs:
• Control lines for muxes
• ALUop
• Write-enables
Control Overview
• Fast control
– Divide up work on “need to know” basis
– Logic with fewer inputs is faster
• E.g.
– Global control need not know which ALUop
ALU Control
• Assume the control line values in table
ALU Control
• Plan to control ALU: main control sends a 2-bit ALUOp control field to the ALU control. Based on
ALUOp and funct field of instruction the ALU control generates the 3-bit ALU control field
• ALU-ctrl = f(opcode,function)
But… don’t forget
Instruction  Operation  Opcode  Function
lw           add        100011  xxxxxx
sw           add        101011  xxxxxx
beq          sub        000100  xxxxxx
• To simplify ALU-ctrl
– ALUop = f(opcode), where ALUop is 2 bits and the opcode is 6 bits
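A minimal sketch of ALUop = f(opcode) in Python. The 2-bit encodings (00 for lw/sw, 01 for beq, 10 for R-format) follow the control table that appears later in these slides; the names `ALUOP_BY_OPCODE` and `alu_op` are hypothetical.

```python
# 6-bit opcode -> 2-bit ALUOp, as produced by the main control.
ALUOP_BY_OPCODE = {
    0b000000: 0b10,  # R-format: the ALU control must also look at funct
    0b100011: 0b00,  # lw: add to form the address
    0b101011: 0b00,  # sw: add to form the address
    0b000100: 0b01,  # beq: subtract to compare
}


def alu_op(opcode):
    """Return the 2-bit ALUOp for a supported 6-bit opcode."""
    return ALUOP_BY_OPCODE[opcode]


print(format(alu_op(0b100011), "02b"))
```

Only the 6-bit opcode is consulted here, so the main control never needs to see the funct field.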
ALU Control
Load/store or branch instruction format:
Field: opcode  rs     rt     address
Bits:  31-26   25-21  20-16  15-0
[Figure: datapath with the new multiplexor, RegWrite, shift left 2, and ALUOp]
Adding control to the MIPS Datapath III (and a new multiplexor to select field to
specify destination register): what are the functions of the 9 control signals?
Control Signals
Signal name: effect when deasserted / effect when asserted
• RegDst
– Deasserted: The register destination number for the Write register comes from the rt field (bits 20-16)
– Asserted: The register destination number for the Write register comes from the rd field (bits 15-11)
• RegWrite
– Deasserted: None
– Asserted: The register on the Write register input is written with the value on the Write data input
• ALUSrc
– Deasserted: The second ALU operand comes from the second register file output (Read data 2)
– Asserted: The second ALU operand is the sign-extended, lower 16 bits of the instruction
• PCSrc
– Deasserted: The PC is replaced by the output of the adder that computes the value of PC + 4
– Asserted: The PC is replaced by the output of the adder that computes the branch target
• MemRead
– Deasserted: None
– Asserted: Data memory contents designated by the address input are put on the first Read data output
• MemWrite
– Deasserted: None
– Asserted: Data memory contents designated by the address input are replaced by the value of the Write data input
• MemtoReg
– Deasserted: The value fed to the register Write data input comes from the ALU
– Asserted: The value fed to the register Write data input comes from the data memory
[Figure: MIPS datapath with the control unit: input to control is the 6-bit instruction opcode field, output is seven 1-bit signals and the 2-bit ALUOp signal]
Global Control
Instruction Opcode RegDst ALUSrc
rrr 000000 1 0
lw 100011 0 1
sw 101011 x 1
beq 000100 x 0
??? others x x
• RegDst = ~Op[0]
• ALUSrc = Op[0]
• RegWrite = ~Op[3] * ~Op[2]
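The three equations can be checked against the four opcodes in the table. This is a small sketch; the helper names (`bit`, `reg_dst`, `alu_src`, `reg_write`) are hypothetical, and Op[i] is taken to mean bit i of the 6-bit opcode with Op[0] as the least significant bit.

```python
OPCODES = {"rrr": 0b000000, "lw": 0b100011, "sw": 0b101011, "beq": 0b000100}


def bit(op, i):
    """Return Op[i], where Op[0] is the least significant opcode bit."""
    return (op >> i) & 1


def reg_dst(op):
    return 1 - bit(op, 0)  # RegDst = ~Op[0]


def alu_src(op):
    return bit(op, 0)  # ALUSrc = Op[0]


def reg_write(op):
    return (1 - bit(op, 3)) & (1 - bit(op, 2))  # RegWrite = ~Op[3] * ~Op[2]


for name, op in OPCODES.items():
    print(name, reg_dst(op), alu_src(op), reg_write(op))
```

The equations are this small because the don't-care entries (x) in the table may take either value, and they only need to hold for the four listed opcodes.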
Datapath Control (Finalized)
• PCSrc cannot be set directly from the opcode: zero test outcome is required
[Figure: finalized datapath control: the control unit decodes Instruction [31-26] into RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; Branch and the ALU Zero output together determine PCSrc; the ALU control block drives the ALU]
Determining control signals for the MIPS datapath based on instruction opcode
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
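The table translates directly into a lookup. The sketch below is one possible encoding, not a definitive implementation: `main_control` and `pc_src` are hypothetical names, don't-care entries (X) are stored as `None`, and PCSrc is assumed to be the AND of Branch and Zero.

```python
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemRead",
           "MemWrite", "Branch", "ALUOp1", "ALUOp0")

# Rows copied from the control table; None stands for a don't-care (X).
CONTROL_TABLE = {
    0b000000: (1, 0, 0, 1, 0, 0, 0, 1, 0),        # R-format
    0b100011: (0, 1, 1, 1, 1, 0, 0, 0, 0),        # lw
    0b101011: (None, 1, None, 0, 0, 1, 0, 0, 0),  # sw
    0b000100: (None, 0, None, 0, 0, 0, 1, 0, 1),  # beq
}


def main_control(opcode):
    """Map a 6-bit opcode to a dict of control signal values."""
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))


def pc_src(branch, zero):
    """PCSrc needs the ALU Zero output as well as the Branch signal."""
    return branch & zero


lw = main_control(0b100011)
print(lw["MemRead"], lw["RegWrite"], lw["MemtoReg"])
```

`pc_src` is kept apart from the table lookup because it depends on a datapath result (Zero), not only on the opcode.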
Control Signals:
R-Type Instruction
[Figure: datapath executing an R-type instruction, with control signal values shown in blue]
Control Signals:
lw Instruction
[Figure: datapath executing lw, with control signal values shown in blue]
Control Signals:
sw Instruction
[Figure: datapath executing sw, with control signal values shown in blue]
Control Signals:
beq Instruction
[Figure: datapath executing beq, with control signal values shown in blue]
Global Control
• More complex with entire MIPS ISA
– Need more systematic structure
– Want to share gates between control signals
• Common solution: PLA
– MIPS opcode space designed to minimize PLA
inputs, minterms, and outputs
• Refer to MIPS Opcode map
PLA
• In the AND-plane, AND (&) selected inputs to get minterms
• In the OR-plane, OR (|) selected minterms to get outputs
• E.g.
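A minimal sketch of the two planes in Python, assuming only the four instruction classes covered in these slides (R-format, lw, sw, beq). The function names `and_plane` and `or_plane` are hypothetical, and every don't-care output is resolved to 0 here, which is one valid choice among several.

```python
def and_plane(opcode):
    """AND selected opcode bits (true or complemented) to form one minterm per instruction."""
    b = [(opcode >> i) & 1 for i in range(6)]  # b[i] is Op[i]
    n = [1 - x for x in b]                     # n[i] is ~Op[i]
    return {
        "rformat": n[5] & n[4] & n[3] & n[2] & n[1] & n[0],  # 000000
        "lw":      b[5] & n[4] & n[3] & n[2] & b[1] & b[0],  # 100011
        "sw":      b[5] & n[4] & b[3] & n[2] & b[1] & b[0],  # 101011
        "beq":     n[5] & n[4] & n[3] & b[2] & n[1] & n[0],  # 000100
    }


def or_plane(m):
    """OR selected minterms to form each control output (don't-cares taken as 0)."""
    return {
        "RegDst":   m["rformat"],
        "ALUSrc":   m["lw"] | m["sw"],
        "MemtoReg": m["lw"],
        "RegWrite": m["rformat"] | m["lw"],
        "MemRead":  m["lw"],
        "MemWrite": m["sw"],
        "Branch":   m["beq"],
        "ALUOp1":   m["rformat"],
        "ALUOp0":   m["beq"],
    }


print(or_plane(and_plane(0b101011)))
```

Each output is an OR over a handful of shared minterms, which is how a PLA shares gates between control signals.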
Datapath Extension: Jump Instr.
Instruction: j address
1. Fetch instruction and increment PC
2. Read address from immediate field of instr.
3. Jump target address (JTA) has these bits:
• Bits 31-28: Upper four bits of PC+4
• Bits 27-02: Immediate field of Jump instr.
• Bits 01-00: Zero (00 in binary)
4. Mux controlled by Jump control bit selects JTA or the
branch mux output (PC+4 or branch target address) as new PC
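The bit layout of the jump target address can be written out as a short Python sketch. The function name `jump_target` is hypothetical, and the instruction is assumed to be given as a 32-bit integer.

```python
MASK32 = 0xFFFFFFFF


def jump_target(pc, instr):
    """Build the JTA from PC + 4 and the 26-bit immediate of a j instruction."""
    pc_plus4 = (pc + 4) & MASK32
    upper4 = pc_plus4 & 0xF0000000  # bits 31-28 come from PC + 4
    imm26 = instr & 0x03FFFFFF      # immediate field, instruction bits 25-0
    return upper4 | (imm26 << 2)    # bits 27-2 = immediate, bits 1-0 = 00


# j with immediate 0x100, fetched from address 0x00400000
instr = (0b000010 << 26) | 0x100
print(hex(jump_target(0x00400000, instr)))
```

Shifting the 26-bit immediate left by two yields 28 bits, so it never overlaps the four bits taken from PC + 4.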
Control Signals; Add Jumps
Datapath Extension: Jump Instr.
[Figure: MIPS datapath extended to jumps: control unit generates new Jump control bit]
Datapath Extension: Jump Instr.
Datapath Executing j
[Figure: datapath executing j: I[25:0] is shifted left 2 to form the 28-bit jmpaddr, which is concatenated with PC+4[31-28] into the 32-bit jump target; a mux controlled by Jump selects it as the new PC]
Single-cycle Implementation
Notes
• The steps are not really distinct as each instruction completes in
exactly one clock cycle – they simply indicate the sequence of
data flowing through the datapath
• The operation of the datapath during a cycle is purely
combinational – nothing is stored during a clock cycle
• Therefore, the machine is stable in a particular state at the start of
a cycle and reaches a new stable state only at the end of the cycle
• Very important for understanding single-cycle computing:
Load Instruction Steps
lw $t1, offset($t2)
1. Fetch instruction and increment PC
2. Read base register from the register file: the base register ($t2)
is given by bits 25-21 of the instruction
3. ALU computes sum of value read from the register file and
the sign-extended lower 16 bits (offset) of the instruction
4. The sum from the ALU is used as the address for the data
memory
5. The data from the memory unit is written into the register file:
the destination register ($t1) is given by bits 20-16 of the
instruction
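The five steps can be traced in a short Python sketch. It rests on several assumptions that are not part of the slides: the register file is a list of 32 integers, data memory is a dict keyed by byte address, and the names `execute_lw` and `sign_extend16` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_lw(instr, pc, regs, data_mem):
    """Run one lw through the datapath and return the new PC."""
    new_pc = (pc + 4) & MASK32             # step 1: increment PC
    rs = (instr >> 21) & 0x1F              # step 2: base register, bits 25-21
    rt = (instr >> 16) & 0x1F              # destination register, bits 20-16
    offset = sign_extend16(instr & 0xFFFF)
    address = (regs[rs] + offset) & MASK32 # step 3: ALU adds base and offset
    value = data_mem[address]              # step 4: ALU sum addresses data memory
    regs[rt] = value                       # step 5: write back to the register file
    return new_pc


# lw $t1, 8($t2), with $t1 = register 9 and $t2 = register 10
regs = [0] * 32
regs[10] = 0x1000
data_mem = {0x1008: 42}
instr = (0b100011 << 26) | (10 << 21) | (9 << 16) | 8
print(execute_lw(instr, 0x400000, regs, data_mem), regs[9])
```

The statements run in sequence here only because Python is sequential; in the hardware, these steps are data flowing through combinational logic within one clock cycle.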
Branch Instruction Steps
beq $t1, $t2, offset
1. Fetch instruction and increment PC
2. Read two registers ($t1 and $t2) from the register file
3. ALU performs a subtract on the data values from the
register file; the value of PC+4 is added to the sign-
extended lower 16 bits (offset) of the instruction shifted left
by two to give the branch target address
4. The Zero result from the ALU is used to decide which
adder result (from step 1 or 3) to store in the PC
Implementation: ALU Control Block
ALUOp1  ALUOp0  F5  F4  F3  F2  F1  F0  Operation
0       0       X   X   X   X   X   X   010
0*      1       X   X   X   X   X   X   110
1       X       X   X   0   0   0   0   010
1       X       X   X   0   0   1   0   110
1       X       X   X   0   1   0   0   000
1       X       X   X   0   1   0   1   001
1       X       X   X   1   0   1   0   111
Truth table for ALU control bits
*Typo in text Fig. 5.15: if it is X then there is potential conflict between line 2 and lines 3-7!
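The truth table can be expressed as a small Python function. This is a sketch, not the gate-level block: the name `alu_control` is hypothetical, and the operation names in the comments (AND, OR, set on less than) are the usual meanings of those codes in this style of MIPS ALU rather than something stated in the table.

```python
# funct bits F3..F0 -> 3-bit ALU operation, for rows with ALUOp1 = 1
FUNCT_TO_OPERATION = {
    0b0000: 0b010,  # add
    0b0010: 0b110,  # subtract
    0b0100: 0b000,  # AND
    0b0101: 0b001,  # OR
    0b1010: 0b111,  # set on less than
}


def alu_control(alu_op, funct):
    """Return the 3-bit ALU operation for a 2-bit ALUOp and a 6-bit funct field."""
    if alu_op & 0b10:                 # ALUOp1 = 1: R-format, decode funct
        return FUNCT_TO_OPERATION[funct & 0b1111]  # F5 and F4 are don't-cares
    if alu_op == 0b01:                # beq: subtract
        return 0b110
    return 0b010                      # lw/sw: add


print(format(alu_control(0b10, 0b101010), "03b"))
```

Testing ALUOp1 first mirrors the corrected second row of the table: with ALUOp1 fixed at 0 for beq, that row cannot collide with the R-format rows.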
[Figure: ALU control block: inputs ALUOp1, ALUOp0 and funct bits F3-F0 of F(5-0); outputs Operation2, Operation1, Operation0]
Signal    R-format  lw  sw  beq
Op3       0         0   1   0
Op2       0         0   0   1
Op1       0         1   1   0
Op0       0         1   1   0
Outputs
RegDst    1         0   x   x
ALUSrc    0         1   1   0
MemtoReg  0         1   x   x
RegWrite  1         1   0   0
MemRead   0         1   0   0
MemWrite  0         0   1   0
[Figure: control logic with outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0]