Professional Documents
Culture Documents
CA - Multicycle Processor Design
CA - Multicycle Processor Design
PC
Instruction
op
Fetch
func
Register
Fetch
Main
ExtOp
Control
Ext ALUSrc
ALUctr
ALU
Zero
Mem
control
MemtoReg
Access
RegDst
Data Reg.
Mem Wrt RegWr
MemWr
Result Store
Abstract View of a Single Cycle Processor
2
Single Cycle Implementation
Calculate cycle time assuming negligible delays except:
memory (200ps),
ALU and adders (100ps),
register file access (50ps)
PCSrc
M
Add u
x
ALU
4 Add
result
Shift
left 2
16 32 MemRead
Sign
extend
wasteful of area
One Solution:
use a “smaller” cycle time
a “multicycle” datapath:
Instruction
register
PC Address Data
A
Instruction Register #
Registers ALU ALUOut
Memory or data
Register #
Memory B
Data data Register #
register
storage cell
address
bit line
address
decoder sense amps
mem. bus
proc. bus
L2 memory
Processor Cache
Cache
1 time-period
20 - 200 time-periods
2-8 time-periods
Acyclic
Combinational
Acyclic
Logic (A)
Combinational
Logic
storage element
Acyclic
Combinational
Logic (B)
storage element
storage element
MemRd
MemWr
MemtoRer
RegDst
nPC_sel
RegWr
ALUSrc
ALUctr
ExtOp
Reg.
Result Store
Instruction
Operand
File
Exec
Access
Next PC
Fetch
Fetch
Mem
PC
Mem
Data
Designing a multicycle processor 8
Partitioning the CPI=1 Datapath
Add registers between smallest steps
MemtoReg
Zero
nPC_sel
ALUSrc
MemWr
MemRd
RegWr
Reg. RegDst
ExtOp
ALUctr
Instruction
Operand
Next PC
Access
File
Fetch
Exec
Fetch
Mem
PC
Mem
Data
Place enables on all registers
PC 0
Instruction Read 0
M
u Address [25–21] register 1 M
x Read u
A x
1 Instruction data 1
Memory Read 1 Zero
[20–16] register 2
MemData 0 ALU ALU
Instruction M Registers ALUOut
[15–0] Instruction u Write result
Read
Write [15–11] x register B 0
data 2
data Instruction 1 4 1M
register Write u
0 data 2 x
Instruction M 3
[15–0] u
x
1
Memory 16 32
Sign Shift
data extend left 2
register
Instruction [5–0]
instruction?
We’ll use a finite state machine for control
A <= Reg[IR[25:21]]
B <= Reg[IR[20:16]]
ALUOut <= A op B
Break it down into steps following our rule that data flows through at
most one major functional unit (e.g., balance work across steps)
Write-back step
IR <= Memory[PC];
PC <= PC + 4;
A <= Reg[IR[25:21]];
B <= Reg[IR[20:16]];
ALUOut <= PC + (sign-extend(IR[15:0]) << 2);
ALUOut <= A op B;
Branch:
The write actually takes place at the end of the cycle on the edge
Write-back step 5
lw $t2, 0($t3)
lw $t3, 4($t3)
beq $t2, $t3, Label #assume not taken
add $t5, $t2, $t3
sw $t5, 8($t3)
Label: ...
use microprogramming
Next
state
Current state Next-state
function
Clock
Inputs
Output
Outputs
function
Instruction [5–0]
Note:
MemRead
0 1
ALUSrcA = 0
IorD = 0
IRWrite ALUSrcA = 0
Start ALUSrcB = 11
don’t care if not mentioned
ALUSrcB = 01
ALUOp = 00 ALUOp = 00
PCWrite
PCSource = 00
asserted if name only
otherwise exact value
Memory read
completon step
4
RegDst = 10
RegWrite
MemtoReg = 01
P C W rite C o n d
Io rD
M em R e ad
M e m W r ite
IR W rit e
C o n tro l l o g ic
M e m to R e g
P C S o u rce
ALUOp
O u tp u ts
A L U S rc B
A L U S rc A
R e g W rite
R egD st
NS3
NS2
NS1
I n p u ts
NS0
O p2
O p1
O p0
O p4
O p3
O p5
S2
S1
S0
S3
I n s tru c t io n re g is te r S ta te r e g is te r
o p c o d e fie ld
Op4
Op3
Op2
Op1
Op0
S3
S2
S1
S0
PCWrite
PCWriteCond
IorD
MemRead
MemWrite
IRWrite
MemtoReg
PCSource1
PCSource0
ALUOp1
ALUOp0
ALUSrcB1
ALUSrcB0
ALUSrcA
RegWrite
RegDst
NS3
NS2
NS1
NS0
our outputs are the bits of data that the address points to.
0 0 0 0 0 1 1
0 0 1 1 1 0 0
m n 0 1 0 1 1 0 0
0 1 1 1 0 0 0
1 0 0 0 0 0 0
1 0 1 0 0 0 1
1 1 0 0 1 1 0
1 1 1 0 1 1 1
PLA cells usually about the size of a ROM cell (slightly bigger)
O u tp u ts M e m to R e g
P C S o u rc e
A LU O p
A L U S rc B
A L U S rc A
R e g W r it e
R eg D st
In p u t A d d rC tl
S ta te
A d der
A d d r e s s s e le c t l o g ic
O p [5 – 0 ]
In s tr u c t io n r e g is te r
o p c o d e fie ld
101011 sw 0010
1
State
Adder
Mux AddrCtl
3 2 1 0
Control Biggest challenge for early
Instruction Control Lines Condition? computer designers was getting
control circuitry correct
Maurice Wilkes invented the
Datapath
Inst. Reg.
Registers
ALU
idea of microprogramming to
PC
design the control unit of a
processor for EDSAC‐II, 1958
Busy? Address Data
Main Memory
(holds fixed µcode
Decoder
instructions)
Control Lines
Datapath
Address Data
Main Memory
(holds user program written in macroinstructions, e.g., x86, RISC‐V)
Designing a multicycle processor 38
Microprogramming
Control unit PCWrite
PCWriteCond
IorD
Microcode memory MemRead Datapath
MemWrite
IRWrite
BWrite
Outputs MemtoReg
PCSource
ALUOp
ALUSrcB
ALUSrcA
RegWrite
RegDst
AddrCtl
Input
1
Microprogram counter
Adder
Instruction register
opcode field
Control
Control I/O
interface
Instruction cache
Data
cache
Enhanced
floating point
and multimedia Integer
datapath
Secondary
cache
and
memory
Control interface
IBM PowerPC, …
Most instructions executed directly, i.e., with hard-
wired control
Infrequently-used and/or complicated instructions
invoke microcode
Exploits recurring control PC (state)
µcode
signal patterns in µcode, next‐state
e.g., µaddress
µcode ROM
ALU0 A Reg[rs1]
... nanoaddress
ALUI0 A Reg[rs1]
... nanoinstruction ROM
data
Motorola 68000 had 17-bit µcode containing either 10-bit µjump or 9-bit
nanoinstruction pointer
Nanoinstructions were 68 bits wide, decoded to give 196 control
signals
Control I/O
interface
Instruction cache
Data
cache
Enhanced
floating point
and multimedia Integer
datapath
Secondary
cache
and
memory
Control interface
Advanced pipelining
Control
hyperthreading support
Arvind (MIT)