
The Processor Design

• The Processor Design and Main Memory System

[Figure: CPU and Memory blocks connected by Address, Data, and Control lines]
• CPU issues address (and data for write)
• Memory returns data for read
• Memory returns acknowledgment for write

These slides are based on the course taught by Dr. Somani at Iowa State University using
the textbook by Patterson and Hennessy titled “Computer Organization and Design.” They use
a combination of slides provided as teaching aids with the book and variations of them
created by Dr. Somani over multiple years, before and after the adoption of the book.

Datapath & Control Design

• We will design a simplified MIPS processor


• The instructions supported are
– memory-reference instructions: lw, sw
– arithmetic-logical instructions: add, sub, and, or, slt
– control flow instructions: beq, j
– We will add additional instructions: bne, addi, sll, jr, jal
• Generic Implementation:
– Use program counter (PC) to point to instruction address
– Get the instruction from memory
– Read registers
– Use the instruction to decide exactly what to do
• All instruction classes except jump use the ALU after reading the registers
– Consider lw and sw, arithmetic/logic, and control flow

MIPS Instruction Format

Format        31-26         25-21   20-16   15-11   10-6           5-0
LW            LW            REG 1   REG 2   LOAD ADDRESS OFFSET (bits 15-0)
SW            SW            REG 1   REG 2   STORE ADDRESS OFFSET (bits 15-0)
BEQ/BNE       BEQ/BNE       REG 1   REG 2   BRANCH ADDRESS OFFSET (bits 15-0)
R-TYPE        R-TYPE        REG 1   REG 2   DST     SHIFT AMOUNT   ADD/AND/OR/SLT
I-TYPE        I-TYPE        REG 1   REG 2   IMMEDIATE DATA (bits 15-0)
JUMP/JR/JAL   JUMP/JR/JAL   JUMP ADDRESS (bits 25-0)
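
The field boundaries above can be sketched as shifts and masks. This is a minimal illustration, not part of the original slides; the function name decode_fields and the dictionary keys are assumptions chosen for the example.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its fixed bit fields."""
    return {
        "op": (word >> 26) & 0x3F,      # bits 31-26
        "rs": (word >> 21) & 0x1F,      # bits 25-21 (REG 1)
        "rt": (word >> 16) & 0x1F,      # bits 20-16 (REG 2)
        "rd": (word >> 11) & 0x1F,      # bits 15-11 (DST, R-type only)
        "shamt": (word >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct": word & 0x3F,           # bits 5-0   (R-type only)
        "imm16": word & 0xFFFF,         # bits 15-0  (offset / immediate)
        "addr26": word & 0x3FFFFFF,     # bits 25-0  (jump address)
    }


# lw $1, 100($2): op = 35, rs = 2, rt = 1, offset = 100
fields = decode_fields((35 << 26) | (2 << 21) | (1 << 16) | 100)
```

Every field is extracted regardless of format; the control logic decides later which ones are meaningful for a given opcode.
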

Instruction Execution

• Instruction read: PC → instruction memory, fetch instruction
• Register read: register numbers → register file, read registers

Then, depending on instruction class

• Execute: Use ALU to calculate
– Arithmetic result
– Memory address for load/store
– Branch target address
• Memory access: Access data memory for load/store
• Result writeback: Write data back to registers

PC update (for all): PC ← target address or PC + 4

What Blocks We Need

• We need an ALU
– We have already designed that
• We need memory to store instructions and data
– Instruction memory takes an address and supplies an instruction
– Data memory takes an address and supplies data for lw
– Data memory takes an address and data to write into memory for sw
• We need to manage a PC and its update mechanism
• We need a register file with 32 registers
– We read two operands and write a result back to the register file
• Sometimes part of the operand comes from the instruction
• We may add support for the immediate class of instructions
• We may add support for J, JR, JAL

Simple Implementation

• Include the functional units we need for each instruction

[Figure: the functional units. Instruction memory (instruction address in, instruction out); program counter (PC); adder (Add, Sum); data memory unit (Address, Write data, Read data, with MemWrite and MemRead controls); sign-extension unit (16 → 32 bits); register file (two 5-bit read register numbers, a 5-bit write register number, Write data, Read data 1, Read data 2, with RegWrite control); ALU (3-bit ALU control, Zero, ALU result)]

Why do we need this stuff?

A Review of Combinational & Sequential Circuits
Two main classes of circuits:
1. Combinational circuits
• Circuits without memory
• Outputs depend only on current input values
2. Sequential circuits (also called finite state machines)
• Circuits with memory
• Memory elements to store the state of the circuit
• The state represents the input sequence in the past
• Outputs depend on both circuit state and current inputs

March 2021 Seminar series by Dr. Arun K. Somani

4Mx64-bit Memory using 1Mx4 memory

[Figure: four banks (B0 to B3), each built from sixteen 1Mx4 chips (chips 15 to 0) that together hold a 64-bit word, with 4-bit Data in and Data out lines per chip. Address bits 24-23 are the bank address and feed a decoder that selects B3 to B0; bits 22-13 are the row address and bits 12-3 the column address (20 address lines, sent to all chips); bits 2-0 are the byte address that selects a byte in the 64-bit word]

A Register
• A memory element (flip-flop) is a one-bit storage
• A flip-flop stores a new data bit when a clock edge arrives
• The triggering clock edge may be a rising edge or a falling edge
• A register contains multiple flip-flops; a 4-bit register is shown
• All clock inputs are connected together to one clock
• Each flip-flop gets a different input

[Figure: 4-bit register made of four D flip-flops with inputs D0 to D3, outputs Q0 to Q3, and a common Clock]

A Shift Register

• Flip-flops also can be connected to operate as a shift register


• All clock signals are connected together to one clock
• First flip flop gets a new input
• Others get input from previous flip-flop
• A 4-bit shift register is shown below

[Figure: 4-bit shift register; input D0 feeds the first flip-flop, each Q output feeds the D input of the next flip-flop, outputs are Q0 to Q3, and all flip-flops share one Clock]

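
The shifting behavior described above can be sketched as a list update per clock edge. This is an illustrative model only; the function name shift_register_step and the list representation (index 0 = Q0) are assumptions, not something defined on the slide.

```python
def shift_register_step(state, d_in):
    """One clock edge: Q0 takes the new input, every other Qi takes the old Q(i-1)."""
    return [d_in] + state[:-1]


state = [0, 0, 0, 0]          # Q0, Q1, Q2, Q3
for bit in [1, 0, 1, 1]:      # serial input, one bit per clock edge
    state = shift_register_step(state, bit)
```

Building the new list from the old one mirrors the hardware: all flip-flops sample their inputs on the same edge, so each sees its neighbor's previous value.
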
A Parallel-Access Register
• We add logic to a register to create different device behaviors
• A register or shift register holds data value for one clock period
• A parallel-access register can hold data values for longer
• Alternately, it can store new data if so needed
• An LD signal selects between loading new data and holding the old data

[Figure: parallel-access register block with data inputs IN3 to IN0, control inputs LD and Clock, and outputs Q3 to Q0]

Implementation of A Parallel-Access Register


• At the input of each D flip-flop, a 2-to-1 MUX is used to select whether to load
a new input or to retain the old value
• All flip-flops get the same clock signal
• Signal LD = 1 means load new value
• Signal LD = 0 means retain old value

[Figure: one bit slice, in which a 2-to-1 MUX controlled by LD selects between IN (input 1) and the flip-flop's own Q output (input 0) and drives D. The 4-bit version repeats this slice for IN3 to IN0 with shared LD and Clock, producing Q3 to Q0]

A Register File
• Register file is a unit containing r (4 to 32) registers
• Each register has n (4 to 32) bits
• Output ports are used for reading the register file
– DATA1 and DATA2
– Any register can be read to any of the ports
– Each port uses log2(r) bits to specify the address
– RA1 and RA2 are read addresses
• Input port is used to write new data into a register
– LD_DATA is the new data
– WA (log2(r) bits) specifies the write address
– Writing is enabled by the WR signal

[Figure: register file block with inputs RA1, RA2, LD_DATA, WA, and WR, and outputs DATA1 and DATA2]

A Register File Design Details

• We will design an eight-register file with 4-bit wide registers
• A single 4-bit register and its abstraction are shown below

[Figure: a 4-bit register built from four D flip-flops with inputs D3 to D0, outputs Q3 to Q0, shared LD and Clock, next to its block symbol]

• We need eight registers to make an eight-register file

[Figure: eight such register blocks, each with its own LD input and a shared Clk]

• How many bits are required to specify a register address?

Reading Circuit
• A 3-bit register address, RA, specifies which register is to be read
• For each output port, we need one 8-to-1 4-bit multiplexer

[Figure: registers 000 to 111 with load signals LD0 to LD7; their Q outputs feed two 8-to-1 4-bit multiplexers, selected by RA1 and RA2, that produce DATA1 and DATA2]

Adding Write Control to Register File


• To write in a register, specify WA, WR signal, and new data
• A 3-bit write address is decoded if write register signal is present
• Only one of the eight registers gets a LD signal from decoder

[Figure: a 3-to-8 decoder, enabled by WR, decodes WA into load signals LD0 to LD7; LD_DATA feeds the D inputs of all eight registers (000 to 111); two 8-to-1 4-bit multiplexers selected by RA1 and RA2 produce DATA1 and DATA2]

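
The read multiplexers and the write decoder can be modeled behaviorally as below. This is a rough sketch for the eight-register, 4-bit file of these slides; the class name RegisterFile and its method names are assumptions made for the example.

```python
class RegisterFile:
    """Eight 4-bit registers with two read ports and one write port."""

    def __init__(self, num_regs=8, width=4):
        self.mask = (1 << width) - 1
        self.regs = [0] * num_regs

    def read(self, ra1, ra2):
        # Each read port acts as an 8-to-1 multiplexer selected by its read address.
        return self.regs[ra1], self.regs[ra2]

    def clock(self, wa, ld_data, wr):
        # 3-to-8 decoder gated by WR: at most one LD line is 1.
        ld = [int(bool(wr) and wa == i) for i in range(len(self.regs))]
        for i, load in enumerate(ld):
            if load:
                self.regs[i] = ld_data & self.mask
        return ld


rf = RegisterFile()
ld_lines = rf.clock(wa=3, ld_data=0b1010, wr=1)  # write 1010 into register 3
rf.clock(wa=5, ld_data=0b0111, wr=0)             # WR = 0: register 5 keeps its value
data1, data2 = rf.read(ra1=3, ra2=5)
```

Reads are combinational in this model, while writes happen only when clock() is called, which mirrors the split between the multiplexers and the clocked registers.
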
Arithmetic/Logic Unit


4-Bit Adder

4-Bit Carry-Look Ahead Adder


Building A 32-bit ALU


[Figure: a 32-bit ALU built from 32 one-bit ALUs (ALU0 to ALU31). Each one-bit ALU takes a_i, b_i, and CarryIn and produces Result_i and CarryOut; each CarryOut feeds the CarryIn of the next bit; the Operation lines go to every one-bit ALU]

A 32-bit ALU

• A ripple-carry ALU
• Two bits decide the operation
– Add/Sub
– AND
– OR
– LESS
• One bit decides between add and subtract
• A carry-in bit
• Bit 31 generates overflow

Test for equality

• Notice control lines (Bnegate, Operation):

000 = and
001 = or
010 = add
110 = subtract
111 = slt

• Note: Zero is a 1 when the result is zero!

[Figure: ALU0 to ALU31 in a ripple-carry chain with inputs a0-a31 and b0-b31 and outputs Result0-Result31. The Set output of ALU31 feeds the Less input of ALU0; the Less inputs of the other bits are 0; ALU31 also produces Overflow; Zero is derived from all Result bits]

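
A behavioral sketch of the five control codes and the Zero output follows. It models the ALU at the word level rather than bit by bit, and the names alu, to_signed, and MASK are assumptions introduced for the example.

```python
MASK = 0xFFFFFFFF  # keep results to 32 bits


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for the 3-bit control codes listed on the slide."""
    a &= MASK
    b &= MASK
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control == 0b010:
        result = (a + b) & MASK
    elif control == 0b110:
        result = (a - b) & MASK
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unsupported ALU control code")
    return result, int(result == 0)


equal_check = alu(0b110, 7, 7)  # subtract: the Zero output signals equality
```

Subtracting equal operands sets Zero, which is how beq can reuse the ALU for its comparison.
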
Ripple Carry Adder is Slow

• Is a 32-bit ALU as fast as a 1-bit ALU?


• Is there more than one way to do addition?
– two extremes: ripple carry and sum-of-products
• Can you see the ripple? How to get rid of it?

c1 = b0c0 + a0c0 + a0b0
c2 = b1c1 + a1c1 + a1b1        c2 =
c3 = b2c2 + a2c2 + a2b2        c3 =
c4 = b3c3 + a3c3 + a3b3        c4 =

Why can't we continue this way?

Carry-look-ahead (CLA) adder

• This is an approach in-between the two extremes


• Motivation:
– If we didn't know the value of carry-in, what could we do?
– When would we always generate a carry? gi = ai bi
– When would we propagate the carry? pi = ai + bi
• Now all carry equations depend only on p, g, and c0

c1 = g0 + p0c0
c2 = g1 + p1c1        c2 = g1 + p1g0 + p1p0c0
c3 = g2 + p2c2        c3 = g2 + p2g1 + p2p1g0 + p2p1p0c0
c4 = g3 + p3c3        c4 = g3 + p3g2 + p3p2g1 + p3p2p1g0 + p3p2p1p0c0

However, this will require bigger and bigger gates.

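
The expanded carry equations can be checked against the ripple recurrence with a short sketch. The function names ripple_carries and lookahead_carries are assumptions for this example; bits are given as lists with index 0 as the least significant bit.

```python
def ripple_carries(a, b, c0):
    """Carries c1..c4 from the ripple recurrence c(i+1) = bi*ci + ai*ci + ai*bi."""
    carries, c = [], c0
    for ai, bi in zip(a, b):
        c = (bi & c) | (ai & c) | (ai & bi)
        carries.append(c)
    return carries


def lookahead_carries(a, b, c0):
    """Carries c1..c4 computed only from g, p, and c0 (expanded CLA equations)."""
    g = [ai & bi for ai, bi in zip(a, b)]   # generate
    p = [ai | bi for ai, bi in zip(a, b)]   # propagate
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]


# 0b0111 + 0b0001 with c0 = 0; bits listed LSB first
example = lookahead_carries([1, 1, 1, 0], [1, 0, 0, 0], 0)
```

Each lookahead carry is a two-level AND/OR expression, so none of them waits on another carry; the cost is the growing number of inputs per gate.
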
A 4-bit CLA Adder

• Generate g and p terms for each bit
• Use g’s, p’s and c0 to generate all c’s
• Use them to generate block G and P
• CLA principle can be used recursively

Build Bigger Adders


• A 16-bit adder uses
– Four 4-bit adders
• The carry-lookahead unit takes block g and p terms and cin to generate block carry bits out
• Block carries are used to generate bit carries
– could use ripple carry of 4-bit CLA adders
– Better: use the CLA principle again!

[Figure: four 4-bit ALUs (ALU0 to ALU3) produce Result0-3, Result4-7, Result8-11, and Result12-15 from a0-a15 and b0-b15. Each sends block signals P0-P3 and G0-G3 to a carry-lookahead unit, which takes CarryIn and returns block carries C1 to C4; C4 is CarryOut]

CLA Adders Delay

• 4-Bit case
– Generation of g and p: 1 gate delay
– Generation of carries (and G and P): 2 gate delays
– Generation of sum: 1 more gate delay

• 16-Bit case
– Generation of g and p: 1 gate delay
– Generation of block G and P: 2 more gate delays
– Generation of block carries: 2 more gate delays
– Generation of bit carries: 2 more gate delays
– Generation of sum: 1 more gate delay

• 64-Bit case
– 12 gate delays
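
The three delay figures follow one pattern, sketched below under the assumption that every level of the hierarchy groups four blocks and costs two gate delays on the way up (block G and P) and two on the way down (carries). The function name cla_gate_delays is hypothetical.

```python
def cla_gate_delays(bits):
    """Gate delays of a hierarchical CLA adder built from groups of four."""
    levels, width = 0, 1
    while width < bits:
        width *= 4
        levels += 1
    if width != bits or levels == 0:
        raise ValueError("bits must be 4, 16, 64, ...")
    bit_gp = 1                   # g and p for every bit
    block_gp = 2 * (levels - 1)  # block G and P, once per level above the first
    carries = 2 * levels         # carries, once per level coming back down
    sum_bit = 1                  # final sum
    return bit_gp + block_gp + carries + sum_bit


delays = {n: cla_gate_delays(n) for n in (4, 16, 64)}
```

Under this model the delay grows by four gate delays each time the width is multiplied by four, which reproduces the 4, 8, and 12 quoted above.
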

Multiple Shift Types


Input a[n-1:0] (MSB a[n-1], LSB a[0]); output out[n-1:0]

out[n-1] (MSB)   out[0] (LSB)   Shift type
a[n-2]           shiftIn        Logic shift left with shiftIn
a[n-2]           0              Logic shift left
a[n-2]           a[n-1]         Rotate left
a[n-2]           Dv             Divide shift left
a[n-1]           shiftIn        Arithmetic shift left with shiftIn
a[n-1]           0              Arithmetic shift left
a[n-1]           a[0]           No shift
a[n-1]           a[1]           Arithmetic shift right
shiftIn          a[1]           Logic shift right with shiftIn
0                a[1]           Logic shift right
a[0]             a[1]           Rotate right
ms               a[1]           Multiply shift right

Instruction Fetch

PC register (32 bits), instruction memory, 32-bit adder (to increment PC by 4)

Load/Store Instructions

• Read register operands


• Calculate address using 16-bit offset
– Use ALU, but sign-extend offset
• Load: Read memory and update register
• Store: Write register value to memory

[Figure: a. Registers (5-bit read and write register numbers, Write data, Read data 1 and 2, RegWrite); b. ALU (3-bit ALU control, Zero, ALU result)]

Data memory and sign extender

R-Format Instructions

• Read two register operands


• Perform arithmetic/logical operation
• Write register result

Register file and ALU

Branch Instructions

Simple Implementation: Connecting Elements

• Abstract / Simplified View:

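[Figure: the PC supplies an address to the instruction memory; the instruction supplies register numbers to the register file; register data feeds the ALU; the ALU result is the address for the data memory; data from the ALU or the data memory returns to the registers]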

• Two types of functional units:


– elements that operate on data values (combinational)
• Example: ALU
– elements that contain state (sequential)
• Examples: Program and Data memory, Register File

CPU Overview with PC logic

A sketchy view:
Next sequential PC = PC + 4
Branch target = (PC + 4) + offset

An instruction changes
1. PC (all instructions, branch and jump more complex)
2. Register (arithmetic/logic, load)
3. Memory (store)

Branch Instructions

• Read register operands


• Compare operands
– Use ALU, subtract and check Zero output
• Calculate target address
– Sign-extend displacement
– Shift left 2 places (word displacement)
– Add to PC + 4
• Already calculated by instruction fetch

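
The target calculation listed above can be sketched in a few lines. The helper names sign_extend16, branch_target, and next_pc are assumptions for the example, and the 32-bit wraparound mask is added only to keep values in range.

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit displacement to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """(PC + 4) plus the word displacement shifted left 2 places."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def next_pc(pc, imm16, branch, zero):
    """Take the branch only when the instruction is a beq and the ALU reports Zero."""
    if branch and zero:
        return branch_target(pc, imm16)
    return (pc + 4) & 0xFFFFFFFF


forward = branch_target(0x1000, 3)        # three instructions past PC + 4
backward = branch_target(0x1000, 0xFFFF)  # displacement of -1 word
```

The displacement counts instructions, not bytes, which is why it is shifted left by 2 before the addition.
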

Making Connection: Multiplexing

• Can’t just join wires together
• Use multiplexers

Building the Datapath

• Use multiplexors to stitch them together

[Figure: datapath combining instruction fetch (PC, instruction memory, Add with 4), the register file, sign extend (16 → 32), Shift left 2 with the branch adder, the ALU (3-bit ALU operation, Zero, ALU result), and the data memory. Multiplexors are controlled by PCSrc, ALUSrc, and MemtoReg; the other control signals are RegWrite, MemRead, and MemWrite]

A Complete Datapath for Basic Instructions

• Lw, Sw, Add, Sub, And, Or, Slt can be performed


• For j (jump) we need an additional multiplexor

[Figure: complete datapath. Instruction [25–21] and Instruction [20–16] select the read registers; a RegDst multiplexor chooses Instruction [20–16] or Instruction [15–11] as the write register; Instruction [15–0] is sign-extended from 16 to 32 bits; ALUSrc selects the second ALU input; the ALU control block takes Instruction [5–0] and ALUOp; MemtoReg selects the register write data; PCSrc selects PC + 4 or the branch target. Other control signals: RegWrite, MemRead, MemWrite]

Control

Control signals: mux select, read/write enable, ALU opcode, etc.

What Else is Needed in Data Path

• Support for j and jr
– For both of them the PC value comes from somewhere else
– For j, the PC is formed from 4 bits (31:28) of the old PC, the 26-bit address field of the instruction placed in bits 27:2, and two zero bits (1:0)
– For jr, the PC value comes from a register
• Support for jal
– The address is formed as for j
– The return address (PC + 4) needs to be saved in register 31
• And what about immediate operand instructions?
– The second operand comes from the instruction, but without shifting
• Support for register writes by other instructions such as lw and immediate instructions

Our Simple Control Structure

• All of the logic is combinational


• We wait for everything to settle down, and the right thing to be done
– ALU might not produce “right answer” right away
– we use write signals along with clock to determine when to write
• Cycle time determined by length of the longest path

[Figure: State element 1 → Combinational logic → State element 2, all within one clock cycle]

We are ignoring some details like setup and hold times

Operation for Each Instruction

Step  LW                      SW                      R/I/S-Type                   BR-Type                 JMP-Type
1.    READ INST               READ INST               READ INST                    READ INST               READ INST
2.    READ REG 1, READ REG 2  READ REG 1, READ REG 2  READ REG 1, READ REG 2       READ REG 1, READ REG 2  -
3.    ADD REG 1 + OFFSET      ADD REG 1 + OFFSET      OPERATE on REG 1 / REG 2     SUB REG 2 from REG 1    -
4.    READ MEM                WRITE MEM               -                            -                       -
5.    WRITE REG2              -                       WRITE DST                    -                       -

Data Path Operation

[Figure: single-cycle datapath. The PC addresses the instruction memory (IA in, INST 31-00 out). Instruction bits 25-21 and 20-16 drive RA1 and RA2 of the register file; a MUX controlled by RDES chooses bits 20-16 or 15-11 as WA. Bits 15-00 pass through Sign Ext; a MUX controlled by ALU SRC chooses RD2 or the sign-extended value as the second ALU input. Bits 05-00 and ALUOP drive ALU CON. The ALU result is the data memory address (MA) and RD2 is the memory write data (WD); MR and MW enable memory read and write. A MUX controlled by Memreg chooses the ALU result or the memory data (MD) as register write data, with WE as the register write enable. One adder computes PC + 4 and a second adds the sign-extended offset shifted left 2; br AND zero selects the branch target, and jmp selects the jump address formed from bits 25-00. CONTROL decodes bits 31-26]

Control Points

[Figure: the same datapath diagram, shown with its control points]

LW Instruction Operation

[Figure: the same datapath diagram, shown for the lw instruction]

SW Instruction Operation

[Figure: the same datapath diagram, shown for the sw instruction]

R-Type Instruction Operation

[Figure: the same datapath diagram, shown for R-type instructions]

BR-Instruction Operation

[Figure: the same datapath diagram, shown for branch instructions]

Jump Instruction Operation

[Figure: the same datapath diagram, shown for the jump instruction]

Control

• For each instruction


– Select the registers to be read (always read two)
– Select the 2nd ALU input
– Select the operation to be performed by ALU
– Select if data memory is to be read or written
– Select what is written and where in the register file
– Select what goes in PC
• Information comes from the 32 bits of the instruction
• Example: add $8, $17, $18

Instruction format:

000000   10001   10010   01000   00000   100000
op       rs      rt      rd      shamt   funct

Adding Control to DataPath

[Figure: the datapath with the main Control unit added. Control decodes Instruction [31–26] and generates RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; the ALU control block uses Instruction [5–0] and ALUOp; Branch is combined with the ALU Zero output to select the next PC]

Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
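
The control table can be written down as a lookup keyed by opcode. This is a sketch only: the names CONTROL, OPCODES, and main_control are assumptions, don't-care entries (X) are represented as None, and the opcode values used are the standard MIPS ones (0 for R-format, 35 for lw, 43 for sw, 4 for beq).

```python
X = None  # don't care

CONTROL = {
    "R-format": dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                     MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    "lw":       dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                     MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    "sw":       dict(RegDst=X, ALUSrc=1, MemtoReg=X, RegWrite=0,
                     MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    "beq":      dict(RegDst=X, ALUSrc=0, MemtoReg=X, RegWrite=0,
                     MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}

OPCODES = {0: "R-format", 35: "lw", 43: "sw", 4: "beq"}


def main_control(opcode):
    """Return the control signal values for a 6-bit opcode."""
    return CONTROL[OPCODES[opcode]]


lw_signals = main_control(35)
```

In hardware the don't-care entries let the logic be simplified; in this model they simply mark signals whose value has no effect for that instruction.
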

Summary of Control Signals

• RegDst: Write to register rt or rd?
• ALUSrc: Immediate to ALU?
• MemtoReg: Write memory data or ALU output to the register?
• RegWrite: Write to regfile at all?
• MemRead: Read from Data Memory?
• MemWrite: Write to the Data Memory?
• Branch: Is it a branch instruction?
• ALUOp[1:0]: ALU control field

ALU Control

• ALU's operation is based on instruction type and function code
– e.g., what should the ALU do with this instruction?
• Example: lw $1, 100($2)

35    2     1     100
op    rs    rt    16-bit offset

• ALU control input

000 AND
001 OR
010 add
110 subtract
111 set-on-less-than

• Why is the code for subtract 110 and not 011?

R-Type Instruction

R-Type: Control Signals

RegDst 1 (write to rd)
ALUSrc 0 (no immediate)
MemtoReg 0 (write data not from memory)
RegWrite 1 (does write regfile)
MemRead 0 (no memory read)
MemWrite 0 (no memory write)
Branch 0 (not a branch)
ALUOp[1:0] 10 (R-type ALU op)

Load Instruction

Load: Control Signals

RegDst 0
ALUSrc 1
MemtoReg 1
RegWrite 1
MemRead 1
MemWrite 0
Branch 0
ALUOp[1:0] 00


Store: Control Signals

RegDst X
ALUSrc 1
MemtoReg X
RegWrite 0
MemRead 0
MemWrite 1
Branch 0
ALUOp[1:0] 00

Branch-on-Equal Instruction

BEQ: Control Signals

RegDst X
ALUSrc 0
MemtoReg X
RegWrite 0
MemRead 0
MemWrite 0
Branch 1
ALUOp[1:0] 01

Other Control Information

• Must describe hardware to compute 3-bit ALU control input
– given instruction type (ALUOp, computed from instruction type)
00 = lw, sw
01 = beq
10 = arithmetic
11 = Jump
– function code for arithmetic
• Control can be described using a truth table:

ALUOp1  ALUOp0  F5  F4  F3  F2  F1  F0  Operation
0       0       X   X   X   X   X   X   010
X       1       X   X   X   X   X   X   110
1       X       X   X   0   0   0   0   010
1       X       X   X   0   0   1   0   110
1       X       X   X   0   1   0   0   000
1       X       X   X   0   1   0   1   001
1       X       X   X   1   0   1   0   111
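
The truth table can be sketched as a small function. The name alu_control and the use of a dictionary for the funct rows are assumptions; only F3 to F0 are examined, as in the table, and ALUOp0 is checked before ALUOp1, which is one way of resolving the overlapping don't-care rows.

```python
def alu_control(alu_op, funct):
    """Map the 2-bit ALUOp and the 6-bit funct field to the 3-bit ALU operation."""
    if alu_op == 0b00:        # lw, sw: add to form the address
        return 0b010
    if alu_op & 0b01:         # row "X 1" (beq): subtract
        return 0b110
    # Rows "1 X": R-type, decided by funct bits F3..F0
    by_funct = {
        0b0000: 0b010,  # add
        0b0010: 0b110,  # subtract
        0b0100: 0b000,  # AND
        0b0101: 0b001,  # OR
        0b1010: 0b111,  # set-on-less-than
    }
    return by_funct[funct & 0xF]


add_op = alu_control(0b10, 0b100000)  # funct of add
```

A funct value outside the table raises KeyError in this sketch, whereas the hardware would simply produce some output for it.
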

Implementation of Control

• Simple combinational logic to realize the truth tables

[Figure: two logic blocks. Main control: inputs Op5 to Op0 are decoded into R-format, lw, sw, and beq, which produce the outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, and ALUOp0. ALU control block: inputs ALUOp1, ALUOp0, and F3 to F0 of the funct field F(5–0); outputs Operation2, Operation1, and Operation0]

Timing: Single Cycle Implementation

• Calculate cycle time assuming negligible delays except:
– memory (2ns), ALU and adders (2ns), register file access (1ns)

[Figure: the complete single-cycle datapath with its control signals (PCSrc, RegWrite, RegDst, ALUSrc, ALUOp, MemRead, MemWrite, MemtoReg)]
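
One way to work the calculation is to sum the delays of the units each instruction class passes through and take the maximum. The unit lists per class below are an assumption based on the usual textbook analysis, not something stated on the slide, and the names DELAY_NS and PATHS are hypothetical.

```python
DELAY_NS = {"imem": 2, "reg_read": 1, "alu": 2, "dmem": 2, "reg_write": 1}

# Units on the critical path of each instruction class (assumed).
PATHS = {
    "R-format": ["imem", "reg_read", "alu", "reg_write"],
    "lw":       ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw":       ["imem", "reg_read", "alu", "dmem"],
    "beq":      ["imem", "reg_read", "alu"],
    "j":        ["imem"],
}

latency_ns = {name: sum(DELAY_NS[u] for u in units) for name, units in PATHS.items()}
cycle_time_ns = max(latency_ns.values())  # the clock must fit the slowest instruction
```

With these numbers lw is the longest path, so a single-cycle clock has to be long enough for lw even though the other classes finish earlier.
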

Implementing Jumps

Jump instruction fields: opcode 2 (bits 31:26) | address (bits 25:0)

• Jump uses word address
• Update PC with concatenation of
– Top 4 bits of old PC
– 26-bit jump address
– 00
• Need an extra control signal decoded from opcode

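
The concatenation can be sketched with masks and a shift. The function name jump_target is an assumption; the sketch takes the top 4 bits from PC + 4, as MIPS defines it, which matches the "old PC" wording on the slide except when PC + 4 crosses a 256 MB boundary.

```python
def jump_target(pc, instr):
    """Top 4 bits of PC + 4, then the 26-bit address field, then 00."""
    upper = (pc + 4) & 0xF0000000
    addr26 = instr & 0x03FFFFFF
    return upper | (addr26 << 2)


# j with address field 0x0100000, fetched from PC = 0x00400000
target = jump_target(0x00400000, (2 << 26) | 0x0100000)
```

Shifting the field left by 2 supplies the two zero bits, since the field holds a word address.
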
Datapath With Jumps Added


Extend Single-Cycle MIPS

Consider the following instructions


• bne: branch if not equal
• jr: Jump register
• addi: add immediate
• sll: Shift left logical by a constant
• jal: Jump and link

Control Signals

• What are the control signal values for each instruction or type?

Inst  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
R-    1       0       0         1         0        0         0       1       0       0
lw    0       1       1         1         1        0         0       0       0       0
sw    X       1       X         0         0        1         0       0       0       0
beq   X       0       X         0         0        0         1       0       1       0
j     X       X       X         0         0        0         0       X       X       1

Note: “R-” means R-format

ADDI

addi rt, rs, immediate

001000   rs      rt      immediate
31:26    25:21   20:16   15:0

R[rt] = R[rs] + SignExtImm

• Read register operands (only one is used)
• Sign extend the immediate (in parallel)
• Perform arithmetic/logical operation
• Write register result

ADDI Changes

What changes to this baseline?

ADDI Datapath

ADDI Control Signals

• Like LW
– I-format instruction
– Write to register[rt]
– Use add operation
• Like R-format arithmetic
– Write ALU result to register file

Inst  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-    1       0       0         1         0        0         0       1       0
lw    0       1       1         1         1        0         0       0       0
sw    X       1       X         0         0        1         0       0       0
beq   X       0       X         0         0        0         1       0       1
addi  0       1       0         1         0        0         0       0       0

SLL

sll rd, rt, shamt

000000   rs      rt      rd      shamt   000000
31:26    25:21   20:16   15:11   10:6    5:0

R[rd] = R[rt] << shamt

• Read register operands (only one is used)
• Perform shift operation
• Write register result

Note: sllv rd, rt, rs for shift left logical variable

SLL Data Path Changes

ALU needs to do shift

JAL

jal target

000011 address
31:26 25:0
PC = JumpAddr
R[31] = PC+4

• Jump uses word address


• Update PC with JumpAddr: concatenation of the top 4 bits of the old PC,
the 26-bit jump address, and 00 (called pseudo-direct)
• Save PC+4 to $ra
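
The two effects of jal, the link and the jump, can be sketched together. The function name execute_jal and the plain list used as the register file are assumptions for the example; as in MIPS, the top 4 bits are taken from PC + 4.

```python
def execute_jal(pc, instr, regs):
    """Save the return address in register 31 and return the new PC."""
    return_address = (pc + 4) & 0xFFFFFFFF
    regs[31] = return_address                      # R[31] = PC + 4 ($ra)
    addr26 = instr & 0x03FFFFFF
    return (return_address & 0xF0000000) | (addr26 << 2)   # PC = JumpAddr


regs = [0] * 32
new_pc = execute_jal(0x00400010, (3 << 26) | 0x0100040, regs)
```

A later jr $ra can then copy regs[31] back into the PC to return to the instruction after the call.
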

JAL Datapath Changes?
