Professional Documents
Culture Documents
Address
Data
Control
CPU Memory
• CPU issues address (and data for write)
• Memory returns data for read
• Memory returns acknowledgment for write
These slides are based on the course taught by Dr. Somani at Iowa State University using
the textbook by Patterson and Hennessey titled “Computer Organization and Design.” It uses
a combination of slides provide with the teaching aids with the book and their variation
created by Dr. Somani over multiple years, before and after the adoption of the book.
2
MIPS Instruction Format
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
Instruction Execution
4
What Blocks We Need
• We need an ALU
– We have already designed that
• We need memory to store instruction and data
– Instruction memory takes address and supplies instruction
– Data memory takes address and supply data for lw
– Data memory takes address and data to write into memory
• We need to manage a PC and its update mechanism
• We need a register file to include 32 registers
– We read two operands, write a result back in register file
• Sometimes part of the operand comes from instruction
• We may add support of immediate class of instructions
• We may add support for J, JR, JAL
Simple Implementation
Instruction
address
MemWrite
PC
Instruction Add Sum
Instruction Address Read
memory data 16 32
Sign
extend
Write Data
data memory
a. Instruction memory b. Program counter c. Adder
ALU control
5 Read 3 MemRead
register 1
Read
Register 5 data 1 a. Data memory unit b. Sign-extension unit
Read
numbers register 2 Zero
Registers Data ALU ALU
5 Write result
register
Read
data 2
Why do we need this stuff?
Write
Data data
RegWrite
a. Registers b. ALU
6
A Review of Combinational & Sequential Circuits
Two main classes of circuits:
1. Combinational circuits
• Circuits without memory
• Outputs depend only on current input values
2. Sequential Circuits (also called Finite State Machine)
• Circuits with memory
• Memory elements to store the state of the circuit
• The state represents the input sequence in the past
• Outputs depend on both circuit state and current inputs
B1
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
B2
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
B3
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4
24 23 22 - 13 12 - 3 2 1 0 Data out
20 Addr Bank Byte
Row Addresses Column Addresses
lines Addr Addr
8
A Register
• A memory element (flip-flop) is a one-bit storage
• A flip flop store a new bit data when a clock signal arrives.
• Clock may be a rising edge or a falling edge
• A register contains multiple flip flops, a 4-bit register is shown
• All clock signals are connected together to one clock
• Each flip-flops gets a different input
D0 D1 D2 D3
D Q D Q D Q D Q
Q0 Q1 Q2 Q3
C Q C Q C Q C Q
Clock
March 2021 Seminar series by Dr. Arun K. 9
Somani
A Shift Register
D0
D Q D Q D Q D Q
Q0 Q1 Q2 Q3
C Q C Q C Q C Q
Clock
March 2021 Seminar series by Dr. Arun K. 10
Somani
10
A Parallel-Access Register
• We add logic to a register to create different device behaviors
• A register or shift register holds data value for one clock period
• A parallel-access register can hold data values for longer
• Alternately, it can store new data if so needed
• A LD signal enables new data or hold old data
Clock
Q3 Q2 Q1 Q0
March 2021 Seminar series by Dr. Arun K. 11
Somani
11
Clock
March 2021 Seminar series by Dr. Arun K. 12
Somani
12
A Register File
• Register file is a unit containing r (4 to 32) registers
• Output ports are used for reading the register file RA1 RA2
– DATA1 and DATA2
– Any register can be read to any of the ports
– Each port use log2r bits to specify address DATA1
– RA1 and RA2 are read addresses Reg
LD_DATA File
13
LD LD LD
D3 D2 D1 D0 D3 D2 D1 D0 D3 D2 D1 D0
Q3 Q2 Q1 Q0 Q3 Q2 Q1 Q0 Q3 Q2 Q1 Q0
Clk Clk Clk
14
Reading Circuit
• A 3-bit register address, RA, specifies which register is to be read
• For each output port, we need one 8-to-1 4-bit multiplier
Register 111 001 000
Address
7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
RA1 8-to-1 4-bit multiplex 8-to-1 4-bit multiplex RA2
DATA1 DATA2
15
LD_DATA
16
Arithmetic/Logic Unit
17
4-Bit Adder
18
4-Bit Carry-Look Ahead Adder
19
Operation
CarryIn a0 C a rry In
R e su lt 0
ALU 0
b0
a C a rry O u t
0
a1 C a rry In
1 R e su lt 1
ALU 1
Result b1
C a rry O u t
2 a2 C a rry In
b R e su lt 2
ALU 2
b2
C a rry O u t
CarryOut
a3 1 C a rry In
R e su lt 3 1
A LU 3 1
b3 1
March 2021 Seminar series by Dr. Arun K. 20
Somani
20
A 32-bit ALU
21
a0 CarryIn Result0
000 = and b0 ALU0
Less
001 = or CarryOut
010 = add
110 = subtract a1 CarryIn Result1
b1 ALU1
111 = slt 0 Less
Zero
CarryOut
a2 CarryIn Result2
b2 ALU2
0 Less
•Note: zero is a 1 CarryOut
Result31
a31 CarryIn
b31 ALU31 Set
0 Less Overflow
22
Ripple Carry Adder is Slow
23
c1 = g0 + p0c0
c2 = g1 + p1c1 c2 = g1+p1g0+p1p0c0
c3 = g2 + p2c2 c3 = g2+p2g1+p2p1g0+p2p1p0c0
c4 = g3 + p3c3 c4 = g3+p3g2+p3p2g1+p3p2p1g0+p3p2p1p0c0
24
A 4-bit CLA Adder
25
a0 CarryIn
• A 16-bit adder uses
b0 R esult0--3
a1
b1 ALU0
– Four 4-bit adders
a2 pi
b2 P0
G0 gi
a3
b3 C arry-loo kahead u nit
C1
ci + 1 • It takes block g and p terms
a4 CarryIn and cin to generate block
b4 R esult4--7
a5
b5 ALU1
carry bits out
a6 P1 pi + 1
b6 G1 gi + 1
a7
b7
C2
ci + 2 • Block carries are used to
a8
b8
CarryIn
R esult8--11
generate bit carries
a9
b9
a10
ALU2
pi + 2
– could use ripple carry of
P2
b10
a11
G2 gi + 2
4-bit CLA adders
b11
C3
ci + 3 – Better: use the CLA
a12
b12
CarryIn
R esult12--1 5
principle again!
a13
b13 ALU3
a14 P3 pi + 3
b14 G3 gi + 3
a15 C4
b15 ci + 4
CarryO ut
26
CLA Adders Delay
• 4-Bit case
– Generation of g and p: 1 gate delay
– Generation of carries (and G and P): 2 gate delay
– Generation of sum: 1 more gate delay
• 16-Bit case
– Generation of g and p: 1 gate delay
– Generation of block G and P: 2 more gate delay
– Generation of block carries: 2 more gate delay
– Generation of bit carries: 2 more gate delay
– Generation of sum: 1 more gate delay
• 64-Bit case
– 12 gate delays
27
28
MIPS Instruction Format
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
31 26 25 21 20 16 15 11 10 6 5 0
29
Simple Implementation
• Include the functional units we need for each instruction
Instruction
address
MemWrite
PC
Instruction Add Sum
Instruction Address Read
memory data 16 32
Sign
extend
Write Data
data memory
a. Instruction memory b. Program counter c. Adder
ALU control
5 Read 3 MemRead
register 1
Read
Register 5 data 1 a. Data memory unit b. Sign-extension unit
Read
numbers register 2 Zero
Registers Data ALU ALU
5 Write result
register
Read
data 2
Why do we need this stuff?
Write
Data data
RegWrite
a. Registers b. ALU
30
Instruction Fetch
31
Load/Store Instructions
ALU control
5 Read 3
register 1
Read
Register 5 data 1
Read
numbers register 2 Zero
Registers Data ALU ALU
5 Write result
register
Read
Write data 2
Data data
RegWrite
a. Registers b. ALU
32
R-Format Instructions
33
Branch Instructions
34
Simple Implementation: Connecting Elements
Data
Register #
PC Address Instruction Registers ALU Address
Instruction Register #
memory Data
Register # memory
Data
35
A Sketchy view
Next Sequential PC = PC + 4
Branch Target
= (PC+4)+offset
An instruction changes
1. PC (all instructions, branch and jump more complex)
2. Register (arithmetic/logic, load)
3. Memory (store)
36
Branch Instructions
37
38
Building the Datapath
PCSrc
M
Add u
x
4 Add ALU
result
Shift
left 2
Registers
Read 3 ALU operation
MemWrite
Read register 1 ALUSrc
PC Read
address Read data 1 MemtoReg
register 2 Zero
Instruction ALU ALU
Write Read Address Read
register M result data
data 2 u M
Instruction u
memory Write x Data x
data
Write memory
RegWrite data
16 32
Sign
extend MemRead
39
1
Add M
u
x
4 ALU 0
Add result
RegWrite Shift
left 2
ALUOp
40
Control
41
42
Our Simple Control Structure
State State
element Combinational logic element
1 2
Clock cycle
43
44
Data Path Operation
4 A M
D U M
D ADD X U
Shift X
Left 2
AND
jmp
25-00
zero br
25-21 RA1
PC INST RD1
IA
MEMORY 20-16 RA2 REG
INST 31-00 FILE ALU MA
M DATA
WA WD RD2 M MEMORY
15-11 U U
X WE WD MD
X
MR MW M
RDES
ALU U
15-00 Sign ALU
SRC X
Ext CON
05-00 ALUOP Memreg
31-26
CONTROL
45
Control Points
4 A M
D U M
D ADD X U
Shift X
Left 2
AND
jmp
25-00
zero br
25-21 RA1
PC INST RD1
IA
MEMORY 20-16 RA2 REG
INST 31-00 FILE ALU MA
M DATA
WA WD RD2 M MEMORY
15-11 U U
X WE WD MD
X
MR MW M
RDES
ALU U
15-00 Sign ALU
SRC X
Ext CON
05-00 ALUOP Memreg
31-26
CONTROL
46
LW Instruction Operation
4 A M
D U M
D ADD X U
Shift X
Left 2
AND
jmp
25-00
zero br
25-21 RA1
PC INST RD1
IA
MEMORY 20-16 RA2 REG
INST 31-00 FILE ALU MA
M DATA
WA WD RD2 M MEMORY
15-11 U U
X WE WD MD
X
MR MW M
RDES
ALU U
15-00 Sign ALU
SRC X
Ext CON
05-00 ALUOP Memreg
31-26
CONTROL
47
SW Instruction Operation
4 A M
D U M
D ADD X U
Shift X
Left 2
AND
jmp
25-00
zero br
25-21 RA1
PC INST RD1
IA
MEMORY 20-16 RA2 REG
INST 31-00 FILE ALU MA
M DATA
WA WD RD2 M MEMORY
15-11 U U
X WE WD MD
X
MR MW M
RDES
ALU U
15-00 Sign ALU
SRC X
Ext CON
05-00 ALUOP Memreg
31-26
CONTROL
48
R-Type Instruction Operation
4 A M
D U M
D ADD X U
Shift X
Left 2
AND
jmp
25-00
zero br
25-21 RA1
PC INST RD1
IA
MEMORY 20-16 RA2 REG
INST 31-00 FILE ALU MA
M DATA
WA WD RD2 M MEMORY
15-11 U U
X WE WD MD
X
MR MW M
RDES
ALU U
15-00 Sign ALU
SRC X
Ext CON
05-00 ALUOP Memreg
31-26
CONTROL
49
BR-Instruction Operation
4 A M
D U M
D ADD X U
Shift X
Left 2
AND
jmp
25-00
zero br
25-21 RA1
PC INST RD1
IA
MEMORY 20-16 RA2 REG
INST 31-00 FILE ALU MA
M DATA
WA WD RD2 M MEMORY
15-11 U U
X WE WD MD
X
MR MW M
RDES
ALU U
15-00 Sign ALU
SRC X
Ext CON
05-00 ALUOP Memreg
31-26
CONTROL
50
Jump Instruction Operation
4 A M
D U M
D ADD X U
Shift X
Left 2
AND
jmp
25-00
zero br
25-21 RA1
PC INST RD1
IA
MEMORY 20-16 RA2 REG
INST 31-00 FILE ALU MA
M DATA
WA WD RD2 M MEMORY
15-11 U U
X WE WD MD
X
MR MW M
RDES
ALU U
15-00 Sign ALU
SRC X
Ext CON
05-00 ALUOP Memreg
31-26
CONTROL
51
Control
op rs rt rd shamt funct
52
Adding Control to DataPath
0
M
u
x
Add ALU 1
result
Add Shift
RegDst left 2
4 Branch
MemRead
Instruction [31– 26] MemtoReg
Control ALUOp
MemWrite
ALUSrc
RegWrite
Instruction [5– 0]
53
54
ALU Control
•
35 2 1 100
op rs rt 16 bit offset
000 AND
001 OR
010 add
110 subtract
111 set-on-less-than
55
R-Type Instruction
56
R-Type: Control Signals
57
Load Instruction
58
Load: Control Signals
RegDst 0
ALUSrc 1
MemtoReg 1
RegWrite 1
MemRead 1
MemWrite 0
Branch 0
ALUOp[1:0] 00
59
RegDst X
ALUSrc 1
MemtoReg X
RegWrite 0
MemRead 0
MemWrite 1
Branch 0
ALUOp[1:0] 00
60
Branch-on-Equal Instruction
61
RegDst X
ALUSrc 0
MemtoReg X
RegWrite 0
MemRead 0
MemWrite 0
Branch 1
ALUOp[1:0] 01
62
Other Control Information
63
Implementation of Control
Inputs
Op5
Op4
ALUOp Op3
Op2
ALUcontrol block
Op1
ALUOp0 Op0
ALUOp1
Outputs
ALUOpO
64
Timing: Single Cycle Implementation
1
Add M
u
x
4 ALU 0
Add result
RegWrite Shift
left 2
ALUOp
65
Implementing Jumps
Jump 2 address
31:26 25:
0
• Jump uses word address
• Update PC with concatenation of
– Top 4 bits of old PC
– 26-bit jump address
– 00
• Need an extra control signal decoded from opcode
66
Datapath With Jumps Added
67
68
Control Signals
Inst Reg- ALU- Mem- Reg- Mem Mem Bran ALU ALU Jum
Dst Src toReg Write Read Write ch Op1 Op0 p
R- 1 0 0 1 0 0 0 1 0 0
lw 0 1 1 1 1 0 0 0 0 0
sw X 1 X 0 0 1 0 0 0 0
beq X 0 X 0 0 0 1 0 1 0
j X X X 0 0 0 0 X X 1
69
ADDI
001000 rs rt immediate
31:26 25:21 20:16 15:0
R[rt] = R[rs]+SignExtImm
70
ADDI Changes
71
ADDI Datapath
72
ADDI Control Signals
Inst Reg- ALU- Mem- Reg- Mem- Mem- Branc ALUO ALUO
Dst Src toReg Write Read Write h p1 p0
R- 1 0 0 1 0 0 0 1 0
lw 0 1 1 1 1 0 0 0 0
sw X 1 X 0 0 1 0 0 0
beq X 0 X 0 0 0 1 0 1
addi 0 1 0 1 0 0 0 0 0
73
SLL
74
SLL Data Path Changes
75
76
JAL
jal target
000011 address
31:26 25:0
PC = JumpAddr
R[31] = PC+4
77
78