Professional Documents
Culture Documents
[Adapted from Computer Organization and Design, RISC-V Edition, Patterson & Hennessy, © 2018, MK]
[Adapted from Great ideas in Computer Architecture (CS 61C) lecture slides, Garcia and Nikolíc, © 2020, UC Berkeley]
3/22/2021 1
RISC-V Processor Design
3/22/2021 2
Machine Structures
Application (ex: browser)
CS61C
Operating
System
Compiler
(Mac OSX)
Software Assembler Instruction Set
Architecture
Hardware Processor Memory I/O system
3/22/2021 3
New-School Machine Structures
Software Harness Hardware
Parallelism &
Parallel Requests Achieve High
Assigned to computer Performance
e.g., Search “Cats” Smart
Warehouse Phone
Scale
Parallel Threads Computer
Assigned to core e.g., Lookup, Ads Computer
Core Core
Parallel Instructions Memory
>1 instruction @ one time …(Cache)
e.g., 5 pipelined instructions Input/Output
Parallel Data Exec. Unit(s) Functional
>1 data item @ one time Block(s)
e.g., Add of 4 pairs of words A0+B0 A1+B1
Main Memory
Hardware descriptions
All gates work in parallel at same time Logic Gates A
B
Out = AB+CD
C
3/22/2021 4
Great Idea #1: Abstraction
(Levels of Representation/Interpretation)
High Level Language temp = v[k];
Program (e.g., C) v[k] = v[k+1];
v[k+1] = temp;
Compiler
lw x3, 0(x10)
Assembly Language lw x4, 4(x10)
Program (e.g., RISC-V) sw x4, 0(x10)
sw x3, 4(x10)
Assembler 1000 11 01 1110 0010 0000 0000 0000 0000
Machine Language 1000 1110 0001 0000 0000 0000 0000 0100
Program (RISC-V) 1010 1110 0001 0010 0000 0000 0000 0000
1010 1101 1110 0010 0000 0000 0000 0100
pc+4
1
0
pc IMEM
wb DataD
inst[11:7] AddrD
Branch
Reg[rs1]
Reg[rs2]
0
1
0
ALU DMEM
Addr
DataR
2
1
0
wb
inst[19:15] AddrA DataA
Comp. DataW mem
Program
Address
Da tap ath
Program Counter (PC)
Bytes
Registers
Write Da ta
Data
Read Data
Arithmetic-Log ic Output
Unit (ALU)
3/22/2021 6
The CPU
• Processor (CPU): the active part of the computer that does all
the work (data manipulation and decision-making)
• Datapath: portion of the processor that contains hardware
necessary to perform operations required by the processor
(the brawn)
• Control: portion of the processor (also in hardware) that tells
the datapath what needs to be done (the brain)
3/22/2021 7
Need to Implement All RV32I Instructions
Open Reference Card
Base Integer Instructions: RV32I
Category Name Fmt RV32I Base Category Name Fmt RV32I Base
Shifts Shift Left Logical R SLL rd,rs1,rs2 Loads Load Byte I LB rd,rs1,imm
Shift Left Log. Imm. I SLLI rd,rs1,shamt Load Halfword I LH rd,rs1,imm
Shift Right Logical R SRL rd,rs1,rs2 Load Byte Unsigned I LBU rd,rs1,imm
Shift Right Log. Imm. I SRLI rd,rs1,shamt Load Half Unsigned I LHU rd,rs1,imm
Shift Right Arithmetic R SRA rd,rs1,rs2 Load Word I LW rd,rs1,imm
Shift Right Arith. Imm. I SRAI rd,rs1,shamt Stores Store Byte S SB rs1,rs2,imm
3/22/2021 10
Stages of the Datapath : Overview
• Problem: a single, “monolithic” block that “executes an
instruction” (performs all necessary operations beginning
with fetching the instruction) would be too bulky and
inefficient
• Solution: break up the process of “executing an instruction”
into stages, and then connect the stages to create the whole
datapath
– smaller stages are easier to design
– easy to optimize (change) one stage without touching the
others (modularity)
3/22/2021 11
Five Stages of the Datapath
• Stage 1: Instruction Fetch (IF)
• Stage 2: Instruction Decode (ID)
3/22/2021 12
Basic Phases of Instruction Execution
rd
Reg[ ]
PC
rs1
IMEM
DMEM
rs2 ALU
+4 imm
mux
Sum 32
MUX
ALU
Y Result
32 32
32
B CarryOut B B
32 32 32
3/22/2021 14
Datapath Elements: State and Sequencing (1/3)
Write Enable
3/22/2021 15
Datapath Elements: State and Sequencing (2/3)
RWRA RB
▪ Register file (regfile, RF) consists of 32 registers: Write Enable 5 5 5
Two 32-bit output busses: busA and busB busA
One 32-bit input bus: busW busW 32 x 32-bit 32
32 Registers busB
▪ Register is selected by: Clk 32
RA (number) selects the register to put on busA (data)
RB (number) selects the register to put on busB (data)
RW (number) selects the register to be written
via busW (data) when Write Enable is 1
▪ Clock input (Clk)
Clk input is a factor ONLY during write operation
During read operation, behaves as a combinational
logic block:
RA or RB valid busA or busB valid after “access time.”
3/22/2021 16
Datapath Elements: State and Sequencing (3/3)
▪ “Magic” Memory Write Enable Address
One input bus: Data In
Data In DataOut
One output bus: Data Out 32 32
▪ Memory word is found by: Clk
3/22/2021 17
State Required by RV32I ISA (1/2)
Each instruction during execution reads and updates the state of :
(1) Registers, (2) Program counter, (3) Memory
▪ Registers (x0..x31)
Register file (regfile) Reg holds 32 registers x 32 bits/register:
Reg[0]..Reg[31]
First register read specified by rs1 field in instruction
Second register read specified by rs2 field in instruction
Write register (destination) specified by rd field in instruction
x0 is always 0 (writes to Reg[0]are ignored)
▪ Program Counter (PC)
Holds address of current instruction
3/22/2021 18
State Required by RV32I ISA (2/2)
▪ Memory (MEM)
Holds both instructions & data, in one 32-bit byte-addressed
memory space
We’ll use separate memories for instructions (IMEM) and data
(DMEM)
These are placeholders for instruction and data caches
Instructions are read (fetched) from instruction memory
(assume IMEM read-only)
Load/store instructions access data memory
3/22/2021 19
3/22/2021 20
Review: R-Type Instructions
3
1 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 10 9 8 7 6 5 4 3 2 1 0
R-format : ALU
[31:25] [24:20] [19:15] [14:12] [11:7] [6:0]
7 5 5 3 5 7
func7 rs2 rs1 func3 rd opcode
0000000 rs2 rs1 000 : ADD rd 0110011:OP-R
0100000 rs2 rs1 000 : SUB rd 0110011:OP-R
0000000 rs2 rs1 001 : SLL rd 0110011:OP-R
0000000 rs2 rs1 010 : SLT rd 0110011:OP-R
0000000 rs2 rs1 011 : SLTU rd 0110011:OP-R
0000000 rs2 rs1 100 : XOR rd 0110011:OP-R
0000000 rs2 rs1 101 : SRL rd 0110011:OP-R
0100000 rs2 rs1 101 : SRA rd 0110011:OP-R
0000000 rs2 rs1 110 : OR rd 0110011:OP-R
0000000 rs2 rs1 111 : AND rd 0110011:OP-R
▪ E.g. Addition/subtraction add rd, rs1, rs2
R[rd] = R[rs1] + R[rs2]
sub rd, rs1, rs2
R[rd] = R[rs1] - R[rs2]
3/22/2021 21
Implementing the add instruction
31 25 24 20 19 1514 12 11 76 0
funct7 rs2 rs1 funct3 rd opcode
7 5 5 3 5 7
31 25 24 20 19 1514 12 11 76 0
0000000 rs2 rs1 000 rd 0110011
7 5 5 3 5 7
add rs2 rs1 add rd Reg-Reg OP
DataD
Inst[11:7]
PC AddrD
addr Reg[rs1]
pc+4 Inst[19:15]
inst AddrA DataA alu
+
Inst[24:20] Reg[rs2]
AddrB DataB ALU
clk
IMEM Reg [ ]
Inst[31:0] clk
RegWriteEnable (RegWEn) =1
Control logic
1514 12 11 0
31 25 24 20 19 76
funct7 rs2 rs1 funct3 rd opcode
7 5 5 3 5 7
3/22/2021 23
Timing Diagram for add +4
Add
DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
+
clk
AddrB DataB ALU
IMEM Reg [ ]
RegWEn
Inst[31:0] clk
Clock
PC 1000 1004
3/22/2021 time 24
3/22/2021 25
Implementing the sub instruction
0000000 rs2 rs1 000 rd 0110011 add
0100000 rs2 rs1 000 rd 0110011 sub
3/22/2021 26
Datapath for add/sub
PC = PC + 4
Reg[rd] = Reg[rs1] +/- Reg[rs2]
+4
Add
DataD
Inst[11:7]
AddrD
PC addr Reg[rs1]
Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2] ALU
AddrB DataB
clk
IMEM Reg [ ]
Inst[31:0] clk
3/22/2021 30
Datapath for add/sub
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
+4
Add
DataD
Inst[11:7]
AddrD
PC addr Reg[rs1]
pc+4 Inst[19:15]
inst AddrA DataA alu
Inst[24:20] ALU
AddrB Reg[rs2]
clk DataB
IMEM Reg [ ]
Inst[31:0] clk
Immediate should
3/22/2021
be here 31
Adding addi to Datapath
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
+4
Add
DataD
Inst[11:7]
AddrD
PC addr Reg[rs1]
pc+4 Inst[19:15]
inst AddrA DataA alu
Inst[24:20] Reg[rs2]
AddrB ALU
clk DataB 0
IMEM Reg [ ] 1
clk Imm[31:0]
Inst[31:0]
BSel ALUSel
Reg WriteEnable
(rs2=0/ (add=0/ sub=1)
(RegWEn)=1
Control logic Imm=1)
31 20 19 1514 12 11 76 0
imm[11:0] rs1 000 rd 0010011
12 5 3 5 7
3/22/2021 32
Adding addi to Datapath
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
+4
Add
DataD
Inst[11:7]
AddrD
PC addr Reg[rs1]
Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
AddrB ALU
clk DataB 0
IMEM Reg [ ] 1
Inst clk
[31:20] Imm.
Gen Imm[31:0]
Inst[31:0]
3/22/2021 33
Adding addi to Datapath
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
+4
Add
DataD
Inst[11:7]
AddrD
PC addr Reg[rs1]
Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
AddrB DataB
ALU
clk 0
IMEM Reg [ ] 1
clk
Inst Bsel=1
[31:20] Imm.
Gen Imm[31:0]
Inst[31:0]
3/22/2021 34
I-Format Immediates
-inst[31]-
31 30 20 19 1514 12 11 76 0
imm[11:0] rs1 funct3 rd opcode
12 inst[31:0]
--inst[31]-(sign-extension)-- inst[30:20]
imm[31:0]
• High 12 bits of instruction (inst[31:20])
inst[31:20] imm[31:0] copied to low 12 bits of immediate
Imm.
Gen (imm[11:0])
• Immediate is sign-extended by copying
ImmSel=I value of inst[31] to fill the upper 20 bits
of the immediate value (imm[31:12])
3/22/2021 35
Adding addi to Datapath
Works for all other I-format arithmetic
instructions (slti,sltiu,andi,
+4 ori,xori,slli,srli,
Add srai) just by changing ALUSel
DataD
Inst[11:7]
AddrD
PC addr Reg[rs1]
Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
AddrB DataB
ALU
clk 0
IMEM Reg [ ] 1
Inst clk
[31:20] Imm.
Gen Imm[31:0]
Inst[31:0]
3/22/2021 36
3/22/2021 37
R+IArithmetic/LogicDatapath
+4
Add
DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
AddrB ALU
clk DataB 0
IMEM
Reg [ ] 1
Inst clk
[31:20] Imm.
Gen Imm[31:0]
Inst[31:0]
Control logic
3/22/2021 38
RISC-V (37)
Add lw
▪ RISC-V Assembly Instruction (I-type): lw x14, 8(x2)
31 20 19 1514 12 11 76 0
imm[11:0] rs1 funct3 rd opcode
12 5 3 5 7
offset[11:0] base width dest LOAD
000000001000 00010 010 01110 0000011
+4
Add
DataD
Inst[11:7] alu 1
PC AddrD Reg[rs1]
addr Inst[19:15] 0
pc+4 inst AddrA DataA DataR
Inst[24:20] ALU
AddrB Reg[rs2] addr
clk DataB 0
IMEM
Reg [ ] 1
DMEM
clk
Inst clk
[31:20] Imm.
Gen Imm[31:0]
Inst[31:0]
Control logic
3/22/2021 40
RISC-V (39)
R+IArithmetic/LogicDatapath
+4
Add
DataD
Inst[11:7] alu 1
PC AddrD Reg[rs1]
addr Inst[19:15]
0
pc+4 inst AddrA DataA DataR
Inst[24:20] ALU
AddrB Reg[rs2] addr
clk DataB 0
IMEM
Reg [ ] 1
DMEM
clk
Inst clk
[31:20] Imm.
Gen Imm[31:0]
Inst[31:0]
3/22/2021 41
RISC-V (40)
AllRV32 Load Instructions
imm[11:0] rs1 000 rd 0000011 lb
imm[11:0] rs1 001 rd 0000011 lh
imm[11:0] rs1 010 rd 0000011 lw
imm[11:0] rs1 100 rd 0000011 lbu
imm[11:0] rs1 101 rd 0000011 lhu
3/22/2021 42
RISC-V (41)
3/22/2021 43
Adding sw instruction
sw: Reads two registers, rs1 for base memory address,
and rs2 for data to be stored, as well immediate offset!
sw x14, 8(x2)
31 25 24 20 19 1514 12 11 76 0
Imm[11:5] rs2 rs1 funct3 imm[4:0] opcode
7 5 5 3 5 7
offset[11:5] src base width offset[4:0] STORE
offset[11:5] offset[4:0]
=0 rs2=14 rs1=2 SW =8 STORE
DataD
Inst[11:7] alu 1
PC AddrD Reg[rs1]
addr Inst[19:15] 0
pc+4 AddrA DataA DataR
mem
inst
Inst[24:20] ALU
AddrB Reg[rs2] addr
clk DataB 0
IMEM DMEM
Reg [ ] 1
clk
Inst clk
[31:20] Imm.
Gen Imm[31:0]
Control logic
3/22/2021 45
RISC-V (44)
Adding sw to Datapath
+4
Add
DataD
Inst[11:7] alu 1
PC AddrD Reg[rs1]
addr Inst[19:15] 0
pc+4 AddrA DataA DataR
mem
inst
Inst[24:20] ALU
AddrB Reg[rs2] addr
clk DataB 0
IMEM DMEM
Reg [ ] 1
DataW
clk
Inst clk
[31:20] Imm.
Gen Imm[31:0]
RISC-V (46)
3/22/2021
Adding sw to Datapath
+4
Add
DataD
Inst[11:7] alu 1
PC AddrD Reg[rs1]
addr Inst[19:15]
0
pc+4 AddrA DataA DataB
mem
inst
Inst[24:20] ALU
AddrB Reg[rs2] addr
clk DataB 0
IMEM DMEM
Reg [ ] 1
DataW
clk
Inst
[31:20] Imm.
Gen Imm[31:0]
Inst[31:0]
3/22/2021 47
RISC-V (46)
I+S ImmediateGeneration
31 30 25 24 20 19 15 14 12 11 7 6 0
imm[11:0] rs1 funct3 rd I-opcode I
S
imm[11:5] rs2 rs1 funct3 imm[4:0] S-opcode
5 5 inst[31:0]
1 6
I/S I S
3/22/2021 49
RISC-V (48)
3/22/2021 50
RISC-V B-Formatfor Branches
31 30 25 24 2019 15 14 12 11 8 7 6 0
imm[12] imm[10:5] rs2 rs1 funct3 imm[4:1] imm[11] opcode
1 6 5 5 3 4 1 7
offset[12|11:5] rs1 funct3 BRANCH
rs2 offset[4:1|11]
▪ B-format is mostly same as S-Format, with two register
sources (rs1/rs2) and a 12-bit immediate imm[12:1]
▪ But now immediate represents values
-4096 to +4094 in 2-byte increments
▪ The 12 immediate bits encode even 13-bit signed byte offsets
(lowest bit of offset is always zero, so no need to store it)
3/22/2021
Datapath So Far
+4
Add
DataD
Inst[11:7] alu 1
PC AddrD Reg[rs1]
addr Inst[19:15] 0
pc+4 AddrA DataA DataR
mem
inst
Inst[24:20] ALU
AddrB Reg[rs2] addr
clk DataB 0
IMEM DMEM
Reg [ ] 1
DataW
clk
Inst clk
[31:20] Imm.
Gen Imm[31:0]
RISC-V (52)
Control logic
3/22/2021
ToAddBranches
▪ Different change to the state:
PC = PC + 4, branch not taken
PC + immediate, branch taken
▪ Six branch instructions: beq, bne, blt, bge,
bltu, bgeu
▪ Need to compute PC + immediate and to
compare values of rs1 and rs2
But have only one ALU – need more hardware
3/22/2021
Adding Branches
+4
Add
pc+4 R[rs1]
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA 0
DataB
Inst[24:20] Branch ALU mem
AddrB Comp addr
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
RISC-V (54)
PCSel= Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
taken/not taken =B =0 BrEq =1 =1 =add =read =*
Control logic
3/22/2021
Branch Comparator
•BrEq = 1, if A=B
A Branch
B Comp •BrLT = 1, if A<B
A<B = !(A<B)
3/22/2021
RISC-V (55)
Branch Immediates (In OtherISAs)
• 12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in
multiples of 2 bytes
• Standard approach: Treat immediate as in range -2048..+2047, then shift
left by 1 bit to multiply by 2 for branches
3/22/2021
Branch Immediates (In OtherISAs)
• 12-bit immediate encodes PC-relative offset of -4096 to +4094
bytes in multiples of 2 bytes
3/22/2021
RISC-V ImmediateEncoding
Instruction encodings, inst[31:0]
31 30 25 24 20 19 15 14 12 11 8 76 0
imm[11:0] rs1 funct3 rd opcode I-type
imm[11:5] rs2 rs1 funct3 imm[4:0] opcode S-type
imm[12|10:5] rs2 rs1 funct3 imm[4:1|11] opcode B-type
pc+4 R[rs1]
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA 0
DataB
Inst[24:20] Branch ALU mem
AddrB Comp addr
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel= Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
taken/not taken =B =0 BrEq =1 =1 =add =read =*
Control logic
3/22/2021 59
RISC-V (58)
3/22/2021 60
Let’s Add JALR (I-Format)
31 20 19 15 14 12 11 76 0
imm[11:0] rs1 func3 rd opcode
12 5 3 5 7
offset[11:0] base 0 dest JALR
+4
Add
pc+4 R[rs1]
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA 0
DataB
Inst[24:20] Branch ALU mem
AddrB Comp addr
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
BrEq
Control logic
3/22/2021 62
RISC-V (61)
Datapath WithJALR
+4
Add
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=taken =I =1 =* BrEq=* =1 =0 =Add =Read =2
Control logic =*
3/22/2021 63
RISC-V (62)
Datapath WithJALR
+4
Add
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=taken =I =1 =* BrEq=* =1 =0 =Add =Read =2
Control logic =*
3/22/2021 64
RISC-V (63)
3/22/2021 65
J-FormatforJumpInstructions
31 30 21 20 19 12 11 76 0
imm[20] imm[10:1] imm[11] imm[19:12] rd opcode
1 10 1 8 5 7
offset[20:1] dest JAL
▪ Two changes to the state
jal saves PC+4 in register rd (the return address)
Set PC = PC + offset (PC-relative jump)
▪ Target somewhere within ±219 locations, 2 bytes apart
±218 32-bit instructions
▪ Immediate encoding optimized similarly to branch instruction
to reduce hardware cost
3/22/2021 66
RISC-V (65)
DatapathwithJAL
+4
Add
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=taken =J =1 =* BrEq=* =1 =0 =Add =Read =2
Control logic =*
3/22/2021 67
RISC-V (66)
LightUpJAL Path
+4
Add
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=taken =J =1 =* BrEq=* =1 =0 =Add =Read =2
Control logic =*
3/22/2021 68
RISC-V (67)
3/22/2021 69
U-Formatfor“UpperImmediate”Instructions
31 12 11 76 0
imm[31:12] rd opcode
20 5 7
U-immediate[31:12] dest LUI
U-immediate[31:12] dest AUIPC
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=pc+4 =U =1 =* BrEq=* =1 =* =Read =1
Control logic =*
3/22/2021 71
RISC-V (70)
Lighting Up LUI
+4
Add
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=pc+4 =U =1 =* BrEq=* =1 =* =B =Read =1
Control logic =*
3/22/2021 72
RISC-V (71)
Lighting Up AUIPC
+4
Add
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=pc+4 =U =1 =* BrEq=* =1 =* =add =Read =1
Control logic =*
3/22/2021 73
RISC-V (72)
3/22/2021 74
CompleteRV32IDatapath!
+4
Add
pc+4 R[rs1] 2
DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0 ALU
Inst[24:20] addr mem
AddrB Comp
clk DataB 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
BrEq
Control logic
3/22/2021 75
RISC-V (74)
CompleteRV32I ISA!
Open Reference Card
Base Integer Instructions: RV32I
CategoryName Fmt RV32I Base Category Name Fmt RV32I Base
Shifts Shift Left Logical R SLL rd,rs1,rs2 Loads Load Byte I LB rd,rs1,imm
Shift Left Log. Imm. I SLLI rd,rs1,shamt Load Halfword I LH rd,rs1,imm
Shift Right Logical R SRL rd,rs1,rs2 Load Byte Unsigned I LBU rd,rs1,imm
Shift Right Log. Imm. I SRLI rd,rs1,shamt Load Half Unsigned I LHU rd,rs1,imm
Shift Right Arithmetic R SRA rd,rs1,rs2 Load Word I LW rd,rs1,imm
Shift Right Arith. Imm. I SRAI rd,rs1,shamt Stores Store Byte S SB rs1,rs2,imm
3/22/2021 77
RISC-V (76)
3/22/2021 78
Complete Single-Cycle RV32I Datapath!
+4
Add
pc+ 4
R[rs1] 2
DataD
Inst[11:7] alu 1
0 AddrD
PC addr 1
1 Inst[19:15] DataB 0
pc inst AddrA DataA
Inst[24:20] Branch 0 ALU mem
Comp addr
AddrB DataB
clk 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
BrEq
Control logic
3/22/2021 79
Control and Status Registers
▪ Control and status registers (CSRs) are separate from the
register file (x0-x31)
Used for monitoring the status and performance
There can be up to 4096 CSRs
▪ Not in the base ISA, but almost mandatory in every
implementation
ISA is modular
Necessary for counters and timers, and communication
with peripherals
3/22/2021 80
CSRInstructions
31 20 19 1514 12 11 76 0
csr rs1 funct3 rd opcode
12 5 3 5 7
3/22/2021 83
System Instructions
▪ ecall – (I-format) makes requests to supporting
execution environment (OS), such as system calls
(syscalls)
▪ ebreak – (I-format) used e.g. by debuggers to transfer
control to a debugging environment
31 20 19 1514 12 11 76 0
funct12 rs1 funct3 rd opcode
12 5 3 5 7
Program
Address
Datapath
Program Counter (PC)
Bytes
Registers
Write Data
Data
Read Data
Arithmetic-Logic Output
Unit (ALU)
3/22/2021 86
Single-Cycle RV32I Datapath and Control
+4
Add
pc+ 4
R[rs1] 2
DataD
Inst[11:7] alu 1
0 AddrD
PC addr 1
1 Inst[19:15] DataB 0
pc inst AddrA DataA
Inst[24:20] Branch 0 ALU mem
Comp addr
AddrB DataB
clk 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
BrEq
Control logic
3/22/2021 87
Exa mple: sw
+4
Add
pc+ 4
R[rs1] 2
DataD
Inst[11:7] alu 1
0 AddrD
PC addr 1
1 Inst[19:15] DataB 0
pc inst AddrA DataA
Inst[24:20] Branch 0 ALU mem
Comp addr
AddrB DataB
clk 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=S = =* =* =1 = =add =Write =*
=pc+4 BrEq =* 0
Control logic 0
3/22/2021 88
Exa mple: beq
+4
Add
pc+ 4
R[rs1] 2
DataD
Inst[11:7] alu 1
0 AddrD
PC addr 1
1 Inst[19:15] DataB 0
pc inst AddrA DataA
Inst[24:20] Branch 0 ALU mem
Comp addr
AddrB DataB
clk 0
IMEM
Reg [ ] 1 DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=B = =* BrEq =* =1 =1 =add =Read =*
Control logic 0
3/22/2021 89
3/22/2021 90
Exa mple: add
+4
Add
R[rs1] 2
pc+ 4 DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
1 Inst[19:15] 0
pc inst AddrA DataA DataB
Branch 0
Inst[24:20] ALU mem
AddrB Comp addr
clk DataB
0
IMEM 1
Reg [ ] DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel Reg WEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=* BrEq =* =0 =0
=pc+4 =* =1 =add =Read =1
=*
Control logic
3/22/2021 91
Add Execution +4
Add
R[rs1] 2
pc+ 4 DataD
Inst[11:7] alu 1
0 AddrD
PC addr 1
Inst[19:15] 0
1 DataB
pc inst AddrA DataA 0
Inst[24:20] Branch ALU mem
AddrB addr
DataB Comp
clk 0
IMEM Reg [ ] 1
DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel Reg WEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
BrEq
Control logic
Clock
PC 1000 1004
Add
R[rs1] 2
pc+ 4 DataD
Inst[11:7] alu 1
0
PC addr AddrD 1
Inst[19:15] 0
1 pc DataB
inst AddrA DataA 0
Inst[24:20] Branch ALU mem
AddrB Comp addr
clk DataB
0
IMEM 1
Reg [ ] DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel RegWEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=* BrEq =* =0 =0
=pc+4 =* =1 =add =Read =1
=*
Control logic
Add
R[rs1] 2
pc+ 4 DataD
Inst[11:7] alu 1
0 AddrD
PC addr 1
Inst[19:15] 0
1 pc AddrA DataB
inst DataA 0
Inst[24:20] Branch ALU mem
AddrB Comp addr
DataB
clk 0
IMEM 1
Reg [ ] DMEM
clk
Inst clk
[31:20] Imm. R[rs2]
Gen Imm[31:0]
PCSel Inst[31:0] ImmSel Reg WEn BrUn BrLT Bsel Asel ALUSel MemRW WBSel
=* BrEq =* =1 =0
=pc+4 =I =1 =add =Read =0
Control logic =*
Critical path = tclk-q +max {tAdd+tmux, tIMEM+tImm+tmux+tALU+tDMEM+tmux,
tIMEM+tReg+tmux+tALU+tDMEM+tmux}+tsetup
3/22/2021 94
Instruction Timing
I-MEM Reg Read ALU D-MEM Reg W Total
200 ps 100 ps 200 ps 200 ps 100 ps 800 ps
IF ID EX MEM WB
clock
PC old pc pc+4
Instr. fetch old instruction
Instr. decode old registerout
Execute old ALUresult
Memory Access old memory data
tIF tID tEX tMEM tWB
3/22/2021 95
Instruction Timing
Instr IF = 200ps ID = 100ps ALU = 200ps MEM=200ps WB = 100ps Total
add X X X X 600ps
beq X X X 500ps
jal X X X X 600ps
lw X X X X X 800ps
sw X X X X 700ps
3/22/2021 96
3/22/2021 97
Control Logic Truth Table
Inst[31:0] BrEq BrLT PCSel ImmSel BrUn ASel BSel ALUSel MemRW RegWEn WBSel
add * * +4 * * Reg Reg Add Read 1 ALU
sub * * +4 * * Reg Reg Sub Read 1 ALU
3/22/2021 99
RV32I, A Nine-Bit ISA!
▪ Instruction type encoded
using only 9 bits:
▪ inst[30],
inst[14:12],
inst[6:2]
inst[6:2]
inst[14:12]
inst[30]
3/22/2021 100
ROM-based Control
11-bit address (inputs)
Inst[30,14:12,6:2] BrEq BrLT
9
PCSel
3 ImmSel[2:0]
BrUn
ASel
ROM 4
BSel
ALUSel[3:0]
MemRW
RegWEn
2 WBSel[1:0]
3/22/2021 101
ROM Controller Implementation
AND OR
add
Control Word for add
sub
Inst[] or
Control Word for sub
BrEQ Control Word for or
BrLT .
. .
. .
Decoder
11
Address
.
jal
3/22/2021 103
Control Logic to Decode add
add = i[30]•i[14]•i[13]•i[12]•R-type
inst[30] inst[14:12] inst[6:2]
R-type = i[6]•i[5]•i[4]•i[3]•i[2]•RV32I
RV32I = i[1]•i[0]
3/22/2021 104
3/22/2021 105
Call home, we’ve made HW/SW contact!
High Level Language temp = v[k];
v[k] = v[k+1];
Program (e.g., C) v[k+1] = temp;
Compiler
lw x3, 0(x10)
Assembly Language lw x4, 4(x10)
sw x4, 0(x10)
Program (e.g., RISC-V) sw x3, 4(x10)
Assembler 1000 1101 1110 0010 0000 0000 0000 0000
Machine Language 1000 1110 0001 0000 0000 0000 0000 0100
1010 1110 0001 0010 0000 0000 0000 0000
Program (RISC-V) 1010 1101 1110 0010 0000 0000 0000 0100
Reg[rs1]
1
alu
pc+4
2
alu 1 ALU DMEM
(e.g., block diagrams) pc 0
inst[11:7] 1 wb
pc+4 0 IMEM AddrD Reg[rs2] Addr
DataR
Branch 0 0
inst[19:15] AddrA DataA
Comp. DataW mem
inst[24:20] AddrB DataB 1
Architecture Implementation
Logic Circuit Description
inst[31:7] Imm. imm[31:0]
Gen
C
D
3/22/2021 106
“And In conclusion…”
▪ We have built a processor!
Capable of executing all RISC-V instructions in one
cycle each
Not all units (hardware) used by all instructions
Critical path changes
▪ 5 Phases of execution
IF, ID, EX, MEM,WB
Not all instructions are active in all phases
▪ Controller specifies how to execute instructions
Implemented as ROM or logic
3/22/2021 107