You are on page 1of 30

inst.eecs.berkeley.

edu/~cs61c
CS61C : Machine Structures
Lecture 29 – RISC-V Datapath I

Instructors
Dan Garcia and Bora Nikolic
2018-10-29

Is artificial intelligence set to become art’s next medium?


AI artwork sells for $432,500 — nearly 45 times its high estimate — as
Christie’s becomes the first auction house to offer a work of art created
by an algorithm.
Signature:

CS61C L29 Datapath I Garcia, Nikolić, Fall 2018 © UCB


Review

•Use muxes to select among input


•S input bits selects 2S inputs
•Each input can be n-bits wide, indep of S
•Can implement muxes hierarchically
•ALU can be implemented using a mux
•Coupled with basic block elements
•N-bit adder-subtractor done using N 1-bit adders
with XOR gates on input
•XOR serves as conditional inverter
CS61C L29 Datapath I 2 Garcia, Nikolić, Fall 2018 © UCB
New-School Machine Structures
Software Hardware
• Parallel Requests
Warehouse Smart
Assigned to computer
Harness Scale Phone
e.g., Search “Cats”
Parallelism & Computer
• Parallel Threads Achieve High
Assigned to core Performance Computer
e.g., Lookup, Ads Core Core

• Parallel Instructions Memory (Cache)
>1 instruction @ one time Input/Output
e.g., 5 pipelined instructions Core
Today’s Lecture Instruction Unit(s) Functional
• Parallel Data Unit(s)
>1 data item @ one time A0+B0 A1+B1 A2+B2 A3+B3
e.g., Add of 4 pairs of words Main Memory
• Hardware descriptions Logic Gates
All gates working in parallel at same time
CS61C L29 Datapath I 3 Garcia, Nikolić, Fall 2018 © UCB
Our Single-Core Computer (w/o FPU, SIMD)

Processor Memory
Enable? Input
Read/Write
Control

Cache Program
Datapath Memory (including
Address cache) organized
PC Bytes around blocks,
which are typically
Write multiple words
Registers Data

Arithmetic & Logic Unit Data


ReadD Output
(ALU)
ata

Processor organized
around words and bytes
Processor‐Memory Interface I/O‐Memory Interfaces

CS61C L29 Datapath I 4 Garcia, Nikolić, Fall 2018 © UCB


The CPU

• Processor (CPU): the active part of the computer


that does all the work (data manipulation and
decision-making)

• Datapath: portion of the processor that contains


hardware necessary to perform operations
required by the processor (the brawn)
• Control: portion of the processor (also in
hardware) that tells the datapath what needs to be
done (the brain)

CS61C L29 Datapath I 5 Garcia, Nikolić, Fall 2018 © UCB


One-Instruction-Per-Cycle RISC-V Machine
• On every tick of the clock,
the computer executes
one instruction

pc • Current state outputs


drive the inputs to the
combinational logic,
whose outputs settles at
clock the values of the state
IMEM before the next clock edge
Combinational
Logic • At the rising clock edge,
all the state elements are
Reg[] updated with the
combinational logic
outputs, and execution
moves to the next clock
cycle
DMEM

CS61C L29 Datapath I 6 Garcia, Nikolić, Fall 2018 © UCB


Stages of the Datapath : Overview

• Problem: a single, “monolithic” block that “executes


an instruction” (performs all necessary operations
beginning with fetching the instruction) would be too
bulky and inefficient
• Solution: break up the process of “executing an
instruction” into stages, and then connect the
stages to create the whole datapath
– smaller stages are easier to design
– easy to optimize (change) one stage without touching the
others (modularity)

CS61C L29 Datapath I 7 Garcia, Nikolić, Fall 2018 © UCB


Five Stages of the Datapath

•Stage 1: Instruction Fetch (IF)


•Stage 2: Instruction Decode (ID)
•Stage 3: Execute (EX) - ALU (Arithmetic-Logic Unit)
•Stage 4: Memory Access (MEM)
•Stage 5: Write Back to Register (WB)

CS61C L29 Datapath I 8 Garcia, Nikolić, Fall 2018 © UCB


Basic Phases of Instruction Execution

rd

Reg[ ]
PC

rs1
IMEM

DMEM
ALU
rs2

+4 imm
mux

1. Instruction 2. Decode/ 5. Register


3. Execute 4. Memory
Fetch Register Write
Read
Clock

CS61C L29 Datapath I 9


time Garcia, Nikolić, Fall 2018 © UCB
Datapath Components: Combinational

•Combinational Elements
OP
CarryIn Select
A A
32 A 32
Adder

32

MUX

ALU
Sum Result
32 Y 32
32
B CarryOut B B
32 32 32

Adder Multiplexer ALU

•Storage Elements + Clocking Methodology


•Building Blocks

CS61C L29 Datapath I 10 Garcia, Nikolić, Fall 2018 © UCB


Datapath Elements: State and Sequencing (1/3)
Write Enable
• Register
Data In Data Out
• Write Enable: N N

•Negated (or deasserted) (0):


Data Out will not change clk

•Asserted (1): Data Out will become Data In on positive


edge of clock

CS61C L29 Datapath I 11 Garcia, Nikolić, Fall 2018 © UCB


Datapath Elements: State and Sequencing (2/3)

• Register file (regfile, RF) consists of 32 RW RA RB


Write Enable 5 5 5
registers: busA
•Two 32‐bit output busses: busA and busB busW 32 x 32‐bit 32
•One 32‐bit input bus: busW 32 Registers busB
Clk
•Register is selected by: 32
•RA (number) selects the register to put on busA (data)
•RB (number) selects the register to put on busB (data)
•RW (number) selects the register to be written
via busW (data) when Write Enable is 1
•Clock input (clk)
•Clk input is a factor ONLY during write operation
•During read operation, behaves as a combinational logic
block:
 RA or RB valid  busA or busB valid after “access time.”

CS61C L29 Datapath I 12 Garcia, Nikolić, Fall 2018 © UCB


Datapath Elements: State and Sequencing (3/3)
Write Enable Address
•“Magic” Memory
•One input bus: Data In Data In DataOut
32 32
•One output bus: Data Out Clk
•Memory word is found by:
•For Read: Address selects the word to put on Data Out
•For Write: Set Write Enable = 1: address selects the memory
word to be written via the Data In bus
•Clock input (CLK)
•CLK input is a factor ONLY during write operation
•During read operation, behaves as a combinational logic
block: Address valid  Data Out valid after “access time”

CS61C L29 Datapath I 13 Garcia, Nikolić, Fall 2018 © UCB


Administrivia

CS61C L29 Datapath I 14 Garcia, Nikolić, Fall 2018 © UCB


State Required by RV32I ISA
Each instruction reads and updates this state during execution:
•Registers (x0..x31)
• Register file (regfile) Reg holds 32 registers x 32 bits/register: Reg[0]..Reg[31]
• First register read specified by rs1 field in instruction
• Second register read specified by rs2 field in instruction
• Write register (destination) specified by rd field in instruction
• x0 is always 0 (writes to Reg[0]are ignored)
•Program Counter (PC)
• Holds address of current instruction
•Memory (MEM)
• Holds both instructions & data, in one 32-bit byte-addressed memory space
• We’ll use separate memories for instructions (IMEM) and data (DMEM)
 These are placeholders for instruction and data caches
• Instructions are read (fetched) from instruction memory (assume IMEM read-only)
• Load/store instructions access data memory
CS61C L29 Datapath I 15 Garcia, Nikolić, Fall 2018 © UCB
Review: Complete RV32I ISA

Not in CS61C

•Need datapath and control to implement these instructions

CS61C L29 Datapath I 16 Garcia, Nikolić, Fall 2018 © UCB


Implementing the add instruction
31 25 24 20 19 15 14 12 11 76 0
funct7 rs2 rs1 funct3 rd opcode
7 5 5 3 5 7

0000000 rs2 rs1 000 rd 0110011

add rs2 rs1 add rd Reg-Reg OP


add rd, rs1, rs2
•Instruction makes two changes to machine’s state:
•Reg[rd] = Reg[rs1] + Reg[rs2]
•PC = PC + 4

CS61C L29 Datapath I 17 Garcia, Nikolić, Fall 2018 © UCB


Datapath for add
PC = PC + 4 Reg[rd] = Reg[rs1] + Reg[rs2]

+4
Add

DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
+
AddrB DataB ALU
clk
IMEM Reg [ ]

Inst[31:0] clk

RegWriteEnable (RegWEn)
=1
Control logic

31 25 24 20 19 15 14 12 11 76 0
0000000 rs2 rs1 000 rd opcode
add 5 5 add 5 Reg-Reg OP
CS61C L29 Datapath I 18 Garcia, Nikolić, Fall 2018 © UCB
Timing Diagram for add
+4
Add

DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
+
Inst[24:20] Reg[rs2]
AddrB DataB ALU
clk
IMEM Reg [ ]

clock
Inst[31:0] clk
RegWEn
time
Clock
PC 1000 1004
PC+4 1004 1008
inst[31:0] add x1,x2,x3 add x6,x7,x9

Reg[rs1] Reg[2] Reg[7]

Reg[rs2] Reg[3] Reg[9]

alu Reg[2]+Reg[3] Reg[7]+Reg[9]

Reg[1] ??? Reg[2]+Reg[3]


CS61C L29 Datapath I 19 Garcia, Nikolić, Fall 2018 © UCB
Implementing the sub instruction
31 25 24 20 19 15 14 12 11 76 0
0000000 rs2 rs1 000 rd 0110011 add
0100000 rs2 rs1 000 rd 0110011 sub

sub rd, rs1, rs2


•Almost the same as add, except now have to subtract
operands instead of adding them
•inst[30] selects between add and subtract

CS61C L29 Datapath I 20 Garcia, Nikolić, Fall 2018 © UCB


Datapath for add/sub

+4
Add

DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
+
AddrB DataB ALU
clk
IMEM Reg [ ]

clk

Inst[31:0] RegWEn ALUSel


(1=Write, 0=NoWrite) (add=0/sub=1)
Control logic

CS61C L29 Datapath I 21 Garcia, Nikolić, Fall 2018 © UCB


Implementing other R-Format instructions
0000000 rs2 rs1 000 rd 0110011 add
0100000 rs2 rs1 000 rd 0110011 sub
0000000 rs2 rs1 001 rd 0110011 sll
0000000 rs2 rs1 010 rd 0110011 slt
0000000 rs2 rs1 011 rd 0110011 sltu
0000000 rs2 rs1 100 rd 0110011 xor
0000000 rs2 rs1 101 rd 0110011 srl
0100000 rs2 rs1 101 rd 0110011 sra
0000000 rs2 rs1 110 rd 0110011 or
0000000 rs2 rs1 111 rd 0110011 and

•All implemented by decoding funct3 and funct7


fields and selecting appropriate ALU function
CS61C L29 Datapath I 22 Garcia, Nikolić, Fall 2018 © UCB
Implementing I-Format - addi instruction

•RISC-V Assembly Instruction:


addi x15,x1,-50
31 20 19 15 14 12 11 76 0
imm[11:0] rs1 funct3 rd opcode
12 5 3 5 7

111111001110 00001 000 01111 0010011


imm=-50 rs1=1 add rd=15 OP-Imm

CS61C L29 Datapath I 23 Garcia, Nikolić, Fall 2018 © UCB


Datapath for add/sub

+4
Add

DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
+
AddrB DataB ALU
clk
IMEM Reg [ ]

clk

Immediate should
be here
Inst[31:0] RegWEn ALUSel
(1=Write, 0=NoWrite) (add=0/sub=1)
Control logic

CS61C L29 Datapath I 24 Garcia, Nikolić, Fall 2018 © UCB


Adding addi to Datapath

+4
Add

DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
+
AddrB DataB 0 ALU
clk
IMEM Reg [ ] 1

clk
Imm[31:0]

Inst[31:0] BSel ALUSel


RegWEn
(rs2=0/ (add=0/
(1=Write, 0=NoWrite)
Control logic Imm=1) sub=1)

CS61C L29 Datapath I 25 Garcia, Nikolić, Fall 2018 © UCB


Adding addi to Datapath

+4
Add

DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
+
AddrB DataB 0 ALU
clk
IMEM Reg [ ] 1

Inst Bsel = 1
[31:20] clk
Imm.
Gen Imm[31:0]

Inst[31:0] RegWEn=1 BSel ALUSel=


ImmSel
(rs2=0/ add
=I
Control logic Imm=1)

CS61C L29 Datapath I 26 Garcia, Nikolić, Fall 2018 © UCB


I-Format immediates
-inst[31]-
31 30 20 19 15 14 12 11 76 0
imm[11:0] rs1 funct3 rd opcode
12
inst[31:0]

------inst[31]-(sign-extension)------- inst[30:20]
imm[31:0]
inst[31:20] imm[31:0] • High 12 bits of instruction (inst[31:20]) copied
Imm.
Gen to low 12 bits of immediate (imm[11:0])
• Immediate is sign-extended by copying value
ImmSel=I of inst[31] to fill the upper 20 bits of the
immediate value (imm[31:12])
CS61C L29 Datapath I 27 Garcia, Nikolić, Fall 2018 © UCB
R+I Datapath

+4
Add

DataD
Inst[11:7]
PC AddrD Reg[rs1]
addr Inst[19:15]
pc+4 inst AddrA DataA alu
Inst[24:20] Reg[rs2]
+
AddrB ALU
clk DataB 0
Works for all other I-format
IMEM Reg [ ] 1
arithmetic instructions
Inst
[31:20] Imm.
clk (slti,sltiu,andi,
Gen Imm[31:0] ori,xori,slli,srli,
srai) just by
changing ALUSel
Inst<31:0> ImmSel RegWEn BSel ALUSel

Control logic

CS61C L29 Datapath I 28 Garcia, Nikolić, Fall 2018 © UCB


Peer Instruction

1) We should use the main ALU to


compute PC=PC+4 in order to 123
FFF
save some gates FFT
FTF
2) The ALU is a synchronous state FTT
element TFF
TFT
3) Program counter is a register TTF
TTT

CS61C L29 Datapath I 29 Garcia, Nikolić, Fall 2018 © UCB


“And In conclusion…”

• CPU design involves datapath and control logic


• Typical five stages of exectution
• We built a datapath for arithmetic and logic
for RISC-V R and I instructions
• And sketched out control
• Will add data memory access (loads, stores),
branches, jumps, etc…

CS61C L29 Datapath I 30 Garcia, Nikolić, Fall 2018 © UCB

You might also like