Professional Documents
Culture Documents
Modulo13 RiscV DDCArv Ch7
Modulo13 RiscV DDCArv Ch7
Computer Architecture
Sarah Harris & David Harris
Chapter 7:
Microarchitecture
Chapter 7 :: Topics
• Introduction
• Performance Analysis
• SingleCycle Processor
• Multicycle Processor
• Pipelined Processor
• Advanced Microarchitecture
• Definitions:
– CPI: Cycles/instruction
– clock period: seconds/cycle
– IPC: instructions/cycle = IPC
• Challenge is to satisfy constraints of:
– Cost
– Power
– Performance
5 Digital Design & Computer Architecture Microarchitecture
RISCV Processor
• Consider subset of RISCV instructions:
– Rtype ALU instructions:
• add, sub, and, or, slt
– Memory instructions:
• lw, sw
– Branch instructions:
• beq
CLK CLK
CLK
WE3 WE
PCNext PC A1 RD1
A RD 5 32
32 32 32 32
A RD
Instruction 32 32
A2 RD2 Data
Memory 5 32
5
A3 Memory
Register
WD3 WD
32 File 32
SingleCycle
RISCV Processor
SingleCycle RISCV Processor
• Datapath
• Control
Example Program:
Address Instruction Type Fields Machine Language
imm11:0 rs1 f3 rd op
0x1000 L7: lw x6, -4(x9) I 111111111100 01001 010 00110 0000011 FFC4A303
imm11:5 rs2 rs1 f3 imm4:0 op
0x1004 sw x6, 8(x9) S 0000000 00110 01001 010 01000 0100011 0064A423
funct7 rs2 rs1 f3 rd op
0x1008 or x4, x5, x6 R 0000000 00110 00101 110 00100 0110011 0062E233
imm12,10:5 rs2 rs1 f3 imm4:1,11 op
0x100C beq x4, x4, L7 B 1111111 00100 00100 000 10101 1100011 FE420AE3
I-Type
31:20 19:15 14:12 11:7 6:0
imm11:0 rs1 funct3 rd op
12 bits 5 bits 3 bits 5 bits 7 bits
CLK CLK
CLK
WE3 WE
PCNext PC Instr A1 RD1
A RD
0x1000
0xFFC4A303
A RD
Instruction
A2 RD2 Data
Memory
A3 Memory
Register
WD3 WD
File
CLK CLK
CLK
19:15 WE3 0x2004 WE
PCNext PC Instr A1 RD1
A RD 9
0x1000
0xFFC4A303
A RD
Instruction
A2 RD2 Data
Memory
A3 Memory
Register
WD3 WD
File
CLK CLK
CLK
19:15 WE3 0x2004 WE
PCNext PC Instr A1 RD1
A RD 9
0x1000
0xFFC4A303
A RD
Instruction
A2 RD2 Data
Memory
A3 Memory
Register
WD3 WD
File
ImmExt
31:20
Extend 0xFFFFFFFC
0xFFC
ALU
0x1000
0xFFC4A303
A RD
Instruction
A2 RD2 SrcB 0x00002000 Data
Memory
A3 Memory
Register
WD3 WD
File
ImmExt
31:20
Extend 0xFFFFFFFC
0xFFC
ALU
0x1000
0xFFC4A303
A RD
Instruction
A2 RD2 SrcB 0x00002000 Data 10
Memory 11:7
A3 Memory
6 Register
WD3 WD
File
ImmExt
31:20
Extend 0xFFFFFFFC
0xFFC
RegWrite ALUControl2:0
1 000
CLK CLK
CLK
19:15 WE3 0x2004 SrcA WE
PCNext PC Instr A1 RD1
A RD 9 ReadData
ALUResult
ALU
0x1004
0x1000
0xFFC4A303
A RD
Instruction
A2 RD2 SrcB 0x00002000 Data 10
Memory 11:7
A3 Memory
6 Register
WD3 WD
File
ImmExt
31:7
Extend 0xFFFFFFFC
PCPlus4 0xFFC
+
+
SingleCycle
Datapath: Other
Instructions
SingleCycle Datapath: sw
• Immediate: now in {instr[31:25], instr[11:7]}
• Add control signals: ImmSrc, MemWrite
RegWrite ImmSrc ALUControl2:0 MemWrite
0 1 000 1
CLK CLK
CLK
19:15 WE3 0x2004 SrcA WE
PCNext PC Instr A1 RD1
A RD 9 ReadData
ALUResult
ALU
0x1008
0x1004
0x0064A423
10 A RD
Instruction 24:20
A2 RD2 SrcB 0x0000200C Data
Memory 11:7 6
A3 Memory
Register WriteData
WD3 WD
File
ImmExt
31:7
Extend 0x00000008
PCPlus4 0x008
+
+
I-Type
31:20 19:15 14:12 11:7 6:0
imm11:0 rs1 funct3 rd op
12 bits 5 bits 3 bits 5 bits 7 bits
S-Type
31:25 24:20 19:15 14:12 11:7 6:0
imm11:5 rs2 rs1 funct3 imm4:0 op
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
ALU
0x100C 14
0x1008
0x0062E233
10 A RD 1
Instruction 24:20 14
A2 RD2 0 SrcB Data
Memory 11:7 6
A3 1 10 Memory
4 Register WriteData
WD3 WD
14 File
ImmExt
31:7
Extend 0x00000008
PCPlus4
+
+
Result
4
ALU
1 0x1000
0x100C
0xFE420AE3
14 A RD 1
Instruction 24:20
A2 RD2 0 SrcB 0 Data
Memory 11:7 4
A3 1 14 Memory
Register WriteData
WD3 WD
File
PCTarget
+
ImmExt 0x1000
31:7
Extend 0xFFFFFFF4
PCPlus4 0xFFA
+
Result
4
0x1010
S-Type
31:25 24:20 19:15 14:12 11:7 6:0
imm11:5 rs2 rs1 funct3 imm4:0 op
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
B-Type
31:25 24:20 19:15 14:12 11:7 6:0
imm12,10:5 rs2 rs1 funct3 imm4:1,11 op
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
CLK CLK
CLK
19:15 WE3 SrcA Zero WE
0 PCNext PC Instr A1 RD1 0
A RD ReadData
ALUResult
ALU
1 A RD 1
Instruction 24:20
A2 RD2 0 SrcB Data
Memory 11:7
A3 1 Memory
Register WriteData
WD3 WD
File
PCTarget
+
ImmExt
31:7
Extend
PCPlus4
+
Result
4
SingleCycle
Control
SingleCycle Control
HighLevel View LowLevel View
Zero PCSrc
Branch
PCSrc
Control
ResultSrc ResultSrc
Unit
MemWrite Main MemWrite
Instr 6:0
op6:0 Decoder ALUSrc
op ALUControl2:0 5
14:12 ImmSrc1:0
funct3 ALUSrc
30 RegWrite
funct75 ImmSrc1:0
Zero Zero RegWrite ALUOp1:0
ALU
funct32:0 ALUControl2:0
Decoder
funct75
3 lw 1 00 1 0 1 0 00
35 sw 0 01 1 1 X 0 00
51 Rtype 1 XX 0 0 0 0 10
99 beq 0 10 0 0 X 1 01
Zero PCSrc
Branch
ResultSrc
Main MemWrite
op6:0 Decoder ALUSrc
5
ImmSrc1:0
RegWrite
ALUOp1:0
ALU
funct32:0 ALUControl2:0
Decoder
funct75
000 add
001 subtract
N
010 and
ALUControl0
011 or N
Cout
101 SLT +
[N-1] Sum
ZeroExt N N N N N
101 011 010 001 000
3 ALUControl
N
Result
ResultSrc
Main MemWrite
op6:0 Decoder ALUSrc
5
ImmSrc1:0
RegWrite
ALUOp1:0
ALU
funct32:0 ALUControl2:0
Decoder
funct75
ALUOp1:0
op5
ALU
funct32:0 ALUControl2:0
Decoder
funct75
51 Rtype 1 XX 0 0 0 0 10
PCSrc
Control
ResultSrc
Unit
MemWrite
6:0
op ALUControl2:0
14:12
funct3 ALUSrc
30
funct75 ImmSrc1:0
Zero RegWrite
CLK CLK
0 CLK 1 xx 0 010 0 0
19:15 WE3 SrcA Zero WE
0 PCNext PC Instr A1 RD1 0
A RD ReadData
ALUResult
ALU
1 A RD 1
Instruction 24:20
A2 RD2 0 SrcB Data
Memory 11:7
A3 1 Memory
Register WriteData
WD3 WD
File
PCTarget
+
ImmExt
31:7
Extend
PCPlus4
+
Result
4
Extending the
SingleCycle
Processor
Extended Functionality: IType ALU
Enhance the singlecycle processor to handle IType
ALU instructions: addi, andi, ori, and slti
• Similar to Rtype instructions
• But second source comes from immediate
• Change ALUSrc to select the immediate
• And ImmSrc to pick the correct immediate
3 lw 1 00 1 0 1 0 00
35 sw 0 01 1 1 X 0 00
51 Rtype 1 XX 0 0 0 0 10
99 beq 0 10 0 0 X 1 01
19 Itype 1 00 1 0 0 0 10
19 Itype 1 00 1 0 0 0 10
PCSrc
Control
ResultSrc
Unit
MemWrite
6:0
op ALUControl2:0
14:12
funct3 ALUSrc
30
funct75 ImmSrc1:0
Zero RegWrite
CLK CLK
0 CLK 1 00 1 000 0 0
19:15 WE3 SrcA Zero WE
0 PCNext PC Instr A1 RD1 0
A RD ReadData
ALUResult
ALU
1 A RD 1
Instruction 24:20
A2 RD2 0 SrcB Data
Memory 11:7
A3 1 Memory
Register WriteData
WD3 WD
File
PCTarget
+
ImmExt
31:7
Extend
PCPlus4
+
Result
4
ResultSrc1:0
Main MemWrite
Decoder
op6:0 ALUSrc
5
ImmSrc1:0
RegWrite
ALUOp1:0
ALU
funct32:0 ALUControl2:0
Decoder
funct75
I-Type B-Type
31:20 19:15 14:12 11:7 6:0 31:25 24:20 19:15 14:12 11:7 6:0
imm11:0 rs1 funct3 rd op imm12,10:5 rs2 rs1 funct3 imm4:1,11 op
12 bits 5 bits 3 bits 5 bits 7 bits 7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
S-Type J-Type
31:25 24:20 19:15 14:12 11:7 6:0 31:12 11:7 6:0
PCSrc
Control
ResultSrc1:0
Unit
MemWrite
6:0
op ALUControl2:0
14:12
funct3 ALUSrc
30
funct75 ImmSrc1:0
Zero RegWrite
CLK CLK
CLK
19:15 WE3 SrcA Zero WE
0 PCNext PC Instr A1 RD1
A RD ReadData 00
ALUResult
ALU
1 A RD 01
Instruction 24:20 10
A2 RD2 0 SrcB Data
Memory 11:7
A3 1 Memory
Register WriteData
WD3 WD
File
PCTarget
+
ImmExt
31:7
Extend
PCPlus4
+
Result
4
SingleCycle
Performance
Processor Performance
Program Execution Time
= (#instructions)(cycles/instruction)(seconds/cycle)
= # instructions x CPI x TC
CLK CLK
CLK
19:15 WE3 SrcA Zero WE
0 PCNext PC Instr A1 RD1
A RD ReadData 00
ALUResult
ALU
1 A RD 01
Instruction 24:20 10
A2 RD2 0 SrcB Data
Memory 11:7
A3 1 Memory
Register WriteData
WD3 WD
File
PCTarget
+
ImmExt
31:7
Extend
PCPlus4
+
Result
4