Professional Documents
Culture Documents
1
5.1
Introduction
A Basic MIPS Implementation
• We're ready to look at an implementation of the MIPS
• Simplified to contain only:
– memory-reference instructions: lw, sw
– arithmetic-logical instructions: add, sub, and, or, slt
– control flow instructions: beq, j
• Generic Implementation:
– use the program counter (PC) to supply instruction address
– get the instruction from memory
– read registers
– use the instruction to decide exactly what to do
• All instructions use the ALU after reading the registers
Why? memory-reference? arithmetic? control
flow?
2
• Memory Reference Instruction uses
the ALU for an Address Calculation.
• ALU Reference Instruction uses the
ALU for the Operation Execution.
• Branch Instruction uses ALU for
Comparison.
• A memory Reference Instruction has access the memory
either to read data for a load (or) write data for store.
• An ALU Instruction (or) Load Instruction has to write the
data from ALU (or) memory back in to register.
• Finally for branch instruction, we may need to change the
next instruction address based on the comparison,
otherwise we have to increment the PC by 4 to get the
address of the next instruction.
An Overview of the Implementation
• Missing Multiplexers, and some Control lines for read and write.
5
An Overview of the Implementation
• Datapath
– Elements that process data and addresses within the CPU
• Register file, ALUs, Adders, Instruction and Data
Memories, …
We need functional units (datapath elements) for:
1. Fetching instructions and incrementing the PC.
2. Execute arithmetic-logical instructions: add, sub, and, or, and slt
3. Execute memory-reference instructions: lw, sw
4. Execute branch/jump instructions: beq, j
1. Fetching instructions and incrementing the PC.
8
Continue
9
Continue
10
4. Execute branch/jump instructions: beq, j
11
Creating a Single Datapath
12
Continue
13
5.4 A Simple Control Implementation
Scheme
The ALU Control
14
Designing the Main Control Unit
15
Continue
16
Continue
17
Finalizing the Control
18
Continue
19
Continue
20
Example: Implementing Jumps
21
Why a Single-Cycle Implementation Is Not Used Today
22
Continue
memory (200ps),
ALU and adders (100ps),
register file access (50ps)
CPU clock cycle (option 2) = 400 45% + 60025% + 550 10% + 350 15% + 2005%
= 447.5 ps.
600
Performance ratio =
447.5 1.34
21
5.5 A Multicycle Implementation
24
Continue
Replacing the three ALUs of the single-cycle by a single ALU means that the
single ALU must accommodate all the inputs that used to go to the three
different ALUs.
25
Continue
26
Continue
27
Continue
28
Breaking the Instruction Execution into Clock Cycles
IR <= Memory[PC];
PC <= PC + 4;
29
Breaking the Instruction Execution into Clock Cycles
IR <= Memory[PC];
To do this, we need:
MemRead ➔Assert
IRWrite ➔ Assert
IorD ➔ 0
--------------------------
-----
PC <= PC + 4;
ALUSrcA ➔ 0
ALUSrcB ➔ 01
ALUOp ➔ 00 (for add)
PCSource ➔ 00
PCWrite ➔ set
The increment
of the PC and
30
Breaking the Instruction Execution into Clock Cycles
A <= Reg[IR[25:21]];
B <= Reg[IR[20:16]];
ALUOut <= PC + (sign-extend(IR[15-0] << 2 );
31
A <= Reg[IR[25:21]];
B <= Reg[IR[20:16]];
Since A and B are
overwritten on every
cycle ➔ Done
The register file access and computation of branch target occur in parallel.
32
Breaking the Instruction Execution into Clock Cycles
Memory reference:
ALUOut <= A + sign-extend(IR[15:0]);
Arithmetic-logical instruction:
ALUOut <= A op B;
Branch:
if (A == B) PC <= ALUOut;
Jump:
PC <= { PC[31:28], (IR[25:0], 2’b00) };
33
Memory reference:
ALUOut <= A + sign-extend(IR[15:0]);
ALUSrcA = 1 && ALUSrcB = 10
ALUOp = 00
Arithmetic-logical instruction:
ALUOut <= A op B;
ALUSrcA = 1 && ALUSrcB = 00
ALUOp = 10
Branch:
if (A == B) PC <= ALUOut;
ALUSrcA = 1 && ALUSrcB = 00
ALUOp = 01 (for subtraction)
PCSource = 01
PCWriteCond is asserted
Jump:
PC <= { PC[31:28],
(IR[25:0],2’b00) };
PCWrite is asserted
PCSource = 10
34
Breaking the Instruction Execution into Clock Cycles
Memory reference:
MDR <= Memory [ALUOut]; MemRead, IorD=1
or
Memory [ALUOut] <= B; MemWrite, IorD=1
35
Breaking the Instruction Execution into Clock Cycles
36
Continue
37
Defining the Control
38
Example Continue
CPI
CPU clock cycles
Instruction count i CPIi
Instruction count
CPI Instruction counti i
CPI
The ratio
Instruction counti
Instruction count
is simply the instruction frequency for the instruction class i. We can therefore
substitute to obtain:
This CPI is better than the worst-case CPI of 5.0 when all instructions take the
same number of clock cycles.
39
Defining the Control (Continue)
40
Defining the Control (Continue)
41
Defining the Control (Continue)
42
5.6 Exceptions
• Exceptions
• Interrupts
43
How Exception Are Handled
44
How Control Checks for Exception
45
Continue
The finite state machine with the additions to handle exception detection
47