February 2017
ENCM 369 Winter 2017 Slide Set 6 for Lecture Section 01 slide 2/97
Contents
Introduction to Chapter 7 of the textbook
Components of synchronous, sequential logic systems
The PC, Register File, and Memory Units
A model for Register File internals
Textbook Section 7.2: Performance Analysis
Single-cycle processor: Overview
Details of datapaths for the single-cycle machine
Control for the single-cycle machine
Single-cycle timing example: LW instruction
More instructions, and next steps
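[Figure: two D flip-flop symbols, each with input D and output Q; the second has a “bubble” on its clock input.]
Positive-edge-triggered DFF: the state Q copies the input D on each rising clock edge.
Negative-edge-triggered DFF: the state Q copies the input D on each falling clock edge.
(The “bubble” symbol indicates inversion of a signal.)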
All of the DFFs we saw in ENEL 353 in Fall 2016 were
positive-edge-triggered. We’ll see in Section 7.5 that sometimes
it’s useful to have state updates on negative clock edges.
D flip-flops: Applications
A 32-bit register
[Figure: 32 DFFs, from D31/Q31 down to D0/Q0, all driven by one CLK signal.]
This gets updated once per clock cycle on positive clock edges.
Each DFF receives the same CLK input.
The diagram shows the structure but is awkward to draw, so we’ll use this compact symbol:
[Figure: compact register symbol with a CLK input, a 32-bit input D31:0, and a 32-bit output Q31:0.]
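To make the update rule concrete, here is a minimal behavioural sketch of the 32-bit register. The class name `Register32` and the `tick` method are hypothetical, chosen only for this illustration; the sketch models signal values, not gate-level timing.

```python
class Register32:
    """Hypothetical model of a 32-bit positive-edge-triggered register."""

    MASK = 0xFFFF_FFFF

    def __init__(self, initial=0):
        self.q = initial & self.MASK   # state held by the 32 DFFs
        self._prev_clk = 0             # last CLK level seen

    def tick(self, clk, d):
        """Apply a CLK level and a D value; Q copies D only on a rising edge."""
        if self._prev_clk == 0 and clk == 1:
            self.q = d & self.MASK
        self._prev_clk = clk
        return self.q


reg = Register32()
reg.tick(0, 0x1234)   # CLK low: Q keeps its old value
low_q = reg.q
reg.tick(1, 0x1234)   # rising edge: Q copies D
high_q = reg.q
reg.tick(1, 0xABCD)   # CLK stays high: no edge, so no update
held_q = reg.q
```

All 32 bits share one clock, so a single edge test covers the whole word.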
Wires
A wire connects an output bit of some element to the input
bit(s) of one or more elements.
To keep things simple, we’ll model signalling over wires as
happening without delay. But keep in mind that in
real-world design of high-speed circuits, accounting for wire
delays can be very important.
The PC
[Figure: PC register symbol with a CLK input, a 32-bit input PC', and a 32-bit output PC.]
Instruction Memory
[Figure: Instruction Memory symbol with a 32-bit address input A and a 32-bit read-data output RD.]
[Figure: Register File symbol. Inputs: 5-bit addresses A1, A2, and A3; 32-bit write data WD3; write enable WE3. Outputs: 32-bit read data RD1 and RD2.]
Data Memory
[Figure: Data Memory symbol. Inputs: CLK, write enable WE, 32-bit address A, 32-bit write data WD. Output: 32-bit read data RD.]
Again, note the CLK input. This, like the Register File, is a
synchronous sequential element. State updates can happen
only on rising edges of CLK.
Let’s make some notes about how this element behaves.
[Figure: a 32-bit register with an enable input EN (registerEN), drawn as DFFs D31:0/Q31:0 and as a compact symbol.]
[Figure: registerEN symbol with CLK, EN, 32-bit input D31:0, and 32-bit output Q31:0.]
[Figure: Register File symbol with inputs A1, A2, A3, WD3, WE3 and outputs RD1, RD2.]
[Figure: R-File write logic. The 32-bit R-File WD3 input feeds the D inputs of registers GPR31 down to GPR01, each a registerEN clocked by CLK; enable signals Y31 down to Y1 choose which GPR is updated.]
With the goal of saving some time, we’re not going to go into
detail here.
One possible arrangement is to use two (large!) 32-bit
32:1 bus multiplexers. The 5-bit select inputs to the bus
muxes would be the A1 and A2 R-File inputs. The first bus
mux would use A1 to select one of 32 32-bit GPR values to
copy to the RD1 output of the R-File, and the second bus mux
would do the same thing with A2 and RD2.
[Figure: Register File symbol with inputs A1, A2, A3, WD3, WE3 and outputs RD1, RD2.]
Reminder:
• The write logic is sequential: a GPR update can only
  happen in response to an active clock edge.
• The read logic is combinational: when A1 or A2
  changes, RD1 or RD2 will change without waiting for a
  clock edge.
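The split between sequential writes and combinational reads can be sketched as a small behavioural model. The class and method names below are hypothetical, and treating GPR 0 as hard-wired to zero is an assumption based on the write-logic diagram, which shows only GPR01 through GPR31.

```python
class RegisterFile:
    """Hypothetical behavioural model of the 32 x 32-bit R-File."""

    def __init__(self):
        self.gpr = [0] * 32

    def read(self, a1, a2):
        """Combinational read: RD1 and RD2 follow A1 and A2 immediately."""
        return self.gpr[a1 & 31], self.gpr[a2 & 31]

    def rising_edge(self, we3, a3, wd3):
        """Sequential write: call this once per active clock edge."""
        a3 &= 31
        if we3 and a3 != 0:   # assumption: GPR 0 always reads as zero
            self.gpr[a3] = wd3 & 0xFFFF_FFFF


rf = RegisterFile()
rf.rising_edge(we3=1, a3=17, wd3=0xDEADBEEF)   # write happens on the edge
rf.rising_edge(we3=0, a3=18, wd3=0x12345678)   # WE3 = 0: no update
rd1, rd2 = rf.read(17, 18)                     # reads need no clock edge
```

The list indexing in `read` plays the role of the two 32:1 bus multiplexers.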
[Figure: one CLK cycle. During the cycle the datapath generates the result(s) of the current instruction; by the end of the cycle the result(s) are ready.]
The first datapath we’ll look at is the datapath for LW. After
that we’ll move on to SW, R-type instructions, and BEQ.
Before we start on LW, we’ll need a few more datapath
elements—32-bit adders, a 16-to-32-bit sign-extend unit, and
a 32-bit ALU (arithmetic/logic unit).
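As a quick illustration of two of these elements, the sketch below models a 32-bit adder and a 16-to-32-bit sign-extend unit on unsigned Python integers. The function names are hypothetical; the adder simply discards any carry out of bit 31.

```python
MASK32 = 0xFFFF_FFFF


def add32(a, b):
    """32-bit adder: the sum wraps around modulo 2**32."""
    return (a + b) & MASK32


def sign_extend_16_to_32(imm16):
    """Copy bit 15 of a 16-bit value into bits 31:16 of the result."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:
        return imm16 | 0xFFFF_0000
    return imm16


small_positive = sign_extend_16_to_32(0x0004)
small_negative = sign_extend_16_to_32(0xFFFC)   # 16-bit pattern for -4
wrapped_sum = add32(0xFFFF_FFFF, 1)
```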
Things to note:
• The carry-in to the LSB is 0.
[Figure: ALU symbol. Inputs: 32-bit A31:0 and B31:0, 3-bit ALUControl. Outputs: 32-bit Y31:0 and 1-bit Zero.]
ALU examples
For each of the examples in the table, what will the outputs
Y and Zero be?
example A B ALUControl
(1) 0x0000_0002 0x0000_0003 001
(2) 0x0000_0002 0x0000_0003 010
(3) 0xffff_ffff 0x0000_0000 111
(4) 0x0000_002a 0x0000_002a 110
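One way to check answers to examples like these is a small ALU model. The ALUControl encoding used here (000 AND, 001 OR, 010 add, 110 subtract, 111 set-less-than) is an assumption taken from the textbook's usual convention; it is not spelled out in this section. The function names are hypothetical.

```python
MASK32 = 0xFFFF_FFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x8000_0000 else x


def alu(a, b, alu_control):
    """Return (Y, Zero) for the assumed 3-bit ALUControl encoding."""
    if alu_control == 0b000:
        y = a & b
    elif alu_control == 0b001:
        y = a | b
    elif alu_control == 0b010:
        y = a + b
    elif alu_control == 0b110:
        y = a - b
    elif alu_control == 0b111:
        y = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("ALUControl value not covered by this sketch")
    y &= MASK32
    return y, int(y == 0)


# The four rows of the examples table, in order.
results = [
    alu(0x0000_0002, 0x0000_0003, 0b001),
    alu(0x0000_0002, 0x0000_0003, 0b010),
    alu(0xFFFF_FFFF, 0x0000_0000, 0b111),
    alu(0x0000_002A, 0x0000_002A, 0b110),
]
```

Zero is 1 exactly when all 32 bits of Y are 0, whichever operation was selected.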
SW
bits 31:26   bits 25:21    bits 20:16   bits 15:0
101011       pointer GPR   source GPR   offset
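The field boundaries can be checked with shifts and masks. This is a sketch only: `decode_sw` and `encode_sw` are hypothetical helpers, and the example registers (GPR 8 as pointer, GPR 17 as source) are chosen arbitrarily.

```python
SW_OPCODE = 0b101011


def encode_sw(pointer_gpr, source_gpr, offset):
    """Pack the four SW fields into one 32-bit instruction word."""
    return ((SW_OPCODE << 26) | ((pointer_gpr & 0x1F) << 21)
            | ((source_gpr & 0x1F) << 16) | (offset & 0xFFFF))


def decode_sw(instr):
    """Split a 32-bit instruction word into the SW fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,        # bits 31:26
        "pointer_gpr": (instr >> 21) & 0x1F,   # bits 25:21
        "source_gpr": (instr >> 16) & 0x1F,    # bits 20:16
        "offset": instr & 0xFFFF,              # bits 15:0
    }


word = encode_sw(pointer_gpr=8, source_gpr=17, offset=8)
fields = decode_sw(word)
```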
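[Figure: instruction fetch. PC supplies the 32-bit instruction address to input A of the Instruction Memory; the RD output is the 32-bit Instr, split into fields 25:21, 20:16, 15:11, and 15:0.]
[Figure: LW datapath around the Register File and Data Memory. The Data Memory address input A comes from the ALU, Instr20:16 drives the R-File A3 input, and the Data Memory RD output drives the R-File WD3 input.]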
LW datapath: PC update
At the same time LW is doing its job of copying a word from
Data Memory to the Register File, an update to the PC must
be generated. What does the symbol 4 mean in this
schematic?
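[Figure: PC update. The PC register is clocked by CLK; its 32-bit output goes to I-Mem and to a 32-bit adder whose other input is 4; the adder output feeds back to the register as PC'.]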
[Figure: complete single-cycle datapath with Control Unit. Control signals shown: MemtoReg, MemWrite, Branch, ALUControl2:0, PCSrc, ALUSrc, RegDst, RegWrite. Datapath signals shown: PC', PC, Instr, SrcA, SrcB, Zero, ALUResult, WriteData, ReadData, WriteReg4:0, SignImm, PCPlus4, PCBranch, Result.]
Image is Figure 7.11 from Harris D. M. and Harris S. L., Digital Design
and Computer Architecture, 2nd ed.,
© 2013, Elsevier, Inc.
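[Figure: a 32-bit 2:1 bus multiplexer (inputs A31:0 and B31:0, output F31:0) and a 5-bit 2:1 bus multiplexer (inputs C4:0 and D4:0, output G4:0), each with select input S.]
F31:0 = A31:0 if S = 0, B31:0 if S = 1
G4:0 = C4:0 if S = 0, D4:0 if S = 1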
R-type instructions
GPR read and ALU use for LW, SW, and R-type
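[Figure: Instr25:21 and Instr20:16 drive the R-File A1 and A2 inputs. RD1 goes to the ALU; RD2 goes to input 0 of a 2:1 mux controlled by ALUSrc and also to the D-Mem WD input. The ALU, controlled by the 3-bit ALUControl, produces Zero and ALUResult.]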
Behaviour:
if source GPRs are equal
    PC' = (PC + 4) + 4 × sign-extended offset
else
    PC' = PC + 4
We already have a datapath to compute PC + 4. We’ll need to
add features to get (PC + 4) + 4 × sign-extended offset.
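The branch-target arithmetic can be sketched directly from the behaviour stated above. The function names are hypothetical; multiplying by 4 is done as a left shift by 2, and results are kept to 32 bits.

```python
MASK32 = 0xFFFF_FFFF


def sign_extend_16_to_32(imm16):
    """Copy bit 15 of a 16-bit value into bits 31:16."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF_0000 if imm16 & 0x8000 else imm16


def beq_next_pc(pc, gpr_a, gpr_b, offset16):
    """Return PC' for a BEQ at address pc with the given source GPR values."""
    pc_plus_4 = (pc + 4) & MASK32
    if gpr_a == gpr_b:
        sign_imm = sign_extend_16_to_32(offset16)
        return (pc_plus_4 + (sign_imm << 2)) & MASK32
    return pc_plus_4


taken_forward = beq_next_pc(0x0040_0000, 7, 7, 0x0003)    # skip 3 instructions
taken_backward = beq_next_pc(0x0040_0000, 7, 7, 0xFFFF)   # offset of -1
not_taken = beq_next_pc(0x0040_0000, 7, 8, 0x0003)
```

An offset of -1 sends the PC back to the BEQ itself, because the offset is applied to PC + 4, not to PC.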
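[Figure: BEQ datapath. A 2:1 mux selects PC' from either PC + 4 or the output of a second adder, which adds PC + 4 to the sign-extended Instr15:0 shifted left by 2. Instr25:21 and Instr20:16 drive the R-File A1 and A2 inputs, and the ALU compares the two GPR values to produce Zero.]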
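[Figure: Control Unit split into two parts. The Main Decoder takes Instr31:26 and produces MemtoReg, MemWrite, Branch, ALUSrc, RegDst, RegWrite, and the 2-bit ALUOp. The ALU Decoder takes Instr5:0 and ALUOp and produces the 3-bit ALUControl.]
Let’s write some rules for the 2-bit ALUOp signal.
What are the dimensions for each part, if each of the two parts is a ROM?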
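A possible set of ALU Decoder rules is sketched below. The encodings are assumptions based on the textbook's usual convention rather than anything stated in this section: ALUOp 00 means add, 01 means subtract, and 10 means "look at the funct field". The funct values are the standard MIPS ones for ADD, SUB, AND, OR, and SLT.

```python
# Assumed ALUControl codes: 010 add, 110 subtract, 000 AND, 001 OR, 111 SLT.
FUNCT_TO_ALU_CONTROL = {
    0b100000: 0b010,   # ADD
    0b100010: 0b110,   # SUB
    0b100100: 0b000,   # AND
    0b100101: 0b001,   # OR
    0b101010: 0b111,   # SLT
}


def alu_decoder(alu_op, funct):
    """Map the 2-bit ALUOp and 6-bit funct field to a 3-bit ALUControl."""
    if alu_op == 0b00:
        return 0b010   # add, e.g. for address calculation
    if alu_op == 0b01:
        return 0b110   # subtract, e.g. for an equality comparison
    if alu_op == 0b10:
        return FUNCT_TO_ALU_CONTROL[funct]
    raise ValueError("ALUOp value not covered by this sketch")


slt_control = alu_decoder(0b10, 0b101010)
load_store_control = alu_decoder(0b00, 0b000000)   # funct is ignored here
branch_control = alu_decoder(0b01, 0b000000)       # funct is ignored here
```

If this decoder were built as a ROM, its address would be the 8 bits formed by ALUOp and funct, and each word would hold the 3-bit ALUControl.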
[Figure: Control Unit with Main Decoder and ALU Decoder.]
For this example SLT instruction, what does the Main Decoder do?
What does the ALU Decoder do?
[Figure: Control Unit with Main Decoder and ALU Decoder.]
For this example LW instruction, what does the Main Decoder do?
What does the ALU Decoder do?
[Figure: Control Unit with Main Decoder and ALU Decoder.]
For this example BEQ instruction, what does the Main Decoder do?
What does the ALU Decoder do?
Instruction  RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp
R-type       1         1       0       0       0         0         10
LW           1         0       1       0       0         1         00
SW           0         X       1       0       1         X         00
BEQ          0         X       0       1       0         X         01
Exercise: Make a blank version of this table, then fill it in by
looking at Figure 7.11 and deciding what all the signal values
should be.
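For checking a filled-in table, the Main Decoder can be written as a lookup keyed by opcode. This sketch reads the table columns in the order RegWrite, RegDst, ALUSrc, Branch, MemWrite, MemtoReg, ALUOp, which is an assumption about the slide's layout. Don't-care entries (X) are stored as `None`. The opcodes are the standard MIPS values; only the SW opcode appears elsewhere in this section.

```python
SIGNALS = ("RegWrite", "RegDst", "ALUSrc", "Branch",
           "MemWrite", "MemtoReg", "ALUOp")

X = None   # don't-care

MAIN_DECODER_ROWS = {
    0b000000: (1, 1, 0, 0, 0, 0, 0b10),   # R-type
    0b100011: (1, 0, 1, 0, 0, 1, 0b00),   # LW
    0b101011: (0, X, 1, 0, 1, X, 0b00),   # SW
    0b000100: (0, X, 0, 1, 0, X, 0b01),   # BEQ
}


def main_decoder(opcode):
    """Return a dict of control-signal values for a 6-bit opcode."""
    return dict(zip(SIGNALS, MAIN_DECODER_ROWS[opcode]))


lw_controls = main_decoder(0b100011)
sw_controls = main_decoder(0b101011)
```

Each X marks a signal whose value cannot affect any state update for that instruction; for example, RegDst is irrelevant for SW because RegWrite is 0.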
[Figure: timing diagram for LW against CLK, with time marks 1 to 7. Signals shown: PC output, Instruction, R-File outputs, ALU result, D-Mem RD output, $s1 contents.]
Which clock speeds work for LW, and which ones do not?
[Figure: three candidate clock waveforms, labelled “fast”, “slow”, and “medium”, each drawn against time marks 1 to 6.]
It’s assumed that I-Mem and D-Mem have the same delay,
tmem, so the overall combinational delay simplifies to
It’s assumed that R-File updates will work correctly if its WD3
(write data) input is ready no later than tRFsetup (R-File setup
time) in advance of a rising clock edge.
So for safe operation of an LW instruction:
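As a rough numerical sketch of this kind of constraint, the block below assumes the LW critical path runs through the PC clock-to-Q delay, I-Mem, an R-File read, the ALU, D-Mem, one result mux, and the R-File setup time. Only tmem and tRFsetup are named in this section; every other delay name and all of the numeric values (in picoseconds) are illustrative assumptions.

```python
def lw_min_period_ps(d):
    """Smallest safe clock period for LW under the assumed critical path."""
    combinational = (d["t_pcq_PC"] + d["t_mem"] + d["t_RFread"]
                     + d["t_ALU"] + d["t_mem"] + d["t_mux"])
    return combinational + d["t_RFsetup"]


def clock_works(period_ps, d):
    """True if WD3 is ready at least t_RFsetup before the next rising edge."""
    return period_ps >= lw_min_period_ps(d)


# Illustrative delay values only; not taken from these slides.
delays_ps = {
    "t_pcq_PC": 30,
    "t_mem": 250,      # same delay assumed for I-Mem and D-Mem
    "t_RFread": 150,
    "t_ALU": 200,
    "t_mux": 25,
    "t_RFsetup": 20,
}

min_period = lw_min_period_ps(delays_ps)
slow_ok = clock_works(1000, delays_ps)
fast_ok = clock_works(800, delays_ps)
```

With these made-up numbers, any clock period shorter than `min_period` is too fast for LW; a longer period works but wastes time in every cycle.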
Moving on