FSM Design
Sequential Systems
• Every synchronous sequential design can be
classified as a “finite-state” machine
- all that really means is that it has a finite number of flip-flop/
registers in the design
• Intel Itanium2 has more than 2^27 state bits and more
than 2^(2^27) distinguishable states
- No. There is not a gigantic state-transition diagram somewhere
- Yes. There are control FSMs (the way you understand FSMs now)
✦ may be many tens of states in the largest ones
✦ more complicated control exists as many cooperating FSMs
- The majority of the “logic” is in the datapath
✦ very stylistic usage of state and logic
✦ a very different way of design
Motivation: serial adder example
• Add two N-bit numbers in serial
- unsigned numbers appear (A_i, B_i)
[Figure: serial adder block with serial inputs ...Ai and ...Bi and clock Clk; outputs done, Cout, and the N-bit sum]
[Figure: state-transition diagram starting at init, with arcs labeled by the input pair (0,0 / 0,1 / 1,0 / 1,1) and states labeled by the operand bits seen so far ("0+0", "0+1", "1+0", "1+1", "00+01", "01+00", ..., "000+100", "001+101")]
[Figure: Full Adder with inputs ...ai, ...bi, and ci, and outputs co and s; s feeds sum]
• Still need something to “control the system”
• Use an FSM for control
- at reset, the carry flip-flop gets cleared (perhaps also the sum bits)
- you need to “count” from 0 to N-1 after reset
- when the FSM reaches N-1, you need to signal “done”
- only N states and log2 N state bits!!
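The control described above can be sketched as follows. This is a minimal sketch, not the slides' design: the module name `serial_adder`, the port names, the active-low asynchronous reset, and the choice to wrap around and start a new addition after bit N-1 are all assumptions. The counter is the N-state FSM, and `done` is high while the last bit pair is on the inputs.

```systemverilog
// Sketch only: names, reset style, and wrap-around behavior are assumptions.
// Requires N >= 2.
module serial_adder
  #(parameter N = 8)
  (input  logic         clock, reset_L,
   input  logic         a_i, b_i,      // operand bits, least significant first
   output logic [N-1:0] sum,
   output logic         cout, done);

  logic [$clog2(N)-1:0] count;         // control FSM state: index of current bit
  logic carry;                         // carry flip-flop
  logic s, co;                         // full adder outputs

  // Full adder
  assign s  = a_i ^ b_i ^ carry;
  assign co = (a_i & b_i) | (carry & (a_i ^ b_i));

  assign done = (count == N-1);        // last bit pair is being added
  assign cout = co;                    // final carry, meaningful while done is high

  always_ff @(posedge clock, negedge reset_L)
    if (~reset_L) begin
      count <= '0;
      carry <= 1'b0;
      sum   <= '0;
    end
    else begin
      sum <= {s, sum[N-1:1]};          // shift each new sum bit in at the top
      if (done) begin                  // wrap: next addition starts with no carry
        count <= '0;
        carry <= 1'b0;
      end
      else begin
        count <= count + 1'b1;
        carry <= co;
      end
    end
endmodule: serial_adder
```

In this sketch the complete sum is in the register after the clock edge that ends the `done` cycle, and it stays valid until the next edge shifts in the first bit of the following addition.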
These are called “datapaths”
• Datapaths
- Data is transformed or selected by combinational logic
- e.g. sum = sum + InputA
• What is “sum” in this case?
- The state of a computation
sum=0;
for (i=0; i<2; i++)
    sum += InputA;
[Figure: 6-bit datapath example with an Input and the constant 6'b1]
[Figure: datapath. A 16-bit register (inputs D, ld_L, cl_L, clock; output Q) holds sum; an Adder combines InputA with Q.]
[Figure: control FSM. reset_L enters state A (ld_L = 1, cl_L = 0), followed by state B (ld_L = 0, cl_L = 1) and state C (ld_L = 0, cl_L = 1); the states are held in flip-flops A, B, and C clocked by Clk.]
Generalize on what we just did
• FSM-D — A finite state machine with a datapath
- The finite state machine is what we’ve been studying
- A datapath is combinational logic and registers that can do
computation (sometimes spelled data-path, or data path)
- What senses and controls the computation?
✦ The FSM
[Figure: FSM and Datapath blocks, both driven by clock and reset]
What’s this Look Like in SysVerilog?
module top
  #(parameter W = 16)
  …
  fsm c1 (clock, rst_L, ld_L, cl_L);
endmodule: top

…
  always_comb
    unique case (cs)
      A: begin //load zero
        ns = B;
        cl_L = 0; ld_L = 1;
      end
      B: begin //add input
        ns = C;
        cl_L = 1; ld_L = 0;
      end
      C: begin //add input
        ns = A;
        cl_L = 1; ld_L = 0;
      end
    endcase

  always_ff @(posedge clk, negedge rst_L)
    if (~rst_L) cs <= A;
    else cs <= ns;
endmodule: fsm
[Figure: 16-bit datapath register (cl_L, Q) fed from InputA, beside the state diagram: reset_L enters A (ld_L = 1, cl_L = 0), then B (ld_L = 0, cl_L = 1), then C (ld_L = 0, cl_L = 1)]
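The datapath half of this design is not legible in the extracted slide, so here is a minimal sketch of what it might look like. The module name `sum_datapath`, its port list, and the synchronous clear are assumptions; only the active-low `ld_L` and `cl_L` control points, the 16-bit width, and `InputA` come from the slides.

```systemverilog
// Sketch only: module name, ports, and synchronous clear are assumptions.
module sum_datapath
  #(parameter W = 16)
  (input  logic         clock,
   input  logic         ld_L, cl_L,    // active-low control points from the FSM
   input  logic [W-1:0] InputA,
   output logic [W-1:0] sum);

  logic [W-1:0] d;
  assign d = sum + InputA;             // the Adder

  always_ff @(posedge clock)
    if (~cl_L)      sum <= '0;         // state A: clear the register
    else if (~ld_L) sum <= d;          // states B and C: load the adder output
endmodule: sum_datapath
```

With the A, B, C sequence above, this register would hold 0 after state A, InputA after state B, and 2 × InputA after state C.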
Termination
• Our solution doesn't exactly match our specification
- i.e. the code snippet
sum=0;
for (i=0; i<2; i++)
    sum += InputA;
[Figure: two state diagrams. Left, "Computes eternally": A (ld_L = 1, cl_L = 0), B (ld_L = 0, cl_L = 1), C (ld_L = 0, cl_L = 1), then back to A. Right, "Computes once, holds answer eternally": the same A, B, C, followed by a Stop state (ld_L = 1, cl_L = 1).]
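The "computes once" version could be coded roughly as below. This is a sketch under assumptions: the module name `fsm_stop` and the port list are invented for illustration, and only the state names and the ld_L/cl_L values per state follow the slide.

```systemverilog
// Sketch only: module name and ports are assumptions.
module fsm_stop
  (input  logic clock, rst_L,
   output logic ld_L, cl_L);

  enum logic [1:0] {A, B, C, Stop} cs, ns;

  always_comb begin
    // Defaults: nothing asserted (both control points are active low)
    ld_L = 1; cl_L = 1; ns = cs;
    unique case (cs)
      A:    begin ns = B;    cl_L = 0; end   // sum = 0
      B:    begin ns = C;    ld_L = 0; end   // sum += InputA
      C:    begin ns = Stop; ld_L = 0; end   // sum += InputA
      Stop: ns = Stop;                       // hold the answer
    endcase
  end

  always_ff @(posedge clock, negedge rst_L)
    if (~rst_L) cs <= A;
    else        cs <= ns;
endmodule: fsm_stop
```

Because Stop asserts neither control point and never leaves, the register keeps the result until the next reset.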
A Thorough Example
Ones Counter
[Figure: Ones Counter datapath with 5-bit registers (clr_L) and a comparator (inputs A and B, output eq)]
- This register will count from zero to 30. The comparator will tell us when we’re done
- This register will count the number of 1s we see
Start Piecing the System Together
• Datapath inputs and outputs
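[Figure: Ones Counter system. The datapath takes the 30-bit d_in, provides lowbit, and holds a 5-bit Shift Count Register (clr_L, inc_L) whose value a comparator (A, B) checks against 5'd30 to produce done; its output is the 5-bit d_out. The FSM has inputs clock, reset, and d_in_ready and output d_out_ready.]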
• Datapath control points
- Inputs to the datapath used by FSM to control the datapath
• Datapath status points
- Values in datapath used by the FSM on state transitions
• Hook up the rest of the inputs and outputs
The FSM — state by state
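[Figure: datapath with control points. The Shift Count Register (Cclr_L, Cinc_L) outputs the 5-bit SC, which a comparator (A, B) checks against 5'd30 to produce done; the 30-bit Shift Register (Sload_L, Sshift_L, input D) outputs lowbit; the Ones Count Register (Oclr_L, Oinc_L) holds the 5-bit count.]
[State diagram: Reset enters A; A loops on ~d_in_ready; A goes to B on d_in_ready / Cclr_L, Sload_L, Oclr_L]
• When we get to this state (B), what will be the values in the registers?
Note: we’re only showing the output signals asserted in each state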
FSM — arc by arc
[State diagram, new arc: B loops on lowBit & (~done) / Oinc_L, Cinc_L, Sshift_L]
- If the low bit is 1, and the shift count is not 30, increment the counters and shift
FSM — arc by arc
[State diagram, new arc: B loops on ~lowBit & (~done) / Cinc_L, Sshift_L]
- If the low bit is 0 and the shift count is not 30, inc the shift counter and shift. Don’t enable One’s Count
And a final arc
[State diagram, new arc: B returns to A on done / D_out_ready]
- When the shift count is 30, signal D_out_ready
Specify the Main Module
module OnesCount
  #(parameter w = 30)
  (input logic d_in_ready,
   input logic clock, reset,
   output logic d_out_ready,
   input logic [w-1:0] d_in,
   output logic [$clog2(w)-1:0] d_out);
  // ceiling of log2 of w
  //instantiate FSM and Datapath components here
endmodule: OnesCount
Note: $clog2(i) is a system function (indicated by the “$”) that calculates the ceiling of log2(i)
[Figure: Ones Counter block containing FSM and Datapath, with inputs d_in_ready, d_in, clock, and reset, and outputs d_out_ready and d_out]
FSM SystemVerilog: State A
module fsm #(…) (clock, reset, … );
endmodule: fsm
FSM SystemVerilog: State B
module fsm #(…) (clock, reset, … );
  enum logic {A = 1'b0, B = 1'b1}
       cur_state, n_state;

  always_comb begin
    case (cur_state)
      A: begin //State A
        …
      B: begin //State B
        n_state = (done)? A : B;
        dor = (done)? 1 : 0;
        Cclr_L = 1;
        Sload_L = 1;
        Oclr_L = 1;
        Cinc_L = (done) ? 1 : 0;
        Sshift_L = (done) ? 1 : 0;
        Oinc_L = (done)? 1:~lowBit;
      end
    endcase
  end
  …
endmodule: fsm
[State diagram: Reset enters A; A loops on ~d_in_ready; A goes to B on d_in_ready / Cclr_L, Sload_L, Oclr_L; B loops on lowBit & (~done) / Oinc_L, Cinc_L, Sshift_L and on ~lowBit & (~done) / Cinc_L, Sshift_L; B returns to A on done / D_out_ready]
More…
module OnesCount
  #(parameter w = 30)
  (input logic d_in_ready, clock, reset,
   input logic [w-1:0] d_in,
   output logic dor,
   output logic [$clog2(w)-1:0] d_out);

  // Status and control points between the FSM and the datapath
  logic lowBit, done, Cclr_L, Cinc_L;
  logic Sload_L, Sshift_L, Oclr_L;
  logic Oinc_L;

  logic [$clog2(w)-1:0] SC;

  fsm #(w) control (.*);

  ShiftReg_PISO_Right #(w) sr (lowBit,
      d_in, clock, Sload_L, Sshift_L);

  counter #($clog2(w)) sc (clock, Cclr_L,
      Cinc_L, SC);

  compare #($clog2(w)) cmp (, done, , SC,
      'd30);

  counter #($clog2(w)) oct (clock,
      Oclr_L, Oinc_L, d_out);
endmodule: OnesCount

module fsm
  #(parameter w = 30)
  (input logic clock, reset, done,
   input logic d_in_ready, lowBit,
   input logic [$clog2(w)-1:0] SC,
   output logic Cclr_L, Cinc_L,
                Sload_L, Sshift_L,
                Oclr_L, Oinc_L, dor);

  enum logic {A = 1'b0, B = 1'b1}
       cur_state, n_state;

  always_comb begin
    case (cur_state)
      A: begin //State A
        …
    endcase
  end

  always_ff @(posedge clock,
             posedge reset)
  begin
    if (reset) cur_state <= A;
    else cur_state <= n_state;
  end

endmodule: fsm
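The slides instantiate `counter`, `compare`, and `ShiftReg_PISO_Right` without showing them. The following are hypothetical sketches of those components: the port order is inferred from the positional connections in `OnesCount`, while the port names, the synchronous clear and load, and the three comparator outputs are assumptions.

```systemverilog
// Sketches only: these modules are not defined in the slides.
// Port order is inferred from the instantiations; the rest is assumed.
module counter
  #(parameter W = 5)
  (input  logic         clock, clr_L, inc_L,    // active-low controls
   output logic [W-1:0] Q);
  always_ff @(posedge clock)
    if (~clr_L)      Q <= '0;
    else if (~inc_L) Q <= Q + 1'b1;
endmodule: counter

module compare
  #(parameter W = 5)
  (output logic         AltB, AeqB, AgtB,
   input  logic [W-1:0] A, B);
  assign AltB = (A <  B);
  assign AeqB = (A == B);   // connected to done in OnesCount
  assign AgtB = (A >  B);
endmodule: compare

module ShiftReg_PISO_Right
  #(parameter W = 30)
  (output logic         lowBit,
   input  logic [W-1:0] D,
   input  logic         clock, load_L, shift_L);
  logic [W-1:0] Q;
  assign lowBit = Q[0];                          // bit examined by the FSM
  always_ff @(posedge clock)
    if (~load_L)       Q <= D;                   // parallel load
    else if (~shift_L) Q <= {1'b0, Q[W-1:1]};    // shift right, fill with 0
endmodule: ShiftReg_PISO_Right
```

With these definitions, the arc out of state A clears both counters and loads the shift register on one clock edge, and state B then needs 30 cycles before SC equals 30 and `done` goes high.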
Trace a Transition
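[Figure: timing diagram with waveforms for CLK, d_in_ready, Cclr_L, and Sload, shown beside state A of the state diagram (Reset enters A; A loops on ~d_in_ready)]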
An Alternate Approach
• Lose the Shift Count Reg/Comp
- Let’s just have 30 states where we do the shifting and then just return to the A state
- I got tired, and didn’t draw all 30 of them!
[State diagram: Reset enters A; A loops on ~d_in_ready; A goes to B on d_in_ready / Cclr_L, Sload_L, Oclr_L; B, C, etc. each advance to the next state on ~lowBit / Sshift_L or on lowBit / Oinc_L, Sshift_L]
[Figure: datapath with a 30-bit Shift Register (Sload_L, Sshift_L, input D) and the Ones Count Register (Oclr_L, Oinc_L) with 5-bit output OC]
How Different Will This Be?
• Pick a state encoding
- How about A = 00000, B = 00001, C = 00010, D = 00011, E = 00100, …
- How would our design be different?
Two Approaches
• These two state transition diagrams suggest two
ways of envisioning a controller
- Exclude all of the states where you are just counting from the FSM
✦ Treat the counter as something else to control and monitor for
when it’s done
- Include all of the states, i.e. ones where you’re just counting, in the
FSM
✦ This was our second approach
• Comparison
- Excluding the counter states
✦ Smaller, simpler FSM to design — functional partitioning
✦ Give the synthesis tool a smaller thing to design
- Including all of the states
✦ Bigger, more complex FSM to design
✦ Let a synthesis tool wrestle with a state encoding
Cooperating FSMs
• Turns out, they’re about the same
• One view: Cooperating FSMs
- Control FSM + shift-count FSM
✦ two separate FSMs each with simple-to-think-about control
sequences
✦ compose them to form the more elaborate control sequences
More Alternates
• Don’t use a shift register
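[Figure: the 30-bit DIN Register (load_L, input D) feeds a 32-to-1 MUX whose sel input is BCount, the 5-bit output of the Bit Count Register (Cclr_L, Cinc_L); the selected bit goes to the FSM as “lowBit”. A comparator (A, B) checks BCount against 5'd30 to produce done. The Ones Count Register (Oclr_L, Oinc_L) holds the result.]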
Here’s Another
• Who needs a state machine?
[Figure: the 30-bit DIN Register feeding a network of adders]
- Load the DIN register, and several gate delays later you’ll have the answer
• Comparison
- The “all combinational” version of these circuits are generally fewer …
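A purely combinational ones counter could be written as below. This is a sketch, not the slide's circuit: the module name `ones_comb` is an assumption, and the output is sized with `$clog2(w+1)` so that a count of w always fits. A synthesis tool would typically unroll the loop into adders, though the exact adder structure it builds is up to the tool.

```systemverilog
// Sketch only: module name and output sizing are assumptions.
module ones_comb
  #(parameter w = 30)
  (input  logic [w-1:0]           d_in,
   output logic [$clog2(w+1)-1:0] d_out);

  always_comb begin
    d_out = '0;
    for (int i = 0; i < w; i++)
      d_out += d_in[i];          // one adder per input bit after unrolling
  end
endmodule: ones_comb
```

There is no clock, no reset, and no FSM here; the answer settles some gate delays after `d_in` changes.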
… And Another
if (ShiftReg & 1) Ocount++;
[Figure: 30-bit datapath with 30 AND gates and a comparator (inputs A and B, output eq) that drives done]
0101001000 ← A value
0101000111 ← That value - 1
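One reading of this slide, offered as an assumption since the figure did not survive extraction: ANDing a value with that value minus 1 clears its lowest 1 bit (0101001000 & 0101000111 = 0101000000), so a register that repeats this step needs one clock per 1 bit, and a comparison with zero signals done. The module name `ones_clear_lowest` and all port names below are hypothetical.

```systemverilog
// Sketch only: one interpretation of the slide; all names are assumptions.
module ones_clear_lowest
  #(parameter w = 30)
  (input  logic                   clock, load_L,
   input  logic [w-1:0]           d_in,
   output logic [$clog2(w+1)-1:0] d_out,
   output logic                   done);

  logic [w-1:0] v;

  assign done = (v == '0);               // comparator: no 1s left

  always_ff @(posedge clock)
    if (~load_L) begin                   // start: capture input, clear count
      v     <= d_in;
      d_out <= '0;
    end
    else if (~done) begin
      v     <= v & (v - 1'b1);           // w AND gates clear the lowest 1
      d_out <= d_out + 1'b1;             // one more 1 counted
    end
endmodule: ones_clear_lowest
```

For the value shown on the slide, this sketch would take three cycles after the load, one for each 1 bit, instead of 30 shifts.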
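[Figure: FSM-D block diagram. The FSM (next state and output logic) and the Datapath (ALUs, MUXes, comparators, etc.) each have inputs, outputs, clock, and reset; FSM outputs connect to datapath inputs and datapath outputs connect to FSM inputs.]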
Notes on Hardware Thread Design
• Design the computational machinery (datapath)
separate from the control (FSM) machinery
- Keep control points and status points straight
- Make sure the FSM inputs status points and outputs control points
• Datapath should be structured as RTL
- Registers hold values
- At clock edges, values are transferred from a register, through
combinational circuitry, into a (usually different) register
- Might even list transformations during design as:
✦ Register A ➙ Register B
✦ Register C + Register D ➙ Register C
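The two transfers listed above might be written as follows. This is a minimal sketch: the module name, the active-low enables `ldB_L` and `ldC_L`, the clear `clr_L`, and the treatment of registers A and D as inputs from elsewhere in the datapath are all assumptions made to keep the example small.

```systemverilog
// Sketch only: names, enables, and the clear are assumptions.
module rtl_transfers
  #(parameter W = 16)
  (input  logic         clock,
   input  logic         clr_L, ldB_L, ldC_L,   // active-low control points
   input  logic [W-1:0] A, D,                  // values held in registers A and D
   output logic [W-1:0] B, C);

  always_ff @(posedge clock) begin
    if (~ldB_L) B <= A;                 // Register A -> Register B

    if (~clr_L)      C <= '0;
    else if (~ldC_L) C <= C + D;        // Register C + Register D -> Register C
  end
endmodule: rtl_transfers
```

Each transfer happens only at a clock edge and only when the FSM asserts the matching control point, which is what keeps the control and datapath roles separate.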