Digital System Design with PLDs and FPGAs


Advanced Digital System Design

Kuruvilla Varghese
DESE
Indian Institute of Science

Synchronous Sequential Circuit

• Structure
• Design
• Timing analysis

Synchronous Counter

• Mod 6 Counter (0, 1, 2, 3, 4, 5, 0, …)


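[Figure: next-state logic takes the present state PS and produces the next state NS, which feeds the D inputs of the state register (D, CK, Q, AR). The register output Q is PS. Clock drives CK and Reset drives AR.]

Present state   Next state
Q2 Q1 Q0        D2 D1 D0
000             001
001             010
…               …
101             000

Di = Fi (Q2, Q1, Q0)
NS = Funct (PS)

• Mod 6 Counter (0, 5, 3, 2, 1, 4, 0 …) ?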
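
A minimal VHDL sketch of the Mod 6 counter (0, 1, 2, 3, 4, 5, 0, …) described by the table. The entity name mod6_counter, the signal name ps and the active-high asynchronous reset are assumptions made for illustration, not taken from the slides:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mod6_counter is
  port (
    clk : in  std_logic;
    rst : in  std_logic;                     -- asynchronous reset (AR), assumed active high
    q   : out std_logic_vector(2 downto 0)   -- present state Q2 Q1 Q0
  );
end entity mod6_counter;

architecture rtl of mod6_counter is
  signal ps : unsigned(2 downto 0);          -- present state register
begin
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= (others => '0');
    elsif rising_edge(clk) then
      if ps = 5 then
        ps <= (others => '0');               -- last row of the table: 101 -> 000
      else
        ps <= ps + 1;                        -- NS = Funct (PS)
      end if;
    end if;
  end process;

  q <= std_logic_vector(ps);
end architecture rtl;
```

For the second sequence (0, 5, 3, 2, 1, 4, 0 …) only the next-state logic would change; the register stays the same.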

Output Waveform

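[Figure: CLK and PS waveforms; PS steps through 0, 1, 2, 3, changing tco after each active clock edge.]

• There are 3 tco values, one each for Q2, Q1 and Q0; the worst case is taken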

Detailed and Next Level

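[Figure: detailed view with three D flip-flops (outputs Q2, Q1, Q0) sharing CLK on CK and RST on AR; next-level view with the three flip-flops drawn as a single register between the next-state logic (NS) and the present state (PS).]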

Mod - 6 Counter with UP-DN/

[Figure: the counter block diagram with an extra input UP-DN/ to the next-state logic; state register (D, CK, Q, AR) with Clock and Reset.]

Mod - 6 Counter with input UP-DN/

Di = Fi (Q2, Q1, Q0, UP-DN/)
NS = Funct (PS, Inputs)

Inputs    Present state   Next state
UP-DN/    Q2 Q1 Q0        D2 D1 D0
0         000             101
0         001             000
…         …               …
0         101             100
1         000             001
1         001             010
…         …               …
1         101             000

• Inputs
– Count-by-2
– Reset
– Skip-3
– Load, din 2:0
• Synchronous Inputs
– Transferred to output on Clock
– Goes to the Input (D) of FF
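
The table above can be sketched in VHDL as follows. This is only an illustration: the names mod6_updn, up_dn and ps, and the active-high asynchronous reset, are assumptions. Following the table, UP-DN/ = 1 counts up and UP-DN/ = 0 counts down:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mod6_updn is
  port (
    clk, rst : in  std_logic;
    up_dn    : in  std_logic;                    -- UP-DN/: '1' counts up, '0' counts down
    q        : out std_logic_vector(2 downto 0)  -- present state Q2 Q1 Q0
  );
end entity mod6_updn;

architecture rtl of mod6_updn is
  signal ps : unsigned(2 downto 0);
begin
  -- NS = Funct (PS, Inputs): the input is sampled on the active clock edge
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= (others => '0');
    elsif rising_edge(clk) then
      if up_dn = '1' then
        if ps = 5 then ps <= "000"; else ps <= ps + 1; end if;  -- 101 -> 000
      else
        if ps = 0 then ps <= "101"; else ps <= ps - 1; end if;  -- 000 -> 101
      end if;
    end if;
  end process;

  q <= std_logic_vector(ps);
end architecture rtl;
```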

Asynchronous

• Can we remove the Flip Flops and use buffers instead ?

[Figure: the next-state logic / state register block diagram (CLK, RST), with example state codes 001, 011 and 010 marked on the feedback path.]

Asynchronous

• Yes
• Unbalanced Path delays
• Races
• Difficult to design / control
• Fast

Maximum Frequency

[Figure: next-state logic and state register with Clock and Reset; waveforms of CLK, PS and NS marking tco, tcomb, ts, th and Tclk.]

Min Clock period / Max frequency
Tclk(min) > tco(max) + tcomb(max) + ts(max)
fmax < 1 / Tclk(min)
slack = Tclk(min) – (tco(max) + tcomb(max) + ts(max))

Avoid Hold time Violation
tco(min) + tcomb(min) > th(max)
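
As an illustration of the two conditions, take some assumed values (these numbers are not from the slides): tco(max) = 2 ns, tcomb(max) = 5 ns, ts(max) = 1 ns, tco(min) = 1 ns, tcomb(min) = 0.5 ns, th(max) = 0.5 ns, and a chosen clock period of 10 ns:

```latex
\begin{aligned}
T_{clk(min)} &> t_{co(max)} + t_{comb(max)} + t_{s(max)} = 2 + 5 + 1 = 8\ \text{ns} \\
f_{max} &< 1 / (8\ \text{ns}) = 125\ \text{MHz} \\
\text{slack} &= 10 - 8 = 2\ \text{ns} \quad (\text{at } T_{clk} = 10\ \text{ns}) \\
t_{co(min)} + t_{comb(min)} &= 1 + 0.5 = 1.5\ \text{ns} > t_{h(max)} = 0.5\ \text{ns}
\end{aligned}
```

With these assumed numbers both the minimum clock period and the hold condition are met.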

Max Frequency / Hold time Violation

• Earlier we have seen the basic timing parameter of a combinational circuit, i.e. tpd. Similarly, the basic timing parameters of flip-flops are ts, th and tco/tcq. All the other timing relations/parameters of digital circuits are built on these.
• For sequential circuits, the basic timing parameters are the minimum clock period / maximum frequency and the condition for avoiding hold time violation
• Our analysis assumes that the clock reaches all the flip-flops at the same instant (i.e. there is no clock skew). We will analyze the case with clock skew later.

Max Frequency / Hold time Violation

• When the minimum clock period condition is violated, it can be met by increasing the clock period
• When there is a hold time violation, you need to increase the combinational delay
• Since, in a flip-flop, tco is greater than th, a hold time violation cannot happen under our default assumption of no clock skew
• But it can happen when there is clock skew, and is most probable where the combinational delay is small or zero, as in a shift register

Number of Paths

• In the case of the Mod-6 counter there are 3 flip-flops. The total number of possible register-to-register paths is 9
• i.e. from each Qi to each Dj, for i, j: 0, 1, 2
• In general, the timing of any register-to-register path follows the same pattern; it need not be a synchronous counter
• e.g. a source register holding some data that goes to a combinational circuit for some computation, with the output (result) of the combinational circuit registered in a destination register

Register to Register Path

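[Figure: source register (D, Q) feeding a combinational circuit (Comb) that feeds the destination register (D, Q); both registers are clocked by CLK.]

Tclk(min) = tco(max) + tcomb(max) + ts(max) + slack

tco(min) + tcomb(min) > th(max)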

Setup, Hold Times with skew

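[Figure: a 2 ns delay in the data path between D’ and the flip-flop input D. At D the flip-flop has ts = 2 ns and th = 1 ns; referred to D’, ts’ = 4 ns and th’ = –1 ns.]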

Setup, Hold Times with skew

• Most often, the setup and hold times of flip-flops or registers need to be analyzed with respect to a pin or the output of another register.
• When there is a delay t in the path to the D input, the setup time with respect to the new reference point D' is increased by t and the hold time is decreased by t.
• In this case, the hold time can take a negative value. A hold time of –t means that at point D’, the data can be removed or changed a time t before the active clock edge.
• Note: Setup time is defined as the time before the clock edge by which the data has to be set up. So, for setup time a positive value is measured backward from the clock edge, and a negative value is forward from the clock edge. For hold time the reverse applies.

Setup, Hold Times with skew

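[Figure: a 3 ns delay in the clock path between CLK’ and the flip-flop clock input CK. At the flip-flop ts = 2 ns and th = 1 ns; referred to CLK’, ts’ = –1 ns and th’ = 4 ns.]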

Setup, Hold Times with skew

• When there is a delay t in the path to the CLK input, the setup time with respect to the new reference point CLK’ is decreased by t and the hold time is increased by t.
• In this case, the setup time can take a negative value. A setup time of –t means that at point CLK’, the data can be set up a time t after the active clock edge.
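
Both rules can be checked with the values shown in the two figures (ts = 2 ns and th = 1 ns at the flip-flop, a 2 ns delay in the D path, a 3 ns delay in the CLK path):

```latex
\begin{aligned}
\text{Delay } t = 2\ \text{ns in the D path:}\quad
  & t_s' = t_s + t = 2 + 2 = 4\ \text{ns},
  & t_h' = t_h - t = 1 - 2 = -1\ \text{ns} \\
\text{Delay } t = 3\ \text{ns in the CLK path:}\quad
  & t_s' = t_s - t = 2 - 3 = -1\ \text{ns},
  & t_h' = t_h + t = 1 + 3 = 4\ \text{ns}
\end{aligned}
```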

Flat Design: 60 Seconds Timer

[Figure: Clock → Divider → BCD Counter → Mod 6 Counter. Each counter drives a BCD-to-7-segment decoder and a 7-segment LED. POR (power-on reset) resets the circuit.]

Design Issues

• Accuracy
– Clock Frequency
• Area
– Clock frequency, Divider
– BCD, Mod-6 or Mod-60 counter ?
• Timing
– Max frequency – Divider
• Electrical Specifications
– 7 Segment LED driving

CPU Specifications

• Example 8 bit Micro-processor


8 bit ALU, Data Bus D7 – D0
4, 8 bit registers
16 Instructions, (4 bit op code, 2 bits each for src & dst Register)
64 KB Address space, Address lines A15 – A0
Program counter 16 bit, Stack Pointer 16 bit
No separate IO space
Controller – hard wired (Not Micro-coded)
De-multiplexed Address and Data bus
1 Interrupt

CPU Design

• Partition: Functional blocks with interfaces (signals)
• Top-down Design
• At each level
Specifications / Functional description
Timing specifications
Electrical specifications

Top-down Design

• Complex designs cannot be done in one shot. One needs to partition the design into logical blocks, each of which may have to be partitioned further, till one ends up with basic blocks like muxes, adders, incrementers, registers, decoders etc.
• This calls for domain knowledge, for proper partitioning, for identifying the interfaces and for deciding the timing detail at the interfaces

CPU Level 0

[Figure: CPU block with inputs CLK, RST and INTR, and signals RD/, WR/, A15:0 and D7:0.]

• Functional: Multi-cycle Execution, Instruction set, …


• Timing: Bus Cycle, Interrupt, Clock, Reset, …
• Electrical: Bus driving

CPU Level 1

[Figure: CPU Level 1 block diagram. On the data bus D7:0: instruction register (INST REG) and instruction decoder (INST DEC), temporary registers TR1 and TR2 feeding the ALU, registers REG A to REG D, program counter (PC) and stack pointer (SP). PC and SP drive A15:0 through the address select AD_S. The controller (inputs CLK, RST, INTR) generates the control signals:]

AL_S, AL_E
RA_L, RB_L, RC_L, RD_L
RA_E, RB_E, RC_E, RD_E
IR_L, TR1_L, TR2_L
PC_IS, PC_L0, PC_L1, PC_OS, PC_E
SP_IS, SP_L0, SP_L1, SP_OS, SP_E, AD_S

CPU Level 1

• Data Path
– Registers, Combinational Circuit
• Controller
– Finite State Machine (FSM)
– Registers, Combinational Circuit

Datapath

• The datapath is where data movement and computation happen. It usually comprises registers to hold the input/intermediate/final data values and combinational circuits implementing all the computations
• In the case of the CPU, the register file, the ALU with registers at its inputs, the Program Counter block and the Stack Pointer block form the datapath
• The controller provides the timing or control signals: to enable various outputs of registers, to give latch signals to registers, to specify the operation of combinational circuits, to select the various paths through which data moves (through muxes) etc.
• The controller does no computation; it merely provides the timing signals
• This helps us to partition the design into individual blocks: at the point of partitioning we need not bother about the sequence of operations or its timing, and can concentrate on the functionality in terms of computation, data movement etc.

Controller

• The controller does no computation; it merely provides the timing signals
• A single controller can provide the control signals for all the blocks that work synchronously
• A separate controller is required if there is another block whose operation is not synchronous to this block
• Also, even if all the blocks work synchronously with each other, multiple controllers or a hierarchy of controllers may be used to manage complexity

CPU Level 2: Registers

• Registers use flip-flops. The main control we require is to enable the register to latch the input data at the proper instant.
• Normally this is done on an active clock edge qualified by a level control signal from the controller (i.e. input data is latched into the register upon the active clock edge while the control signal, e.g. RA_L, is high).
• Such a scheme allows continuous latching of the input data if the control signal is kept high.

CPU Level 2: Registers

[Figure: the CPU Level 1 block diagram (data bus, registers, ALU, PC, SP, controller and its control signals), repeated for reference.]

CPU Level 2: Registers

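[Figure: 8-bit D register whose clock input CK is driven by RA_L AND CLK; the Q outputs drive D7:0 through tri-state gates enabled by RA_E.]

8-bit Register (8, Edge triggered flip-flops)
8 Tri-state gates
1, 2-input AND gate
Note: There is a timing issue in this scheme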

CPU Level 2: Registers

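[Figure: 8-bit D register clocked directly by CLK. A 2-to-1 multiplexer selected by RA_L feeds the D input with the new data (input 1) or the register's own output Q (input 0); the Q outputs drive D7:0 through tri-state gates enabled by RA_E.]

8-bit Register (8, Edge triggered flip-flops)
8 Tri-state gates
1, 8-bit 2 to 1 Multiplexer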
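
A hedged VHDL sketch of this register: the multiplexer is implied by the missing else branch, and the tri-state gates by the conditional assignment. The entity name reg_a and the separate din / dout ports are assumptions made for illustration (in the CPU both would connect to the shared data bus):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg_a is
  port (
    clk  : in  std_logic;
    ra_l : in  std_logic;                      -- latch control from the controller
    ra_e : in  std_logic;                      -- output enable from the controller
    din  : in  std_logic_vector(7 downto 0);   -- data in (from data bus D7:0)
    dout : out std_logic_vector(7 downto 0)    -- tri-state data out (to data bus D7:0)
  );
end entity reg_a;

architecture rtl of reg_a is
  signal q : std_logic_vector(7 downto 0);
begin
  -- The clock is never gated; RA_L only selects what the register loads.
  process (clk)
  begin
    if rising_edge(clk) then
      if ra_l = '1' then
        q <= din;        -- mux input 1: load new data
      end if;            -- mux input 0: q is re-circulated (implied)
    end if;
  end process;

  -- 8 tri-state gates enabled by RA_E
  dout <= q when ra_e = '1' else (others => 'Z');
end architecture rtl;
```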

Program Counter

• PC is incremented at every clock cycle, hence we need a 16-bit incrementer
• The PC output drives the address bus along with the stack pointer, so a 16-bit 2 to 1 multiplexer is required
• On instructions like jump, the address on the data bus has to be loaded into the PC. The data bus being 8-bit, this has to be done in 2 steps, which calls for the PC register to be two 8-bit registers with separate latch signals
• At reset and on interrupt, the PC has to be loaded with specific addresses; this calls for path selection at the input

Program Counter

• On instructions like call, the PC value has to go to the data bus (to memory), hence an 8-bit 2 to 1 mux is required at the output
• We don’t need to worry about the sequence of operations or the timing of the signals at this point; we need to identify the data movement and the various operations to be able to do the next level design
• Sequencing and timing will be done while designing the controller

CPU Level 2: Program Counter

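[Figure: program counter datapath. Input multiplexers selected by PC-IS choose among the data bus D7:0, the reset address PC-RST, the interrupt address PC-INT and the +1 incrementer output. Registers PCS(0), PC(1) and PC(0) are clocked by CLK with latch signals PC_L0 and PC_L. A multiplexer selected by AD-S chooses between the PC and the value from SP to drive A15:0. A multiplexer selected by PC-OS and tri-state gates enabled by PC-E drive D7:0.]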

CPU Level 2: Program Counter

• 3, 8-bit Registers
• 2, 8-bit 4 to 1 Multiplexers
• 1, 16 bit Incrementer
• 1, 16-bit 2 to 1 Multiplexers
• 1, 8-bit 2 to 1 Multiplexers
• 8 Tri-state gates

Design

• Flat Design
• Top-down design
• Bottom-up Design

• Functionality
• Timing
• Electrical Characteristics
• Power Dissipation

Controller

[Figure: the CPU Level 1 block diagram (data bus, registers, ALU, PC, SP, controller and its control signals), repeated for reference.]

Controller

• Instruction: Add A, B (A <= A + B)


– Move A to TR1
• Enable A output,
• Give latch signal to Register TR1, Disable A Output
– Move B to TR2
– Select ALU operation (Part of instruction can select
directly)
– Wait
– Move ALU output to Register A

Controller

• This identifies the macro steps, and the micro steps in each macro step
• A single pulse on signal RA_E can enable and disable the output of Register A
• The same pulse can be used to latch the data into register TR1 (TR1_L)

Controller Timing

[Figure: timing diagram over CLK with pulses in successive clock cycles on RA_E, TR1_L, then RB_E, TR2_L, then AL_E, RA_L.]

Controller

• Can’t be a combinational circuit
– A combinational circuit cannot produce a sequence at its output for just a single input such as the instruction
• Sequential Circuit
– What type of sequential circuit can generate such pulses?

Counter

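• Can we use a Counter to generate the timing pulses ?
• Mod 3 Counter ?

[Figure: the counter with output logic: next-state logic (Inputs and PS in, NS out), state register (D, CK, Q, AR) with Clock and Reset, and output logic decoding PS into Outputs.]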

Output Logic

Present State   Outputs
Q1 Q0           RA_E  TR1_L  RB_E  TR2_L  AL_E  RA_L
0  0            1     1      0     0      0     0
0  1            0     0      1     1      0     0
1  0            0     0      0     0      1     1

RA_E = f1 (Q1, Q0)
Similarly other outputs

Output = F (PS)
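
The Mod 3 counter and the output table can be sketched in VHDL as below. The entity name add_seq and the active-high asynchronous reset are assumptions for illustration; the output values follow the table row by row:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add_seq is
  port (
    clk, rst    : in  std_logic;
    ra_e, tr1_l : out std_logic;
    rb_e, tr2_l : out std_logic;
    al_e, ra_l  : out std_logic
  );
end entity add_seq;

architecture rtl of add_seq is
  signal ps : unsigned(1 downto 0);   -- present state Q1 Q0
begin
  -- Mod 3 counter: 00 -> 01 -> 10 -> 00
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= "00";
    elsif rising_edge(clk) then
      if ps = 2 then
        ps <= "00";
      else
        ps <= ps + 1;
      end if;
    end if;
  end process;

  -- Output logic: Output = F (PS)
  ra_e  <= '1' when ps = 0 else '0';
  tr1_l <= '1' when ps = 0 else '0';
  rb_e  <= '1' when ps = 1 else '0';
  tr2_l <= '1' when ps = 1 else '0';
  al_e  <= '1' when ps = 2 else '0';
  ra_l  <= '1' when ps = 2 else '0';
end architecture rtl;
```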

Controller Timing

[Figure: timing diagram over CLK with pulses in successive clock cycles on RA_E, TR1_L, then RB_E, TR2_L, then AL_E, RA_L.]

Generic ?

• Can we generate whatever timing signals are required ?

[Figure: timing diagram over CLK with a different pulse pattern on RA_E/TR1_L, RB_E/TR2_L and AL_E/RA_L.]

• Yes
RA_E = Q1/Q0/ + Q1Q0/
RB_E = Q1 EXOR Q0

State Assignment

• Do the counter's state changes need to be ordered ?

[Figure: next-state logic (Inputs and PS in, NS out), state register (D, CK, Q, AR) with Clock and Reset, and output logic decoding PS into Outputs.]

Generic ?

• Do the counter's state changes need to be ordered ?
– We care only about Inputs, Outputs
– State Assignment
– Could be of some use
• What does it mean to design an FSM ?
– Inputs, State Transitions, Outputs
– Next State Table, Output Table

FSM Idea

• Finite State Machine (FSM)


– A Counter goes through various states.
– States are decoded to generate various pulses to control
the data path. (e.g. select a mux, clock a latch, clock
enable to a register etc.)

Moore / Mealy

• Moore and Mealy outputs


– Some outputs are decoded from present state, and they
are called Moore outputs
– Some outputs are decoded from present state and inputs,
and they are called Mealy outputs

FSM: 3 Blocks view

[Figure: 3-block FSM: next-state logic (Inputs and PS in, NS out), state register (D, CK, Q, AR) with Clock and Reset, and output logic producing Outputs.]

NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Outputs = f (PS, Inputs)

FSM / Controller: 2 Blocks view

[Figure: 2-block FSM: a single logic block (Inputs and PS in; NS and Outputs out) and the state register (D, CK, Q, AR) with Clock and Reset.]

2 blocks

• If you look at the diagram of the FSM with 3 blocks, you can see that both the Next State Logic and the Output Logic use the Present State and Inputs to generate their outputs
• Hence we could view the FSM as 2 blocks, where one block is both logic blocks combined
• Such a view is useful for timing analysis and HDL coding

Maximum Frequency
[Figure: 2-block FSM (logic block and state register, with Inputs, Outputs, Clock and Reset) and waveforms of CLK, PS, NS and Outputs marking tcq, tNSL, ts, th and tcq + tOL.]

Tclk(min) > max (tcq(max) + tNSL(max) + ts(max), tcq(max) + tOL(max))
fmax < 1 / Tclk
slack = Tclk – (tcq(max) + tNSL(max) + ts(max))
tcq(min) + tNSL(min) > th(max)

Timing

• Timing analysis is the same as that of the synchronous counter, but for the maximum frequency of operation there are 2 categories of paths to consider: register to register and register to output
• In our analysis, we have considered the register to output path only up to the output, but in real life one has to consider the end point, i.e. if the output goes to another register input through a combinational circuit, then the whole path up to the destination needs to be considered
• The hold violation condition is the same as in the case of the synchronous counter

Controller Design

• Control Algorithm
– When one tries to implement a control algorithm using an FSM, one needs to decide the sequence of steps for the various input combinations, and the outputs at each step. This can be a textual description followed by a waveform
– The aim of the design is to come up with the truth tables of the NSL and OL. It is not easy to go directly from a textual description and/or waveform to truth tables
– Hence, we use a graphical tool called the state diagram (like a flow chart) to visualize the sequence of states, their transitions based on inputs, and the various outputs produced at each state.

State Diagram

• State Diagram
– States: Oval / Circle
– Transitions: Arrows
– Outputs: Output signal in a block associated with states
• Designing FSM
– Designing NSL, OL
– State Assignment

FSM: 3 Blocks view

[Figure: 3-block FSM: next-state logic (Inputs and PS in, NS out), state register (D, CK, Q, AR) with Clock and Reset, and output logic producing Outputs.]

NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Outputs = f (PS, Inputs)

State Diagram: States, Transition

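[Figure: state diagram notation. A state is drawn as a circle (S0). An unconditional transition is an unlabeled arrow (S0 to S1). A conditional transition is an arrow labeled with its condition: from S0, en leads to S1 and en/ leads to S2; or en/ loops back to S0 while en leads to S1.]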

State Diagram: Moore, Mealy Outputs

[Figure: two state diagrams, each with S0 looping on en/ and going to S1 on en.]

Moore Outputs
S0: rd/ = 1, latch = 0
S1: rd/ = 0, latch = 1

Mealy Outputs
S0: rd/ = 0, latch = en
S1: rd/ = 1, latch = 0

State Diagram: Example

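[Figure: state diagram.]
power_on: enter S0
S0 (prst = 0, shadct = 0, mcmuld = 0, sel = 0): stays in S0 on start/, goes to S1 on start
S1 (prst = 1, shadct = 0, mcmuld = 1, sel = 0): goes to S2
S2 (prst = 0, shadct = 1, mcmuld = 0, sel = r0): stays in S2 on max/, leaves S2 on max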

Next State Table

Inputs        Present State   Next State
start  max    Q1 Q0           D1 D0
0      X      0  0            0  0
1      X      0  0            0  1
X      X      0  1            1  0
X      0      1  0            1  0
X      1      1  0            1  1

• Next State Table can be easily written by looking at inputs and state transitions

Output Table

Present State   Outputs
Q1 Q0           prst  shadct  mcmuld  sel
0  0            0     0       0       0
0  1            1     0       1       0
1  0            0     1       0       r0

• Output table can be written looking at the outputs for each state

State Diagram and Logic

• The state diagram has all the information for the Next State Table and the Output Table
• If a state diagram is designed and coded, the tools can generate the tables, optimize them and implement them

ADC Controller

• Scenario
– Data Acquisition System
– ADC interfaced to Host processor
– Per sample interrupt costly for the host processor
– A temporary storage of samples
– A controller to control ADC and storage
– When storage is near full, interrupt the host

ADC Controller

[Figure: ADC (aclk, ain, soc, eoc, data) and host interface (data, hrd/, intr, start) with a temporary storage block (??) between them, supervised by the Controller.]

Temporary storage

• DPRAM
– Random access is not required
– Only one way data flow
– Complex for the application, costly
• FIFO
– Simple to use (No address bus)
– Enough for the application

Block Schematic
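[Figure: block schematic. Data path: ADC (aclk, ain, soc, eoc) with its data output on FIFO din; FIFO dout on the host data lines; host hrd/ drives frd/; FIFO ¾ full drives the host intr. Controller: inputs clk, rst, start, eoc; outputs soc, fwr/.]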

Assumptions

• FIFO 3/4 Full can be used for host interrupt
• Host processor can read the full FIFO in a short time through burst read.
• Each time the host reads a fixed number of samples (<= 75%); the last time, some data may be left in the FIFO. Ignore it; may have to use empty to read it completely.
• ADC ‘soc’ requires a narrow pulse
• fwr/ timing is to be met

Timing Diagram

[Figure: timing diagram of start, soc, eoc and fwr/.]

fwr/ Timing

• How to generate ‘fwr/’ timing ?


• Controller goes through many states ?
– Too many states
– Modification is difficult
• Counter
– Counter counts to the required value
– Counter output is decoded to give it to FSM
– FSM controls the reset (sometimes enable also) of the counter

Complete Block Schematic

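[Figure: complete block schematic. ADC, FIFO and host interface as in the block schematic. Controller: inputs clk, rst, start, eoc, wtim; outputs soc, fwr/, crst. A Counter clocked by cclk is reset by crst; its output is compared with the required value (= tim) to give wtim.]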

Complete Timing Diagram

[Figure: timing diagram of start, soc, eoc, fwr/, crst and wtim, with the ADC conversion time marked.]

Control Algorithm

• Upon reset, come to the init state. Wait for start. Initialize outputs (soc = 0, crst = 1, fwr/ = 1)
• Upon start, go to the next state. Make soc = 1
• Transit to the next state. Make soc = 0. Wait for eoc = 1
• Upon eoc = 1, transit to the next state. Start the counter (crst = 0), make fwr/ = 0. Wait for wtim = 1.
• Upon wtim = 1, go to the init state

State Diagram

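[Figure: state diagram.]
rst: enter S0
S0 (soc = 0, fwr/ = 1, crst = ‘1’): stays in S0 on start/, goes to S1 on start
S1 (soc = 1, fwr/ = 1, crst = ‘1’): goes to S2
S2 (soc = 0, fwr/ = 1, crst = ‘1’): stays in S2 on eoc/, goes to S3 on eoc
S3 (soc = 0, fwr/ = 0, crst = ‘0’): stays in S3 on wtim/, goes to S0 on wtim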

State Assignment, Flip-Flops etc

• Four states, Binary encoding


• Number of flip-flops: 2
• State Assignment: Sequential
• S0: 00, S1: 01, S2: 10, S3: 11
• Type of Flip-Flops: D Flip flops

Finite State Machine (FSM)

[Figure: 3-block FSM: next-state logic (Inputs and PS in, NS out), state register (D, CK, Q, AR) with Clock and Reset, and output logic producing Outputs.]

NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Output = f (PS, Inputs)

Next State Table

Inputs              Present State   Next State
start  eoc  wtim    Q1 Q0           D1 D0
0      X    X       0  0            0  0
1      X    X       0  0            0  1
X      X    X       0  1            1  0
X      0    X       1  0            1  0
X      1    X       1  0            1  1
X      X    0       1  1            1  1
X      X    1       1  1            0  0

Equations, Minimization:
D1 = f1 (Q1, Q0, start, eoc, wtim)
D0 = f0 (Q1, Q0, start, eoc, wtim)

Output Table

Present State   Outputs
Q1 Q0           soc  crst  fwr/
0  0            0    1     1
0  1            1    1     1
1  0            0    1     1
1  1            0    0     0

Equations, Minimization:
soc = f1 (Q1, Q0)
crst = f2 (Q1, Q0)
fwr/ = f3 (Q1, Q0)
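
The ADC controller's state transitions and Moore outputs can be coded directly, leaving state encoding and minimization to the tools. This is a sketch under assumptions: the entity name adc_ctrl, the port name fwr_n (standing for fwr/) and the active-high asynchronous reset are illustrative choices:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity adc_ctrl is
  port (
    clk, rst         : in  std_logic;
    start, eoc, wtim : in  std_logic;
    soc, crst, fwr_n : out std_logic   -- fwr_n stands for fwr/
  );
end entity adc_ctrl;

architecture rtl of adc_ctrl is
  type state_t is (S0, S1, S2, S3);
  signal ps, ns : state_t;             -- present state, next state
begin
  -- State register with asynchronous reset
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= S0;
    elsif rising_edge(clk) then
      ps <= ns;
    end if;
  end process;

  -- Next state logic: NS = f (PS, Inputs)
  process (ps, start, eoc, wtim)
  begin
    ns <= ps;                          -- default: stay in the same state
    case ps is
      when S0 => if start = '1' then ns <= S1; end if;
      when S1 => ns <= S2;
      when S2 => if eoc = '1' then ns <= S3; end if;
      when S3 => if wtim = '1' then ns <= S0; end if;
    end case;
  end process;

  -- Output logic (Moore): Outputs = f (PS), as in the output table
  soc   <= '1' when ps = S1 else '0';
  crst  <= '0' when ps = S3 else '1';
  fwr_n <= '0' when ps = S3 else '1';
end architecture rtl;
```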

Methodology

1. Specifications
2. Block schematic (Blocks, Signals)
– Data path, Controller(s)
3. System Timing Diagram
4. Sub-system Identification
5. Update Timing Diagram
6. Data path design (Various Levels)
7. Controller Algorithm
8. State Diagram

Methodology

9. State Diagram Optimization


10. State Assignment
11. Selection of Flip-flops
12. Next State Table, Equations, Minimization
13. Output Table, Equations, Minimization
14. Selection of Device Technology
15. Implementation
16. Test and Debug
17. Documentation

Methodology

• Steps 1-8: Designer


• Steps 9-13: Tool + Directives
• Steps 14-17: Tools + Designer

Power on Reset

[Figure: FSM with a synchronous reset (Sync Reset) as an input to the next-state logic and an asynchronous reset (Async Reset) on the AR input of the state register.]

• Use Asynchronous Reset,
• If not available, use synchronous Reset

FSM: Clock frequency

• Maximum Clock Frequency


– Delays of the blocks
• Max Clock frequency (Min Clock period)
Tclk(min) > Max ((tcq(max) + tNSL(max) + ts(max)), (tcq(max) + tOL(max)))
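
As a worked illustration with assumed delays (these values are not from the slides): tcq(max) = 2 ns, tNSL(max) = 6 ns, ts(max) = 1 ns, tOL(max) = 4 ns:

```latex
\begin{aligned}
T_{clk(min)} &> \max\bigl(t_{cq(max)} + t_{NSL(max)} + t_{s(max)},\; t_{cq(max)} + t_{OL(max)}\bigr) \\
             &= \max(2 + 6 + 1,\; 2 + 4) = \max(9,\; 6) = 9\ \text{ns} \\
f_{max} &< 1 / (9\ \text{ns}) \approx 111\ \text{MHz}
\end{aligned}
```

Here the register to register path through the next state logic sets the limit; with a slower output logic the register to output path could dominate instead.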

FSM: Minimum Clock frequency

[Figure: waveforms of CLK, the inputs IN1, IN2 and IN3, and a second clock CLK’.]

FSM: Minimum Clock frequency

• The minimum clock frequency should be greater than twice the maximum input clock frequency
• Sampling the inputs
• Inputs may not be periodic waveforms
• Pulse width shouldn’t be the criterion, since a pulse can be stretched; how fast to respond to the event should be the criterion.

Stretching the pulse

[Figure: waveforms of the input pulse IN, clock CLK1, the stretched pulse IN’ and clock CLK2.]

Timing Pulse Accuracy

[Figure: a timing pulse shown against two clocks, CLK1 and CLK2.]

• To detect a pulse with a certain accuracy, the clock period should be less than the error

Stretching the pulse

[Figure: waveforms of the input pulse IN, clock CLK1, the stretched pulse IN’ and clock CLK2.]

Pulse Stretcher
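[Figure: pulse stretcher with three D flip-flops. The first (catching) flip-flop is clocked by the input pulse det; the next two are clocked by clk and produce the stretched output sdet; rst drives the AR inputs. Waveforms: det, clk, sdet.]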

Pulse Stretcher

• The pulse acts as the clock to the catching flip-flop
• The next 2 flip-flops are clocked by the FSM clock to ensure a pulse of at least one clock period duration.
• Not very practical if you have to stretch the pulse for long

Pulse detection

• Pulse to level converter


• Level to Pulse converter

Pulse to toggle

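[Figure: pulse-to-toggle converter: a D flip-flop (with AR) clocked by the pulse P, whose output I toggles on each pulse.]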

Level to pulse

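[Figure: level-to-pulse converter: input I passes through three D flip-flops clocked by clk2, giving I1, I2 and I3. Waveforms of I2 and clk2, and of the three derived pulses:]

I2 . I3/
I2/ . I3
I2 xor I3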
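
A small VHDL sketch of the level-to-pulse converter, assuming the three flip-flop outputs are named i1, i2 and i3 as in the figure. The entity name level_to_pulse and the output names rise_p, fall_p and any_p are hypothetical:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity level_to_pulse is
  port (
    clk2   : in  std_logic;
    i      : in  std_logic;   -- level (or toggle) input from the other clock domain
    rise_p : out std_logic;   -- I2 . I3/  : one clk2 period pulse on a rising edge of I
    fall_p : out std_logic;   -- I2/ . I3  : one clk2 period pulse on a falling edge of I
    any_p  : out std_logic    -- I2 xor I3 : one clk2 period pulse on either edge of I
  );
end entity level_to_pulse;

architecture rtl of level_to_pulse is
  signal i1, i2, i3 : std_logic := '0';
begin
  -- Three flip-flops in series, all clocked by clk2
  process (clk2)
  begin
    if rising_edge(clk2) then
      i1 <= i;
      i2 <= i1;
      i3 <= i2;
    end if;
  end process;

  rise_p <= i2 and not i3;
  fall_p <= not i2 and i3;
  any_p  <= i2 xor i3;
end architecture rtl;
```

The pulses are taken from I2 and I3 rather than from I1, so the first flip-flop, which samples a signal that is asynchronous to clk2, never drives the output logic directly.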

Pulse Transfer

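[Figure: pulse transfer between clock domains: a flip-flop clocked by the pulse P produces the toggle signal I; three flip-flops clocked by clk2 produce I1, I2 and I3; an XOR gate on the synchronized signals regenerates the pulse in the clk2 domain.]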

Register to Register Path

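[Figure: source register (D, Q) feeding a combinational circuit (Comb) that feeds the destination register (D, Q); both registers are clocked by CLK.]

Tclk(min) = tco(max) + tcomb(max) + ts(max) + slack

tco(min) + tcomb(min) > th(max)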


Second Register clocked by CLK/?

• A somewhat naive question that is often asked
• We are minimizing the clock period by looking at the critical path delay
• If we play with positive and negative clock edges, we have to make sure alternate registers are clocked by CLK/.
• How can this hold when there is feedback from a positive edge triggered register to the control of a positive edge triggered register?
• What if data from a positive edge triggered register and from a negative edge triggered register go to the same register? How can this register be clocked?

Second Register clocked by CLK/?

• What about signals from various parts of the data path to a controller?
• How can this system of clocking be applied to an FSM?
• In a mixed scenario, what if the critical path appears between registers clocked by opposite clock edges?

Tclk(min) / 2 = tco(max) + tcomb(max) + ts(max) + slack

Tclk(min) = 2 * (tco(max) + tcomb(max) + ts(max) + slack)


Second Register clocked by CLK/?

• And introducing the delay of the inverter would mean introducing skew into some part of the clock distribution network, creating various timing problems
• In the most trivial case of an array of registers with combinational circuits sandwiched between them, it does not matter whether you use the same clock edge or alternate clock edges
• In the former case the clock frequency would be twice that of the latter.

Moore / Mealy Output


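[Figure: two state diagrams.]

Moore Output
rst: enter S0
S0 (soc = 0): stays in S0 on start/, goes to S1 on start
S1 (soc = 1): goes to S2
S2 (soc = 0)

Mealy Output
rst: enter S0
S0 (soc = start): stays in S0 on start/, goes to S2 on start
S2 (soc = 0)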

Moore / Mealy Output

[Figure: waveforms of clk and start. Moore: states S0, S0, S1, S2 and the resulting soc. Mealy: states S0, S0, S2 and the resulting soc.]

Mealy Output

• Comes earlier than the Moore output
• The number of states is smaller
• Output timing depends on input timing; Glitches
• Hence, ideal when the FSM and the other blocks are synchronous

FSM: Mealy Output

[Figure: an FSM with inputs i1, i2 and clk produces the Mealy outputs O1 and O2, each driving a synchronous sub-system.]

O1 and O2 can be Mealy as function of states and i1 and/or i2

CPU Level 2: Registers

[Figure: the CPU Level 1 block diagram (data bus, registers, ALU, PC, SP, controller and its control signals), repeated for reference.]

Control of Sequential Circuits

[Figure: an FSM / controller drives the enable en (e.g. RA_L) of a register, counter or other sequential circuit; both are clocked by clk.]

Clock Gating

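[Figure: clock gating. CLK’ = RA_L AND CLK drives CK of the 8-bit register; Q drives D7:0 through tri-state gates enabled by RA_E. Waveforms of CLK, RA_L and CLK’ show two active edges on CLK’, marked 1 and 2.]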

Clock Gating

• Two active clock edges
• In some cases, where the control signal registers data, edge 1 may not meet the minimum clock period constraint (i.e. it may not accommodate tco(max) + tcomb(max) + ts(max)), and edge 2 may be late, causing a hold time violation.
• In cases where the control signal is used to increment a counter, the counter may get incremented twice instead of once.

Re-circulating Buffer

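[Figure: re-circulating buffer. A 2-to-1 multiplexer selected by RA_L feeds the register D input with the new data (input 1) or the register's own output Q (input 0); the register is clocked directly by CLK and drives D7:0 through tri-state gates enabled by RA_E. Waveforms: CLK, RA_L.]

Register write on the clock edge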

Re-circulating Buffer

• Any number of control signals


• Different data paths
– e.g. Parallel data, shifted data etc.
• Priority of control signals

Counter with enable

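[Figure: counter with enable. A 2-to-1 multiplexer selected by en feeds the register input d with q + 1 (input 1) or q (input 0); the register has clk and reset (rst) inputs and output q (count).]

• ‘en’ comes from the FSM; when en = 1 the counter counts, otherwise it retains its value.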

VHDL Code

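-- q is assumed to be declared as unsigned (ieee.numeric_std)
process (clk, rst)
begin
  if (rst = '1') then q <= (others => '0');   -- asynchronous reset
  elsif (clk'event and clk = '1') then        -- rising clock edge
    if (en = '1') then q <= q + 1;            -- count only when enabled
    end if;                                   -- otherwise q retains its value
  end if;
end process;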

Counter with enable and load


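[Figure: counter with enable and load. Multiplexers feed the register input d with din when load = 1, with q + 1 when en = 1, and with q otherwise; the register has clk and rst inputs and output q (count).]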

VHDL Code

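-- q and din are assumed to be declared as unsigned (ieee.numeric_std)
process (clk, rst)
begin
  if (rst = '1') then q <= (others => '0');   -- asynchronous reset
  elsif (clk'event and clk = '1') then        -- rising clock edge
    if (load = '1') then q <= din;            -- load has priority over en
    elsif (en = '1') then q <= q + 1;         -- count only when enabled
    end if;                                   -- otherwise q retains its value
  end if;
end process;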

Re-circulating Buffer

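[Figure: the re-circulating buffer (multiplexer selected by RA_L in front of a register clocked directly by CLK, tri-state outputs enabled by RA_E), repeated. Waveforms: CLK, RA_L.]

Register write on the clock edge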

Clock Gating

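[Figure: the clock gating circuit (CLK’ = RA_L AND CLK driving CK), repeated. Waveforms of CLK, RA_L and CLK’ with the two active edges on CLK’ marked 1 and 2.]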

Clock Gating

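[Figure: the clock gating circuit (CLK’ = RA_L AND CLK driving CK), repeated. Waveforms of CLK, RA_L and CLK’ with the two active edges on CLK’ marked 1 and 2.]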

Clock Gating for Low Power

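[Figure: clock gating for low power. RA_L is resynchronized by a flip-flop clocked on the opposite edge of CLK, giving CLK1; CLK1 gated with CLK gives CLK2, which clocks the 8-bit register driving D7:0 through tri-state gates enabled by RA_E. Waveforms: CLK, RA_L, CLK1, CLK2.]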

Clock Gating for Low Power

• Here, the requirement is to have a clock pulse whose active clock edge matches the trailing edge of the original control signal
• This can be achieved if the original control signal is delayed by half a clock period and then gated with the original clock. But delaying by adding combinational logic delays would not be precise, and also would not allow flexibility in changing the clock frequency.
• Hence, the original control signal is resynchronized with the negative (opposite) clock edge, and this resynchronized signal is gated with the clock to generate the control signal

Finite State Machine (FSM)

[Figure: 3-block FSM: next-state logic (Inputs and PS in, NS out), state register (D, CK, Q, AR) with Clock and Reset, and output logic producing Outputs.]

Finite State Machine (FSM)

[Figure: 2-block FSM: a single logic block (Inputs and PS in; NS and Outputs out) and the state register (D, CK, Q, AR) with Clock and Reset.]

State Diagram Optimization

• States Si and Sj are equivalent, if
– For the same input conditions, both states transit to the same next states (i.e. the number of transitions and the conditions for each transition should match)
– For the same input conditions, both states produce the same outputs (for Moore outputs, input conditions do not matter)

State Diagram Optimization

• If Si and Sj are equivalent, one is redundant


• The rule is applicable to more than two states
• The first condition can be detected by examining the rows
(identical rows except the present state) of next state table.
• The second condition can be detected by examining the rows
(identical rows except the present state) of output table. (For
Moore outputs ignore inputs)

State Diagram Optimization

• In the Next State Table look for rows with the same next states. Then, out of these, select the present states for which the input conditions are the same.
• Or, look for the same next states with the same input conditions in one shot
• Now, for these states, check whether the outputs (for Mealy, both input conditions and outputs) are the same
• Select the states where the outputs (with inputs for Mealy) are the same
• These states are equivalent

Output Races (Glitches)

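[Figure: a state change between 000 and 011, in which two flip-flops change. The machine momentarily passes through 001 or 010, depending on whether tcq0 > tcq1 or tcq0 < tcq1. Outputs wr = 1 and en = 1 are marked on the diagram; waveforms of clk, en and wr.]

Note: Glitch could occur either on ‘en’ or ‘wr’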

Output Races (Glitches)

• When more than one flip-flop changes its output during a state change, the machine could momentarily pass through transitory states owing to variation in tcq. If these states produce outputs (different from those of the source and destination states), glitches could occur on these outputs.
• If these outputs are used for synchronous control, this may not affect the circuit. But if they are used in an asynchronous circuit, like a RAM write, it can cause problems.

Output Races: Solution

• Do the state assignment such that no more than one flip-flop changes its output at a time (Gray Code)
• May not be possible
• Do the state assignment such that the transitory states are ones that don’t produce any outputs.
• Register the outputs. (Outputs are registered on the next clock edge, well after the state change.) Latency of one clock.

Kuruvilla Varghese

63
Output Registering 127
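[Figure: FSM block diagram in which the outputs of the Output Logic pass through output flip-flops (sharing Clock and Reset with the state flip-flops) to give the registered outputs; timing diagram of clk, en, ld and their registered versions en(R), ld(R), which carry the valid output one clock later.]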

Output Registering 128

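[Figure: two-block view of the same structure: a single Logic block driven by Inputs and PS produces NS and the Outputs; the state flip-flops register NS to PS and a register on the outputs gives the registered outputs; common Clock and Reset.]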
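The output registering scheme above can be sketched in VHDL as follows. This is a minimal illustration, not code from the slides: the entity name regout_fsm, the input go, the states idle/busy and the outputs en_r/wr_r are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-state FSM used only to illustrate output registering.
entity regout_fsm is
  port (clk, rst, go : in  std_logic;
        en_r, wr_r   : out std_logic);  -- registered outputs
end entity regout_fsm;

architecture rtl of regout_fsm is
  type statetype is (idle, busy);
  signal pr_state, nx_state : statetype;
  signal en, wr : std_logic;  -- combinational outputs, may glitch
begin
  -- Next state logic and output logic (combinational)
  nsl_ol: process (pr_state, go)
  begin
    case pr_state is
      when idle =>
        en <= '0'; wr <= '0';
        if (go = '1') then nx_state <= busy;
        else nx_state <= idle;
        end if;
      when busy =>
        en <= '1'; wr <= '1';
        nx_state <= idle;
    end case;
  end process nsl_ol;

  -- State flip-flops and output flip-flops share clock and reset
  regs: process (clk, rst)
  begin
    if (rst = '1') then
      pr_state <= idle;
      en_r <= '0'; wr_r <= '0';
    elsif (clk'event and clk = '1') then
      pr_state <= nx_state;
      en_r <= en;  -- sampled on the clock edge after the state change
      wr_r <= wr;
    end if;
  end process regs;
end architecture rtl;
```

Because en and wr are decoded from the present state and sampled only at the next clock edge, any transitory values have settled by then, at the cost of one clock of latency.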

Selection of Flip-Flops 129

PS  NS  |  D  |  J  K  |  T
0   0   |  0  |  0  X  |  0
0   1   |  1  |  1  X  |  1
1   0   |  0  |  X  1  |  1
1   1   |  1  |  X  0  |  0

In the case of a JK flip-flop, the NSL has twice the number of outputs
compared to D or T, but because of the don’t cares it may result in
less logic for next state decoding. CPLDs and FPGAs have flip-flops
that can be used as D or T flip-flops.

State Assignment 130

• Number of states = s
• Number of flip-flops = n = ⌈log2 s⌉
• Number of possible ways to do the state assignment?
P(2^n, s)
• e.g. s = 17, n = 5 (Minimize Area)
P(2^n, s) = 32! / (32 - 17)!
= 32 x 31 x … x 18 x 17 x 16 ≈ 2.0 x 10^23
• NSL Minimization
• OL Minimization
NSL Optimization 131

• Our aim is to do the state assignment such that NSL is
minimized (minimum area)
• As we have seen, a search through all possible assignments
to find the one that results in the minimum area is
almost impossible in terms of computation time
• Hence, we look for a heuristic solution, where we
follow some sensible rules that would result in a near-optimal
solution

NSL Optimization 132

• Where to look for insight?
– Next state table
– We think of minterm grouping (as in a Karnaugh map)
• The aim is to look for the same next states (since these state bits are
the outputs of the NSL, and that is what we want to minimize)
• For these bits we would like the minterms to be adjacent
• Minterms consist of input conditions and present state
• So let us look for the same next state under the same input conditions, and
make the present states adjacent, so that present-state bits are removed while grouping
• This can be done in groups that are powers of 2
NSL Minimization 133

Inputs      Present State   Next State
I1 I2 I3    Q2 Q1 Q0        D2 D1 D0
1  1  0     0  1  0         1  0  1
1  1  0     0  1  1         1  0  1

NSL Minimization 134

• Under the same input conditions, if states Si and Sj transition
to the same next state Sk, make the state assignment such that
states Si and Sj are logically adjacent.
• Applicable to more than two states (by powers of 2)
• Note: States Si and Sj may transition to different next states
under different input conditions. But if both transition to the
same next state under even one set of input conditions, that is good
enough reason to make them adjacent, as the equations for that
next state get minimized.
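As a worked illustration of this rule, take the two rows of the NSL Minimization table (inputs I1 I2 I3 = 110, present states 010 and 011, next state 101). Because the two present states differ only in Q0, the two minterms they contribute to D2 (and likewise to D0) combine and Q0 drops out:

```latex
\begin{align*}
D_2 \;\ni\;\; & I_1 I_2 \overline{I_3}\;\overline{Q_2}\, Q_1 \overline{Q_0}
        \;+\; I_1 I_2 \overline{I_3}\;\overline{Q_2}\, Q_1 Q_0 \\
  =\;\; & I_1 I_2 \overline{I_3}\;\overline{Q_2}\, Q_1 \left(\overline{Q_0} + Q_0\right) \\
  =\;\; & I_1 I_2 \overline{I_3}\;\overline{Q_2}\, Q_1
\end{align*}
```

Had the two present states been assigned non-adjacent codes, the two product terms could not have been merged in this way.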
OL Minimization 135

Inputs      Present State   Outputs
I1 I2 I3    Q2 Q1 Q0        O2 O1 O0
1  1  0     0  1  0         1  0  1
1  1  0     0  1  1         1  0  1

OL Minimization 136

• (Under the same input conditions), if states Si and Sj
produce the same outputs, make the state assignment such
that states Si and Sj are logically adjacent.
• Applicable to more than two states (by powers of 2)
• Note: States Si and Sj may produce different outputs
(under different input conditions). But if both produce
even one output the same (under the same input conditions), that is
good enough reason to make them adjacent, as the equations for that
output get minimized.
Fault Tolerance: Unused States 137

• Number of states = s
• Number of flip-flops = n = ⌈log2 s⌉
• Unused states = 2^n – s

• s = 5, n = 3
• Unused states = 2^3 – 5 = 3

Unused States 138

[Figure: state diagram using states 000, 100, 001, 011 and 010, shown alongside the unused states 101, 110 and 111.]
Unused States 139

• What happens if the FSM gets into these states?
– It could get stuck there
– It could loop through some or all of the unused states
– It could get back to a valid/used state.
• What if these states produce some outputs?
• Under what conditions does each of the above happen?

NSL code: Unused states as Don’t cares 140

process (pr_state, i1, i2, ...)
begin
  case pr_state is
    when S0 => ...
    when S1 => ...

    when others => nx_state <= "---";
  end case;
end process;
Unused States 141

Inputs      Present State   Next State
I1 I2 I3    Q2 Q1 Q0        D2    D1    D0
1  x  x     0  0  0         0     0     1
x  x  1     1  0  0         0     0     0
x  x  x     1  0  1         X(1)  X(0)  X(1)

Resulting transition: 101 → 101

Unused States 142

Inputs      Present State   Next State
I1 I2 I3    Q2 Q1 Q0        D2    D1    D0
1  x  x     0  0  0         0     0     1
x  x  1     1  0  0         0     0     0
x  x  x     1  0  1         X(1)  X(1)  X(0)
x  x  x     1  1  0         X(1)  X(1)  X(1)
x  x  x     1  1  1         X(1)  X(0)  X(1)

Resulting transitions: 101 → 110 → 111 → 101
Unused States: Fault Tolerance 143

[Figure: the same state diagram (states 000, 100, 001, 011, 010) with transitions added from the unused states 101, 110 and 111.]

NSL Block coding: Fault tolerance 144

process (pr_state, i1, i2, ...)
begin
  case pr_state is
    when S0 => ...
    when S1 => ...

    when others => nx_state <= S0;
  end case;
end process;
Unused states 145

• Introduce transitions from unused states to a safe state.
• The safe state could be the Init state.
• The safe state depends on the application, and could be
something other than the Init state.

Bring back to last state? 146

• One can think of redundancy for the present state, but if
there is a mismatch, deciding which copy is correct may
require majority voting
• Another way is to use error correcting codes (Hamming
codes) for states, and use maximum likelihood decoding to
decide on the correct state in case of a wrong transition
Finite State Machine (FSM) 147

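[Figure: FSM block diagram. Inputs and PS feed the Next State Logic; NS is registered by the state flip-flops (Clock, Reset) to give PS; PS feeds the Output Logic, which produces the Outputs.]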

FSM: Output Delay 148

• Output delay: tcq + tOL
• How to reduce the output delay?
– Decode the output from the next state; it then has to be registered
to coincide with the state change (i.e. next state to present state)
– Encode the outputs in state bits (state variables)
Output decoding from Next State 149

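[Figure: FSM in which the Output Logic is fed from NS (the Next State Logic output) and its result is registered by output flip-flops to give the Outputs; the state flip-flops register NS to PS; common Clock and Reset.]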

Output delay = tcq


Critical Path delay = tcq + tNSL + tOL + ts (*)


Output decoding from Next State 150

[Figure: the standard FSM block diagram (Next State Logic, state flip-flops, Output Logic fed from PS), shown for comparison.]

• In the normal case, we would have chosen the clock period as
tcq + tNSL + ts, with some margin
Output decoding from Next State 151

• In the earlier case, measured from the previous clock edge, the total
delay to the output would be tcq + tNSL + ts + tcq + tOL
• In the case of decoding the output from the next state, the delay would
be tcq + tNSL + tOL + ts + tcq
• But, since NSL and OL are together, the synthesis tool can
optimize (minimize) them as one block, which may result in less
delay than tNSL + tOL
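Decoding the output from the next state can be sketched in VHDL as below. This is an illustrative sketch only: the entity name nsdec_fsm, the input go, the states idle/busy and the output en are hypothetical and do not come from the slides.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical FSM whose output is decoded from the NEXT state and registered.
entity nsdec_fsm is
  port (clk, rst, go : in  std_logic;
        en           : out std_logic);
end entity nsdec_fsm;

architecture rtl of nsdec_fsm is
  type statetype is (idle, busy);
  signal pr_state, nx_state : statetype;
  signal en_next : std_logic;
begin
  -- Next state logic
  nsl: process (pr_state, go)
  begin
    case pr_state is
      when idle =>
        if (go = '1') then nx_state <= busy;
        else nx_state <= idle;
        end if;
      when busy =>
        nx_state <= idle;
    end case;
  end process nsl;

  -- Output logic works on nx_state, so it sits in series with the NSL
  en_next <= '1' when nx_state = busy else '0';

  -- State and output flip-flops update on the same clock edge,
  -- so en appears tcq after the edge, together with the new state
  regs: process (clk, rst)
  begin
    if (rst = '1') then
      pr_state <= idle;
      en <= '0';
    elsif (clk'event and clk = '1') then
      pr_state <= nx_state;
      en <= en_next;
    end if;
  end process regs;
end architecture rtl;
```

The register-to-register path now runs through both the NSL and the OL, which is the tcq + tNSL + tOL + ts critical path noted above.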


Encoding Output in state bits 152

[Figure: FSM with only Next State Logic and state flip-flops (Clock, Reset); the Outputs are taken directly from the state bits (PS).]

Output delay = tcq
Encoding Output in state bits 153

States   Outputs
         WR/   EN
S0       0     1
S1       1     0
S2       1     1
S3       0     0
         Q1    Q0

Since the output patterns are unique and equal in number to the states,
the state variables can be used as outputs

Encoding Output in state bits 154

States   Outputs
         WR/   EN
S0       0     1
S1       1     0
S2       1     1
S3       1     0
         Q1    Q0

For states S1 and S3 the outputs are the same, and hence one extra bit is
needed for the state variables.
Encoding Output in state bits 155

States   Outputs       Extra bit
         WR/   EN
S0       0     1       0
S1       1     0       0
S2       1     1       0
S3       1     0       1
         Q2    Q1      Q0

Adding the extra bit makes each pattern unique, and the state variables can
be used as outputs.

Encoding Output in state bits 156

States   Outputs                      Extra bits
         Adr_1  Adr_0  WR/   EN
S0       0      0      1     0        0   0
S1       0      1      0     1        0   0
S2       0      0      1     0        0   1
S3       0      1      0     1        0   1
S4       0      0      1     0        1   0
S5       1      0      0     1        0   0
S6       1      1      0     1        0   0
         Q5     Q4     Q3    Q2       Q1  Q0
Encoding Output in state bits 157

• Identify the states with the same output values,
• From these, identify the output pattern that repeats the
maximum number of times,
• Add additional bits to make these output patterns distinct.
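A minimal VHDL sketch of this technique, using the four-state encoding with one extra bit from slide 155 (Q2 = WR/, Q1 = EN, Q0 = extra bit). The entity name outenc_fsm, the port names wr_n and en, and the simple S0 → S1 → S2 → S3 → S0 sequence are assumptions made only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical FSM whose state bits double as the outputs.
entity outenc_fsm is
  port (clk, rst : in  std_logic;
        wr_n, en : out std_logic);
end entity outenc_fsm;

architecture rtl of outenc_fsm is
  -- State codes: bit 2 = WR/, bit 1 = EN, bit 0 = extra bit
  constant S0 : std_logic_vector(2 downto 0) := "010";
  constant S1 : std_logic_vector(2 downto 0) := "100";
  constant S2 : std_logic_vector(2 downto 0) := "110";
  constant S3 : std_logic_vector(2 downto 0) := "101";
  signal pr_state, nx_state : std_logic_vector(2 downto 0);
begin
  -- Next state logic (assumed sequence, for illustration only)
  nsl: process (pr_state)
  begin
    case pr_state is
      when S0 => nx_state <= S1;
      when S1 => nx_state <= S2;
      when S2 => nx_state <= S3;
      when S3 => nx_state <= S0;
      when others => nx_state <= S0;  -- unused codes go to a safe state
    end case;
  end process nsl;

  -- State flip-flops
  ff: process (clk, rst)
  begin
    if (rst = '1') then
      pr_state <= S0;
    elsif (clk'event and clk = '1') then
      pr_state <= nx_state;
    end if;
  end process ff;

  -- No output logic: outputs are wired straight to state bits (delay = tcq)
  wr_n <= pr_state(2);
  en   <= pr_state(1);
end architecture rtl;
```

Declaring the state as a std_logic_vector with explicit constants, rather than an enumerated type, is what keeps the chosen encoding under the designer's control here.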


Metastability in edge triggered Flip-Flop 158

[Figure: D flip-flop symbol (D, CLK, Q) and timing diagram of CLK, D and Q marking ts, th and tco.]

ts: Setup time: minimum time the input must be valid before
the active clock edge
th: Hold time: minimum time the input must be valid after the
active clock edge
tco: Propagation delay for the input to appear at the output
from the active clock edge
Metastability 159

• If setup/hold time is violated, the flip-flop can sample its input
wrongly, i.e. the output could be ‘1’ or ‘0’ (it may remain at the
previous value or transition to the new value).
• The output may take a long time to resolve if the input
changes close to the clock edge
• In the worst case, the output can get stuck in between valid logic
levels, and can remain in such a state for an indeterminate
amount of time

Datapath / Sequential Circuits 160

• We build data paths and controllers (FSMs) using flip-flops
(registers) and combinational circuits
• We make sure that in register-to-register paths, setup time is
met by choosing a proper clock period
• We also analyze the conditions for hold time violation and
avoid the violation
• In all these, the skew of the clock is also considered for worst
case design.
Datapath 161

[Figure: two D flip-flops on a common clk with a combinational block (Comb) between them.]

Min Clock period / Max frequency
Tclk(min) > tco(max) + tcomb(max) + ts(max)

Avoid Hold time violation
tco(min) + tcomb(min) > th(max)
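As a worked illustration of the two conditions, assume the following values (chosen purely for illustration, they are not from the slides): tco(max) = 2 ns, tcomb(max) = 5 ns, ts(max) = 1 ns, tco(min) = 1 ns, tcomb(min) = 0.5 ns and a worst-case hold time of 0.5 ns.

```latex
\begin{align*}
T_{clk(min)} &> t_{co(max)} + t_{comb(max)} + t_{s(max)} = 2 + 5 + 1 = 8\ \text{ns}
  \quad\Rightarrow\quad f_{max} < \frac{1}{8\ \text{ns}} = 125\ \text{MHz} \\
t_{co(min)} + t_{comb(min)} &= 1 + 0.5 = 1.5\ \text{ns} \;>\; t_{h} = 0.5\ \text{ns}
  \quad\text{(no hold time violation)}
\end{align*}
```

Note that the setup check uses maximum delays and depends on the clock period, while the hold check uses minimum delays and is independent of the clock period.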


Sequential Circuit: FSM 162

[Figure: FSM as a Logic block (inputs, PS in; NS, outputs out) and Flip Flops (clock, reset) registering NS to PS.]

Min Clock period / Max frequency
Tclk(min) > tco(max) + tcomb(max) + ts(max)

Avoid Hold time violation
tco(min) + tcomb(min) > th(max)
Metastability in Sequential Circuits 163

• We take care of setup time and hold time violations in
register-to-register paths in sequential circuits
• When can metastability happen in datapath / sequential
circuits?

[Figure: two D flip-flops on a common clk with a combinational block (Comb) between them.]

Asynchronous Inputs 164

[Figure: external inputs feeding a register-to-register datapath (top) and the Logic block of an FSM (bottom).]
Asynchronous Inputs 165

• Asynchronous inputs to a sequential circuit can cause
metastability in flip-flops.
• Asynchronous inputs:
– Outputs from flip-flops working on a different clock
– Outputs generated by some process not synchronized to the
sequential circuit clock
• How to solve the problem?
• Synchronize the input to the sequential circuit clock, using a
synchronizer

Single Stage Synchronizer 166

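[Figure: asynchronous input ainp passes through one synchronizing D flip-flop (Synchronizer) before the combinational logic and flip-flop of the sequential circuit, all on clk; timing diagram of CLK, D and Q marking tco.]

tclk > tco + tcomb + tsetup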
Single Stage Synchronizer 167

• The asynchronous input is synchronized to the active clock edge and
appears after delay tcq; if the clock period is chosen properly, it will
meet the setup time at the next clock edge.
• Latency of one clock cycle for the flip-flops to sample the
synchronized input.
• The synchronizing flip-flop samples the data at one point; system FFs
sample the data at different points, owing to differences in path
delays.
• The synchronizing flip-flop can go into metastability, but the problem is
isolated to the synchronizing flip-flop.

Single Stage Synchronizer 168

• But if it comes out of metastability by the next clock edge (with
margin), it is fine.
• Also, we assume the input remains valid for one more clock cycle, so that
it is correctly sampled and captured by the synchronizing flip-flop
at the next clock.
Single Stage Synchronizer 169

[Figure: single stage synchronizer: ainp through one synchronizing D flip-flop into the sequential circuit (Comb and flip-flop), all on clk.]

• The probability of a flip-flop remaining in metastability decreases
exponentially with time.

Metastability Resolution Time 170

• Metastability resolution time (tr) for a single stage synchronizer
tr = tclk – tcomb – ts
• How to increase tr?
• Can we make tcomb zero?
• Double stage synchronizer
Double Stage Synchronizer 171

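[Figure: double stage synchronizer: ainp passes through two cascaded D flip-flops (Synchronizer) before the combinational logic and flip-flop of the sequential circuit, all on clk.]

• tr = tclk – ts
• Latency of two clock periods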
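A double stage synchronizer can be sketched in VHDL as two back-to-back flip-flops with no logic in between. The entity name sync2 and the signal names meta and sinp are hypothetical; ainp and clk follow the figure.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two flip-flop synchronizer for one asynchronous input.
entity sync2 is
  port (clk, rst : in  std_logic;
        ainp     : in  std_logic;   -- asynchronous input
        sinp     : out std_logic);  -- synchronized to clk
end entity sync2;

architecture rtl of sync2 is
  signal meta : std_logic;  -- first stage output, may go metastable
begin
  process (clk, rst)
  begin
    if (rst = '1') then
      meta <= '0';
      sinp <= '0';
    elsif (clk'event and clk = '1') then
      meta <= ainp;  -- first stage samples the asynchronous input
      sinp <= meta;  -- second stage gives the first tclk - ts to resolve
    end if;
  end process;
end architecture rtl;
```

Since there is no combinational logic between the two stages, tcomb is zero on that path, which is where the tr = tclk – ts figure comes from.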


Further increasing tr 172

• What if tr is to be increased further?
• We would then need to increase tclk, but then the system throughput
would come down
• So, let us keep the system clock at the same frequency and reduce the
clock frequency of the synchronizer, by dividing the system clock
• Multiple cycle synchronizer
Multiple Cycle Synchronizer 173

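[Figure: multiple cycle synchronizer: ainp passes through two synchronizing D flip-flops clocked by the output of a Mod-n counter driven from clk, followed by the sequential circuit (Comb and flip-flop) on clk.]

• tr = n . tclk – ts
• Latency = 2 . n . tclk
• n = 2 or 3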

Multi-cycle Synchronizer with de-skew FF 174


[Figure: multi-cycle synchronizer (two D flip-flops clocked from a Mod-n counter) followed by a de-skew flip-flop clocked by clk, then the sequential circuit (Comb and flip-flop) on clk.]

• The double stage synchronizer output will have an additional skew equal
to the Mod-n counter output delay, resulting in a larger clock period.
Tclk – tskew > tcq + tcomb + ts
• This is de-skewed by the de-skew flip-flop
Tclk – tskew > tcq + ts
Cascaded Synchronizer 175

[Figure: cascaded synchronizer: ainp passes through three cascaded D flip-flops on clk before the sequential circuit (Comb and flip-flop).]

• The probability of a metastable output reduces multiplicatively at each
stage of the cascaded synchronizer.

Asynchronous Inputs to FSM 176

[Figure: two state diagrams with transitions labelled EN and EN/, one using state codes 00, 01, 11 and the other 00, 10, 01.]

• If EN is an asynchronous input, state transitions can go to an
unintended state
• Solution: Gray code assignment
Asynchronous Inputs to FSM 177

[Figure: two state diagrams with transitions labelled EN and EN/, using state codes 00, 01, 11 and 00, 10, 01.]

• Don’t branch to more than one state on an asynchronous input.
• Always use a Go / No-Go structure with Gray code assignment
• Even better, synchronize all asynchronous inputs

Reset Recovery Time (tREC, tRR) 178

[Figure: D flip-flop with clock input CK and asynchronous reset input AR, driven by clock and reset.]

• Reset recovery time is the minimum amount of time between the
de-assertion of reset and the next rising clock edge, for the
flip-flop to sample its ‘D’ input properly.
Reset Recovery Time (tREC, tRR) 179

• One way to meet tRR is to synchronize the asynchronous reset input to
the flip-flop; the reset behaviour would then be the same as that of a
synchronous reset, as the reset happens at the clock edge.
• To retain the asynchronous reset behaviour, only the trailing edge
of the asynchronous reset should be synchronized.

Asynchronous Reset 180

[Figure: reset synchronizer: rst, with POR, and two cascaded D flip-flops clocked by clk; the output goes to the asynchronous resets of the FFs.]
Asynchronous Reset 181

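[Figure: the same reset synchronizer for an active-low reset rst/, with POR and two cascaded D flip-flops clocked by clk; the output goes to the asynchronous resets of the FFs.]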
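One common way to synchronize only the trailing edge of an asynchronous reset is sketched below. This is an assumed arrangement for an active-high reset and may differ in detail from the circuit drawn on the slides; the names rst_sync, arst, r1, r2 and rst_out are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical reset synchronizer: asynchronous assertion,
-- synchronous de-assertion.
entity rst_sync is
  port (clk     : in  std_logic;
        arst    : in  std_logic;   -- asynchronous reset request, active high
        rst_out : out std_logic);  -- to the asynchronous resets of the FFs
end entity rst_sync;

architecture rtl of rst_sync is
  signal r1, r2 : std_logic;
begin
  process (clk, arst)
  begin
    if (arst = '1') then
      -- Assertion takes effect immediately, without waiting for a clock
      r1 <= '1';
      r2 <= '1';
    elsif (clk'event and clk = '1') then
      -- De-assertion ripples through two flip-flops, so rst_out
      -- is released just after a clock edge
      r1 <= '0';
      r2 <= r1;
    end if;
  end process;

  rst_out <= r2;
end architecture rtl;
```

Because rst_out is released shortly after a clock edge, the downstream flip-flops get nearly a full clock period before their next active edge, which is how the recovery time is met.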


182

Digital System Design with PLDs and FPGAs


Case Study

Multiplier: Algorithm 183

Multiplicand                  1 0 1 1  x
Multiplier                    1 1 0 1
                      ------------------
Partial products              1 0 1 1
                            0 0 0 0
                          1 0 1 1
                        1 0 1 1
              -----------------------------
                        1 0 0 0 1 1 1 1
              -----------------------------

8-bit Multiplier: Issues 184

• Algorithm: Shift and Add


• 8 partial products – 7 Adders ?
– Accumulator
– Shift Accumulator right
• Resource sharing
– Multiplier & LSB of result
• Add and Shift in one clock cycle
• If multiplier bit is ‘0’ re-circulate result with shift
Resources 185

• Multiplicand Register, 8 bit


• Result Register (Multiplier), 16 bit
• 9 bit Adder
• 9 bit, 2 to 1 Multiplexer
• Bit counter (Mod-8 / 3-bit)
• Controller (FSM)


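Multiplier: Data Path 186

[Figure: multiplier data path. MCND REG (clk, rst, load) takes mc7:0 and holds md7:0; ADD adds md7:0 and r15:8 to give su8:0; a 2-to-1 multiplexer controlled by sel chooses between 0 & r15:8 (input 0) and su8:0 (input 1) to give s8:0; H. PROD REG (clk, prst, shift) takes s8:1 and holds r15:8; L. PROD / MULT REG (clk, rst, load, shift) takes ml7:0 on load and s0 on shift, and holds r7:0.]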
Counter 187

[Figure: Counter block (clk, prst, shift) producing count2:0, followed by a Decoder producing max.]

Controller 188

[Figure: Controller block with inputs clk, rst, start, r(0), max and outputs prst, load, shift, sel, done.]
Multiplicand Register (MCND) 189

[Figure: register with D input mc, Q output md, enable load, clock clk and asynchronous reset rst.]

mcndreg: process (clk, rst)
begin
  if (rst = '1') then
    md <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (load = '1') then
      md <= mc;
    end if;
  end if;
end process mcndreg;

H Product Register 190

[Figure: register with D input s8:1, Q output r15:8, enable shift, clock clk and asynchronous reset prst.]

hprodreg: process (clk, prst)
begin
  if (prst = '1') then
    r(15 downto 8) <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (shift = '1') then
      r(15 downto 8) <= s(8 downto 1);
    end if;
  end if;
end process hprodreg;
L. PRODUCT / MULT Register 191

[Figure: register r7:0 whose D input is ml7:0 on load, s0 & r7:1 on shift, and the held value r7:0 otherwise; clock clk and asynchronous reset rst.]

mulreg: process (rst, clk)
begin
  if (rst = '1') then
    r(7 downto 0) <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (load = '1') then
      r(7 downto 0) <= ml;
    elsif (shift = '1') then
      r(7 downto 0) <= s(0) & r(7 downto 1);
    end if;
  end if;
end process mulreg;

Counter 192

[Figure: count register fed by a +1 incrementer, with enable shift, clock clk and asynchronous reset prst.]

-- Counter
counter: process (clk, prst)
begin
  if (prst = '1') then
    count <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (shift = '1') then
      count <= count + 1;
    end if;
  end if;
end process;
FSM: 3 Blocks view 193

[Figure: three-block FSM: Next State Logic (Inputs, PS → NS), state flip-flops (Clock, Reset; NS → PS) and Output Logic (→ Outputs).]

NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Outputs = f (PS, Inputs)

FSM / Controller: 2 Blocks view 194

[Figure: two-block FSM: a single Logic block (Inputs, PS → NS, Outputs) and state flip-flops (Clock, Reset; NS → PS).]
State Diagram 195
[State diagram, listed by state]
power_on → S0
S0: prst = 1, shift = 0, load = 0, sel = 0, done = 0
    start/ → S0, start → S1
S1: prst = 1, shift = 0, load = 1, sel = 0, done = 0
    → S2
S2: prst = 0, shift = 1, load = 0, sel = r(0), done = 0
    max/ → S2, max → S3
S3: prst = 0, shift = 0, load = 0, sel = 0, done = 1
    start/ → S3, start → S1

Multiplier: Data Path 196

[Figure: multiplier data path, as on slide 186: MCND REG, ADD, 2-to-1 multiplexer (sel), H. PROD REG and L. PROD / MULT REG.]
Multiplier: VHDL Code 197

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity mult8 is port
  (clk, rst, start: in std_logic;
   done: out std_logic;
   mc, ml: in std_logic_vector(7 downto 0);
   prod: out std_logic_vector(15 downto 0));
end entity;

architecture arch_mult8 of mult8 is
  type statetype is (s0, s1, s2, s3);
  signal pr_state, nx_state: statetype;
  signal prst, max, load: std_logic;
  signal sel, shift: std_logic;
  signal md: std_logic_vector(7 downto 0);
  signal su, s: std_logic_vector(8 downto 0);
  signal count: std_logic_vector(2 downto 0);
  signal r: std_logic_vector(15 downto 0);
begin

Multiplier: VHDL Code 198

  -- Multiplicand Register
  mcndreg: process (clk, rst)
  begin
    if (rst = '1') then md <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (load = '1') then
        md <= mc;
      end if;
    end if;
  end process mcndreg;

  -- Multiplier cum Lower Product Register
  mulreg: process (rst, clk)
  begin
    if (rst = '1') then
      r(7 downto 0) <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (load = '1') then
        r(7 downto 0) <= ml;
      elsif (shift = '1') then
        r(7 downto 0) <= s(0) & r(7 downto 1);
      end if;
    end if;
  end process mulreg;
Multiplier: VHDL Code 199

  -- 9 bit 2-to-1 Multiplexer
  s(8 downto 0) <= '0' & r(15 downto 8)
      when sel = '0' else su(8 downto 0);

  -- Higher Product Register
  hprodreg: process (clk, prst)
  begin
    if (prst = '1') then
      r(15 downto 8) <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (shift = '1') then
        r(15 downto 8) <= s(8 downto 1);
      end if;
    end if;
  end process hprodreg;

  -- prod output
  prod <= r;

  -- Counter
  counter: process (clk, prst)
  begin
    if (prst = '1') then
      count <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (shift = '1') then
        count <= count + 1;
      end if;
    end if;
  end process;

  -- Max decoder
  max <= '1' when (count = 7) else '0';

Multiplier: VHDL Code 200

  -- Adder
  su <= ('0' & md) + ('0' & r(15 downto 8));

  -- FSM, Next state Logic, Output Logic
  connsl: process (pr_state, start, r(0), max)
  begin
    case pr_state is
      when s0 =>
        prst <= '1'; load <= '0'; shift <= '0';
        sel <= '0'; done <= '0';
        if (start = '1') then nx_state <= s1;
        else nx_state <= s0;
        end if;
      when s1 =>
        prst <= '1'; load <= '1'; shift <= '0';
        sel <= '0'; done <= '0';
        nx_state <= s2;
      when s2 =>
        prst <= '0'; load <= '0'; shift <= '1';
        sel <= r(0); done <= '0';
        if (max = '1') then nx_state <= s3;
        else nx_state <= s2;
        end if;
Multiplier: VHDL Code 201

      when s3 =>
        prst <= '0'; load <= '0'; shift <= '0';
        sel <= '0'; done <= '1';
        if (start = '1') then nx_state <= s1;
        else nx_state <= s3;
        end if;
      when others =>
        prst <= '0'; load <= '0'; shift <= '0';
        sel <= '0'; done <= '0';
        nx_state <= s0;
    end case;
  end process;

  -- FSM Flip Flops
  conff: process (rst, clk)
  begin
    if (rst = '1') then
      pr_state <= s0;
    elsif (clk'event and clk = '1') then
      pr_state <= nx_state;
    end if;
  end process;
end arch_mult8;
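To exercise the multiplier in simulation, a small testbench along the following lines could be used. It is a sketch that assumes the mult8 entity above is compiled into the work library; the testbench name mult8_tb, the finished flag and the 20 ns clock period are arbitrary choices. The operands are the ones from the algorithm slide (1011 x 1101 = 10001111, i.e. 11 x 13 = 143).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench for the mult8 entity from the slides.
entity mult8_tb is
end entity mult8_tb;

architecture sim of mult8_tb is
  signal clk      : std_logic := '0';
  signal rst      : std_logic := '0';
  signal start    : std_logic := '0';
  signal done     : std_logic;
  signal finished : std_logic := '0';
  signal mc, ml   : std_logic_vector(7 downto 0);
  signal prod     : std_logic_vector(15 downto 0);
begin
  dut: entity work.mult8
    port map (clk => clk, rst => rst, start => start, done => done,
              mc => mc, ml => ml, prod => prod);

  -- 20 ns clock that stops toggling once the test has finished
  clk <= not clk after 10 ns when finished = '0' else '0';

  stim: process
  begin
    mc <= X"0B";            -- multiplicand 1011 (11)
    ml <= X"0D";            -- multiplier   1101 (13)
    rst <= '1';
    wait for 25 ns;
    rst <= '0';
    wait until rising_edge(clk);
    start <= '1';           -- one-clock start pulse
    wait until rising_edge(clk);
    start <= '0';
    wait until done = '1';  -- controller has reached S3
    assert prod = X"008F"   -- 143 = 10001111
      report "unexpected product" severity error;
    finished <= '1';
    wait;
  end process stim;
end architecture sim;
```

The start pulse is held for exactly one clock so that the controller moves S0 → S1 once and does not restart from S3 while the result is being checked.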


State Diagram 202


[State diagram, listed by state]
power_on → S0
S0: prst = 1, shift = 0, load = 0, sel = 0, done = 0
    start/ → S0, start → S1
S1: prst = 1, shift = 0, load = 1, sel = 0, done = 0
    → S2
S2: prst = 0, shift = 1, load = 0, sel = r(0), done = 0
    max/ → S2, max → S3
S3: prst = 0, shift = 0, load = 0, sel = 0, done = 1
    start/ → S3, start → S1
Multiplier: VHDL Code version 2 203

-- Components with clk, prst
subreg1: process (clk, prst)
begin
  if (prst = '1') then
    -- HPROD Register, Counter clear
    r(15 downto 8) <= (others => '0');
    count <= (others => '0');
  elsif (clk'event and clk = '1') then
    -- Higher Product Register
    if (shift = '1') then
      r(15 downto 8) <= s(8 downto 1);
    end if;
    -- Counter
    if (shift = '1') then count <= count + 1;
    end if;
  end if;
end process subreg1;

Multiplier: VHDL Code version 2 204

-- Components with clk, rst
subreg2: process (clk, rst)
begin
  if (rst = '1') then md <= (others => '0');
    r(7 downto 0) <= (others => '0');
  elsif (clk'event and clk = '1') then
    -- Multiplicand Register
    if (load = '1') then md <= mc;
    end if;
    -- Multiplier cum Lower Product Register
    if (load = '1') then
      r(7 downto 0) <= ml;
    elsif (shift = '1') then
      r(7 downto 0) <= s(0) & r(7 downto 1);
    end if;
  end if;
end process subreg2;
Xilinx Spartan 6 Atlys Board 205
Input/Output 207

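[Figure: board input/output connections. Labels: ml(3:0) and start inputs with VDD; prod(15:8) and prod(7:0) selected through a 2-to-1 multiplexer (inputs 0 and 1) onto LEDs 7..0; CLK DIV (27:0, with tap 15) driven from the 50 MHz clock and producing 0.125 Hz.]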

Extra VHDL Code 208

-- mli is used below but is not declared on the slide; it is assumed
-- here to be a 4-bit input port
entity mult8 is port(clk, rst, st: in std_logic;
  mli: in std_logic_vector(3 downto 0);
  done: out std_logic; disp: out std_logic_vector(7 downto 0));
end entity;

-- multiplicand, multiplier
signal mc, ml: std_logic_vector(7 downto 0);

-- start pulse
signal dst, start, stclk: std_logic;

-- 50 MHz -> 0.25 Hz terminal count (dclk toggles at this rate, giving 0.125 Hz)
constant termdcount: std_logic_vector(27 downto 0) := X"BEBC200";
signal dcount: std_logic_vector(27 downto 0);
signal dclk: std_logic;

-- Multiplicand, Multiplier assignment
mc <= X"A5"; ml(7 downto 4) <= X"4";
ml(3 downto 0) <= mli;
Extra VHDL Code 209

-- Clock for LED multiplexing
-- 50 MHz to 0.125 Hz clock divider.
dispcount: process (rst, clk)
begin
  if (rst = '1') then dcount <= (others => '0');
    dclk <= '0';
  elsif (clk'event and clk = '1') then
    if (dcount = termdcount) then
      dcount <= (others => '0'); dclk <= not(dclk);
    else
      dcount <= dcount + '1';
    end if;
  end if;
end process dispcount;

-- LED Multiplexing
disp <= r(7 downto 0) when dclk = '1' else r(15 downto 8);

stclk <= dcount(15);
stpulse: process (rst, stclk)
begin
  if (rst = '1') then dst <= '0';
  elsif (stclk'event and stclk = '1') then dst <= st;
  end if;
end process stpulse;
start <= st and not(dst);

State Diagram 211
[State diagram, listed by state]
power_on → S0
S0: prst = 0, shift = 0, load = 0, sel = 0, done = 0
    start/ → S0, start → S1
S1: prst = 1, shift = 0, load = 1, sel = 0, done = 0
    → S2
S2: prst = 0, shift = 1, load = 0, sel = r(0), done = 0
    max/ → S2, max → S3
S3: prst = 0, shift = 0, load = 0, sel = 0, done = 1
    start/ → S3, start → S1