Demystifying Resets: Synchronous, Asynchronous and
other Design Considerations... Part 2

forums.xilinx.com/t5/Adaptable-Advantage-Blog/Demystifying-Resets-Synchronous-Asynchronous-and-other-Design/ba-p/887366
September 6, 2018
In my previous blog post we discussed why proper planning is needed for resets. Let
us continue the discussion. In this blog we will examine techniques for combining
multiple resets, sequencing resets across hierarchies and clock domains, and the different
types of flops available in Xilinx FPGAs. Finally, we will look at a few tips and tricks for
handling resets in Vivado.
Combining Multiple Resets
Sometimes it is necessary to combine resets, especially when there are multiple resets
that could be active at the same time or at different times. An example is a state machine
that needs to go back to the default state if any one of a software reset, the system reset
or the power-on reset is active. These resets may be synchronous or asynchronous. The
first recommendation would be to synchronize the resets if they are asynchronous. Please
refer to Part 1 of this blog to learn about synchronizing asynchronous resets. Remember
to synchronize all asynchronous resets in the destination clock domain of interest first.
The second recommendation would be to drive the reset to the destination from a flip-flop
instead of a LUT. Driving the reset from a LUT may result in glitches when one of the
resets is active, so the design may not work reliably. Driving the reset from a flip-flop
also allows the tool to replicate the register if timing is critical or if there is a huge
fanout.
Fig. 1: Bad way to combine resets, as the destination is driven from a LUT output
Fig. 2: The recommended way to combine resets
Fig. 1 shows three resets being combined and driving the reset of a register from a LUT.
Fig. 2 shows the recommended method to combine resets. Once again, if the three resets
are asynchronous resets, then synchronize the resets first, combine them, and register the
combined reset before hooking it up to the rest of the design.
Reset Sequencing and Clock Domain Crossing
At the planning stage, a strategy about how you want your system (multiple chips) or
your design (all the hierarchies within the FPGA) to behave when the reset is active must
be clearly spelled out. Do you want all the chips (or all the design hierarchies in the
FPGA) going into the reset state and coming out of the reset state simultaneously? Or do
the multiple chips (or design hierarchies in the FPGA) require some kind of reset
sequencing? The reset sequencing needs to be designed according to what the system
architecture demands.
Assume that there is one system wide reset that is asynchronously asserted. Let us see
how to handle the reset in the two scenarios above:
All the design hierarchies are reset simultaneously
In this case, assume that your design has three clock domains: a 100 MHz
core clock domain in Block A, a 166 MHz system clock domain in Block B and a 66
MHz PCI clock domain in Block C. A single asynchronous reset, areset, resets the
three clock domains simultaneously. Fig. 3 shows the block diagram and Fig. 4 shows
the schematic of the reset architecture. At the top level, since areset is
asynchronous, it is synchronized in each of the three clock domains. Once it has
been synchronized in a clock domain, it drives the reset in that hierarchy. At the
hierarchical level, the synchronized reset signal should be registered one or more
times if the reset will have a huge fanout. Adding the register(s) gives the tool a
register to replicate in order to meet timing. At the top level the asynchronous
reset, areset, is constrained as a false path. At the hierarchical level, since the
reset is synchronous, no reset-specific timing constraints are needed.
Fig. 3: Block Diagram for an asynchronous reset driving the entire design simultaneously
Fig. 4: Schematic for an asynchronous reset driving the entire design simultaneously
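The top-level constraint described above might look like the XDC sketch below. The port name areset comes from the text. The synchronizer cell name pattern (rst_sync_reg) is a hypothetical placeholder, and marking the synchronizer flops with ASYNC_REG is a common practice that I am assuming here; it is not something shown in the figures.

```tcl
# The asynchronous reset input has no timing relationship to any of the
# three clocks, so exclude paths starting at it from timing analysis.
set_false_path -from [get_ports areset]

# Assumption: the reset synchronizer flops in each block have "rst_sync_reg"
# in their names (hypothetical). ASYNC_REG asks the tools to keep the
# synchronizer flops together and not optimize them away.
set_property ASYNC_REG TRUE \
    [get_cells -hierarchical -filter {NAME =~ *rst_sync_reg*}]
```

Adjust the filter pattern to whatever your own synchronizer registers are called.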
The reset needs to be sequenced based on clock domains
In this case, let us assume that the 100 MHz core clock in Block A needs to come out of
reset first, followed by the 166 MHz system clock in Block B and finally the 66 MHz PCI
clock in Block C. The block diagram is shown in Fig. 5 and the schematic is shown in Fig.
6.
Fig. 5: Block Diagram for reset sequencing
Fig. 6: Schematic for reset sequencing
Notice that at the top level, the asynchronous reset signal drives only Block A, but all
the blocks will be in the reset state simultaneously. The blocks will, however, come out of
reset sequentially. First Block A comes out of reset (after the number of clock cycles
determined by the synchronizer chain), followed by Block B and finally Block C. Depending
on the fanout of the reset signal within each block, add registers to the synchronizer
chains for easy replication. Within each clock domain, once the asynchronous reset has
been synchronized, the reset has to be a synchronous style reset. The asynchronous reset
input port, areset, can be constrained as a false path. Timing within each clock domain
will be analyzed as usual, since the reset is synchronous there.
The types of Flip-flops available in Xilinx FPGAs
In Xilinx FPGAs, there are primarily four types of flip-flops available. They share the
clock, data and enable pins and differ in the set/reset/preset or clear pin:
1. FDCE: This flip-flop has four input pins: the Clock pin, the D pin, the Enable pin and
an asynchronous clear pin. Use this flop to synchronize asynchronous resets that
are active low.
2. FDRE: This flip-flop has four input pins: the Clock pin, the D pin, the Enable pin and
a synchronous reset pin. Use this flop for synchronous resets.
3. FDSE: This flip-flop has four input pins: the Clock pin, the D pin, the Enable pin and
a synchronous set pin. Use this flop for synchronous sets.
4. FDPE: This flip-flop has four input pins: the Clock pin, the D pin, the Enable pin and
an asynchronous preset pin. Use this flop to synchronize asynchronous resets that
are active high.
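To see which of the four flop types a design actually ended up with, a small loop like the one below can be run in the Vivado Tcl console. This is only a sketch: it assumes a synthesized or implemented design is open, and it relies on the REF_NAME cell property holding the primitive name.

```tcl
# Count the cells of each flip-flop primitive in the open design.
foreach prim {FDCE FDPE FDRE FDSE} {
    # -quiet suppresses the warning when a primitive type is not used at all.
    set cells [get_cells -quiet -hierarchical -filter "REF_NAME == $prim"]
    puts [format "%-5s %d" $prim [llength $cells]]
}
```

A large FDCE/FDPE count in a design that was meant to use synchronous resets is a hint that some hierarchy is still coded with asynchronous resets.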
Handling Resets in Vivado
Control Sets: A Primer
Disclaimer: This blog is not intended to be a comprehensive guide to control sets. Please
refer to UG949: The UltraFast Design Methodology Guide for more details. The aim of this
blog is to introduce a few important concepts.
In FPGAs, control sets are the clocks, resets, enables and sets (presets). For two flops to
be placed within the same slice, their control sets have to match. If the control sets don't
match, then the placer will have to spread the logic over the FPGA. Spreading the logic
might result in timing closure challenges. In some cases a high number of control sets
might result in Global, Long or Short congestion, which might also result in timing closure
challenges.
While designing FPGAs, it is recommended to limit the total number of control sets in the
design. In Xilinx devices the acceptable number of control sets is between 7.5% and 15% of
the total number of available slices in the device. Please refer to UG949: The UltraFast
Design Methodology Guide for more information and to your specific Device User Guide to
find out the total number of available slices. In this blog we will only focus on resets.
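As a rough way to check a design against this guideline, the control sets can be reported from the Vivado Tcl console and compared with the slice count. The snippet below is a sketch only. The two numbers are made-up placeholders that you would replace with the control set count from the report and the slice count from your device documentation.

```tcl
# List every unique control set (clock / enable / set-reset combination)
# in the open design.
report_control_sets -verbose

# Hypothetical values for illustration only; substitute your own.
set unique_control_sets 5000
set total_slices        50000

# Control sets expressed as a percentage of the available slices.
set pct [expr {100.0 * $unique_control_sets / $total_slices}]
puts [format "Control sets: %.1f%% of available slices" $pct]
```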
As I discussed in the previous blog, resets and clock enables tend to be the two control
signals (apart from clocks, of course) that dominate designs, and they are the most likely
culprits for high-fanout nets and possibly congestion. If there are too many
high-fanout nets in the design (greater than 15% of the available slices in the device)
they tend to cause global congestion, which can manifest as a timing closure problem.
One way to mitigate global congestion is to promote non-timing-critical high-fanout nets
to global clocks. Promoting a high-fanout net like a reset (resets are typically active for
many clock periods and aren't timing critical in general) to a global clock (by inserting a
BUFG) not only frees up routing resources for the timing-critical signals in your design
but also reduces congestion. The recommendation would be to promote non-timing-critical
nets that have a fanout greater than 25k to a global clock. You can promote a
non-timing-critical high-fanout signal to a global clock using the following XDC constraint:
set_property CLOCK_BUFFER_TYPE BUFG [get_nets netName]
Timing critical high-fanout nets can be replicated with a KEEP attribute to aid in timing
closure. It might be necessary to replicate the high-fanout net in each SLR.
Another trick to reduce control sets in the design would be to merge the
reset/clock enable into the datapath. Suppose you had two flops, one with a reset and
another without a reset, and both flops are in the same clock domain. Since the control
sets don't match, the two flops cannot be placed in the same slice. By merging the reset
(and clock enables) into the datapath, the control sets would match, enabling the
placement of the flops in the same slice. You can add RTL attributes as shown in Fig. 9
below:
Fig. 9: RTL Attribute to control whether reset is hooked to the 'R' or 'CLR' pin or merged
with the datapath
You can also merge the reset (or enable) to the datapath by using the XDC constraint as
shown below:
set_property extract_reset "no" [get_cells top/synced_reset]
Handling High-fanout Nets
Earlier in the blog, I mentioned that replication will be needed depending on the fanout
of the net: the higher the fanout and the more critical the timing, the greater the need to
replicate the driver of the high-fanout net. Having said that, aggressive replication by
the user in the synthesis stage can be counter-productive. This is because placement,
routing and timing are unknown at that point, so aggressive replication will in all
likelihood hurt more than it helps. The recommendation is to avoid the MAX_FANOUT
constraint in the synthesis stage on a global level. It is perfectly fine to have a few
really targeted MAX_FANOUT constraints in a few blocks with a value of 512/1024 if
absolutely necessary, and only on nets that are expected to have a high fanout. In the
opt_design stage, again since placement, routing and timing are not known, limited
tool-driven replication is recommended. In the placer, since we know exactly where the
cells/hierarchies are placed and the timing is a little more realistic, tool-driven
mid-grained fanout optimization is recommended. After the router has finished, the timing
is accurate, so in this stage you want tool-driven fine-grained replication based on where
the driver is placed, where the destination register is placed and what the routing
topology for the net is.
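Before adding any targeted constraint, it helps to look at which nets really have a high fanout and whether they are timing critical. One way to do that, sketched below, is the report_high_fanout_nets command in the Vivado Tcl console. The threshold of 1024 and the limit of 20 nets are arbitrary values picked for illustration.

```tcl
# List up to 20 nets whose fanout exceeds 1024, with the worst slack
# (-timing) and a breakdown of the load types (-load_types) for each net.
report_high_fanout_nets -fanout_greater_than 1024 -max_nets 20 \
    -timing -load_types
```

Nets that show up here with comfortable slack are candidates for BUFG promotion. Nets with poor slack are the ones where targeted replication is worth considering.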
Fanout Optimization in Placer
In Vivado version 2018.1, we introduced a new feature in the placer that automatically
manages high-fanout nets. The new and improved algorithm automatically inserts
BUFGs during global placement on non-timing-critical nets, based on resource (BUFG)
availability. The placer also replicates high-fanout nets and control signals to
DSPs/BRAMs. The placer replication is based on the placement and the distance that the
net has to drive. The main advantage for the user is that no guesswork is needed at the
RTL/synthesis stages to figure out which high-fanout nets need to be manually replicated.
The control set utilization will be optimal, resulting in fewer designs being congested.
The new algorithm addresses timing-critical high-fanout nets like resets early in the
placement stage and reduces the need for post-place or post-route physical optimization
iterations.
Fig. 10 below shows one example of a really high-fanout net that was automatically
replicated by the placer. The immediate benefit can be seen by the nearly 0.500 ns WNS
improvement:
Fig. 10: High-fanout net automatic replication in the placer
Conclusion
In Part 1 of this blog I posted that resets cannot and shouldn't be ad hoc. They need to be
planned very carefully early in the design phase. We also talked about the impact of
resets on power, the choice of active-high and active-low resets and whether to use
asynchronous or synchronous resets. Here is a summary of recommendations for resets:
1. Plan the reset architecture early in the design phase
2. At the board or design or hierarchical level decide the different type and number of
resets (power-on reset, system reset, software reset etc.) that will be active
3. Determine whether the incoming resets are asynchronous or synchronous
4. If the resets are asynchronous ensure that they are synchronized in each clock
domain
5. Ensure that at the hierarchical level, only synchronous reset is used
6. Establish a guideline for one type of reset for the entire design - either active-high
(recommended) or active-low
7. Establish guidelines for the min/max number of clock cycles the reset needs to be
active
8. Determine the minimum number of elements that the reset will drive (state-
machine registers, pipeline registers, datapath etc.) in every hierarchy of the design
and the initial state of those registers (should the flop be reset or set).
9. Ensure that the details from 8. above are documented in the micro-architecture
specification for each hierarchy/chip/board/system.
10. Avoid resets to big blocks like DSPs, URAMs, BRAMs and LUTRAMs, if possible
11. Avoid aggressive MAX_FANOUT constraints during the synthesis stage
12. MAX_FANOUT constraints with a value of 512/1024 on targeted high-fanout nets in
a particular hierarchy are fine
13. When combining resets ensure that the resets being merged are synchronized in
the destination clock domain, merged and then registered before driving the
destination registers.
14. Always drive reset from a flop
15. At the planning stage determine if all chips or all hierarchies within a design can be
reset simultaneously or need to be sequenced based on specific clock domains
16. If necessary replicate resets with the KEEP attribute and in each SLR
17. And finally, use Vivado tool features like replication and auto-promotion of
high-fanout nets to a global clock throughout the flow (synthesis, opt_design,
place_design, route_design and phys_opt_design)
I am certain that there are many other subtleties and 'gotchas' that haven't been
discussed here. Reset is a Ph.D. thesis in itself in my opinion. If you have any further
insights, please feel free to share them with the community by commenting on this post
(the comments are moderated by me, but rest assured that if your comment is relevant to
the topic you will see it posted).