9.5.2 How DSP Blocks Are Mapped?

The Virtex-7 has dedicated DSP blocks. To improve FPGA performance and to avoid
consuming other functional blocks, the synthesis tool can map DSP functionality
onto these dedicated blocks.
Consider the multiply and accumulate (MAC) function: it can be mapped onto a
DSP block for improved performance. If the DSP algorithm is more complex, then
multiple DSP blocks can be used to implement a wider range of arithmetic
algorithms. If we wish to pipeline the DSP algorithm, then the pipelined logic
can also be mapped onto the DSP blocks.
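As a rough illustration of the MAC function described above, the following Python
sketch models its behavior only, not the hardware. The function name mac_stream and
the single pipeline register between the multiplier and the adder are assumptions
made for illustration; they are not taken from any vendor's DSP block description.

```python
def mac_stream(a_values, b_values, pipelined=False):
    """Return the accumulator value observed after each clock cycle.

    Behavioral model only. With pipelined=True, one register is placed
    between the multiplier and the adder (an assumed, simplified pipeline).
    """
    acc = 0
    product_reg = 0  # assumed pipeline register holding last cycle's product
    history = []
    for a, b in zip(a_values, b_values):
        product = a * b
        if pipelined:
            acc += product_reg      # adder consumes the previous product
            product_reg = product   # register captures the new product
        else:
            acc += product          # multiply and add in the same cycle
        history.append(acc)
    return history


print(mac_stream([1, 2, 3], [4, 5, 6]))                   # [4, 14, 32]
print(mac_stream([1, 2, 3], [4, 5, 6], pipelined=True))   # [0, 4, 14]
```

The pipelined version produces the same running sums one cycle later, which is the
usual trade of latency for a shorter critical path.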
9.5.3 How Memory Blocks Are Mapped Inside FPGA?
We need internal storage in the form of distributed RAM or block RAMs (BRAMs).
The performance of the overall design depends on how efficiently the synthesis
tool maps these blocks. What we need to check is whether the synthesis tool
infers these blocks or not.
Single- and dual-port RAMs should be inferred by the synthesis tool automatically,
and for improved design performance, the adjacent pipeline registers at the input
and output should be picked up automatically as well. For improved speed and
area, it is essential that the synthesis tool can split large memories across
multiple BRAMs, with the address and data widths chosen according to the
functional requirements.
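To make the idea of splitting a large memory across multiple BRAMs concrete, the
Python sketch below estimates how many primitives a memory of a given depth and
width would need. The list of aspect ratios in ASSUMED_CONFIGS and the function
name bram_count are assumptions for illustration; the actual configurations and
the packing chosen by a synthesis tool depend on the device and the vendor.

```python
import math

# Assumed (depth, width) aspect ratios of one block RAM primitive.
# Check the target device documentation for the real list.
ASSUMED_CONFIGS = [
    (32768, 1), (16384, 2), (8192, 4),
    (4096, 9), (2048, 18), (1024, 36),
]


def bram_count(depth, width, configs=ASSUMED_CONFIGS):
    """Return (count, prim_depth, prim_width) for the cheapest tiling.

    The memory is tiled with identical primitives: rows cover the depth,
    columns cover the width. The first configuration with the lowest
    count wins.
    """
    best = None
    for prim_depth, prim_width in configs:
        rows = math.ceil(depth / prim_depth)   # primitives stacked in depth
        cols = math.ceil(width / prim_width)   # primitives placed side by side
        total = rows * cols
        if best is None or total < best[0]:
            best = (total, prim_depth, prim_width)
    return best


print(bram_count(4096, 32))   # (4, 4096, 9)
print(bram_count(1024, 36))   # (1, 1024, 36)
```

A real tool also weighs timing, the number of ports, and output multiplexing when
the depth is split, so this count is only a first estimate.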
If we consider the synthesis tool features, then for the SOC design the synthesis
tools are capable of partitioning and realizing the design by using built-in
algorithms. During prototyping, the designer should take care when using the
170 9 ASIC and FPGA Synthesis
vendor-dependent features of such tools to achieve the desired results. In the
industry, this process is usually automated by using synthesis scripts.
But in practice, what care should the design team take? Let me share my experience
from working on one of the SOC projects during the past decade.
1. The first important point, which I realized by visualizing the architecture,
was that the design was too complex and needed partitioning.
2. If my SOC functionality is larger than one FPGA, what should I do? I need to
use multiple FPGAs.
3. Is it possible to achieve the desired SOC speed using FPGAs? Practically not,
because the SOC runs faster than the FPGA.
4. Can my RTL map directly onto the FPGA? The answer is no; I need to tweak the
RTL to make it compliant with the FPGA resources. For example, the gated clock
implementation differs between the ASIC and the FPGA.
9.6 Practical Scenarios During FPGA and ASIC Synthesis
This section describes a few practical scenarios that arise during ASIC and FPGA
synthesis.
9.6.1 Gated Clocks and Conversions
Clock gating conversions can be accomplished at the RTL level as well as by using
EDA tool features. In the ASIC back-end flow, clock buffers can be added during
clock tree synthesis to balance the clock skew, and the balanced clock tree can
then be routed to get better timing and performance. This is not possible in the
FPGA design flow. This section describes the clock gating technique for ASIC and
FPGA designs.
When a functional block of the design needs to be inactivated, its clock can be
stopped by using the clock gating mechanism. This saves dynamic power.
At the RTL level, this can be accomplished by using the clock and clock_enable
inputs, as shown in Fig. 9.5.
9.6.2 Gated Clock Implementation for ASIC
For ASIC designs, clock gating can save a significant amount of dynamic power.
Clock gating cells are available in the library. If the clock gating options are
enabled according to the design requirements, then these cells can be inferred by
the synthesis tool.
The clock gating cell is shown in Fig. 9.6.
Fig. 9.5 Gated clock design
Fig. 9.6 Clock gating cell
9.6.3 Gated Clock Implementation for FPGA
For FPGA designs, the clock gating cells used in the ASIC need to be implemented
at the FPGA fabric level. These cells can be implemented using a LUT if the AND
of clk and clk_enable is used. The issue is that the AND logic sits in the clock
path and can produce glitches on the clock. So by using vendor-specific EDA tool
options, they can be implemented as shown in Fig. 9.7.
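The glitch problem can be seen with a small waveform model. The Python sketch
below samples clk and clk_enable four times per clock period and compares a plain
AND-gated clock with a flip-flop that uses the enable as a clock-enable input.
The clock-enable conversion is a common approach that is assumed here; it is not
necessarily what Fig. 9.7 shows. The helper names and the waveforms are
hypothetical.

```python
def rising_edges(wave):
    """Indices where a sampled waveform goes from 0 to 1."""
    return [i for i in range(1, len(wave)) if wave[i - 1] == 0 and wave[i] == 1]


def and_gated_clock(clk, en):
    """Plain AND of clk and clk_enable, as a LUT would implement it."""
    return [c & e for c, e in zip(clk, en)]


def enable_flop_capture_edges(clk, en):
    """Clock edges at which a flip-flop with a clock-enable pin captures data.

    The enable is sampled just before each rising edge of the ungated clock.
    """
    return [i for i in rising_edges(clk) if en[i - 1] == 1]


# Four samples per period: low, low, high, high.
clk = [0, 0, 1, 1] * 3
# Hypothetical enable that rises while clk is already high (index 3).
en = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]

gated = and_gated_clock(clk, en)
print(rising_edges(clk))                   # [2, 6, 10]
print(rising_edges(gated))                 # [3, 6]
print(enable_flop_capture_edges(clk, en))  # [6]
```

The edge at index 3 on the AND-gated clock does not coincide with any edge of clk,
so a flip-flop driven by it would be clocked at the wrong time. The clock-enable
version captures only on the true clock edge at index 6.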
9.7 Important Takeaways and Further Discussions
The following are the important takeaways from this chapter.
1. The ASIC synthesis infers the gate-level netlist using the ASIC cell library.
2. The FPGA synthesis infers the gate-level netlist using FPGA functional blocks
such as CLBs, IOBs, DSP blocks, the clocking network, and BRAMs.
3. The synthesis tool uses the libraries, RTL design, and constraints to perform
the synthesis.
4. The optimization constraints are speed, power, and area.
5. The synthesis can be performed at the block level and at the chip level.
6. The constraints can be supplied to the synthesis tool as one of its input
files, using the (.sdc) file.
7. The design rule constraints can be max transition, max or min capacitance, and
cell degradation.
8. Partitioning a larger SOC design can yield better performance.
9. The clock gating logic for the ASIC and FPGA is different. So during SOC
prototyping, it is essential to perform gated clock conversion.
Fig. 9.7 Clock gating for FPGA
The next chapter focuses on static timing analysis (STA) and how it differs
between FPGA and ASIC designs. That chapter is useful in SOC prototyping for
understanding timing and time budgeting at the different FPGA boundaries and
interfaces.
Chapter 10
Static Timing Analysis
Under thermal equilibrium the product of the free electron
concentration and free hole concentration is equal to a constant
equal to the square of the intrinsic carrier concentration.
Mass action law for semiconductors
Abstract The chapter discusses static timing analysis (STA) and the role of the
STA engineer. The timing paths, maximum frequency calculations, input insertion
delay, and output insertion delay are discussed in this chapter with practical
scenarios. The Synopsys PT commands are also discussed. How to achieve the timing
performance needed to meet the timing constraints is covered with practical
scenarios. The chapter is useful for ASIC and SOC designers to understand the STA
concepts and the techniques used to overcome timing violations in the design.
This chapter also discusses FPGA timing analysis.
Keywords STA · DTA · Timing paths · Reg to output · Input to output ·
Reg to reg · AT · RT · Slack · Skew · Setup · Hold · Clock to q delay ·
Delay derating · OCV · Dynamic simulation · Test vectors · Coverage
10.1 Synchronous Circuits and Timing
Meeting the timing of a synchronous circuit is an important task, and during STA
all the timing paths are analyzed by the timing analyzer. Consider the sequential
synchronous circuit shown in Fig. 10.1.
As shown in Fig. 10.1, the synchronous sequential circuit is driven by a common
clock source named ‘clk’. The outputs are Combo_out and q_out. The input to the
sequential circuit is d_in.
The timing parameters of the flip-flop are
• Setup time (tsu)
• Hold time (th)
• Propagation delay of flip-flop (Clock to q delay) (tctoq or tpdff)
© Springer Nature Singapore Pte Ltd. 2019
V. Taraate, Advanced HDL Synthesis and SOC Prototyping,
https://doi.org/10.1007/978-981-10-8776-9_10
Fig. 10.1 Synchronous sequential circuit
Fig. 10.2 Setup and hold time of flip-flop
To have an understanding of these parameters, let us consider Fig. 10.2.
Setup Time: It is defined as the minimum amount of time for which the data input
(Din) of a sequential element must be stable before the arrival of the active clock
edge (clock transition). In this book, the setup time is denoted by tsu.
Hold Time: It is defined as the minimum amount of time for which the data input
(Din) of the sequential element must be stable after the arrival of the active
clock edge (clock transition). In this book, the hold time is denoted by th.
Clock to q Delay: It is the propagation delay of the sequential circuit element
after the arrival of the active clock edge (clock transition). In this book, the
flip-flop delay is denoted by tpdff.
If the setup or hold time is violated, then the sequential logic can go into a
metastable state. During timing analysis, the timing analyzer checks all the
timing paths to make sure that the timing constraints are met.
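The parameters tsu, th, and tpdff combine into the usual register-to-register
timing checks. The Python sketch below assumes a single clock with zero skew and
introduces tcombo_max and tcombo_min, hypothetical names for the longest and
shortest combinational delay between two flip-flops. All numeric values are
invented for illustration and are in nanoseconds.

```python
def min_clock_period(tpdff, tcombo_max, tsu):
    """Smallest clock period that still meets setup (zero skew assumed)."""
    return tpdff + tcombo_max + tsu


def setup_slack(tclk, tpdff, tcombo_max, tsu):
    """Required time minus arrival time; negative means a setup violation."""
    return tclk - (tpdff + tcombo_max + tsu)


def hold_slack(tpdff, tcombo_min, th):
    """Earliest data change minus hold requirement; negative means a violation."""
    return (tpdff + tcombo_min) - th


# Invented example values, in ns.
period = min_clock_period(tpdff=1.0, tcombo_max=3.0, tsu=1.0)
fmax_mhz = 1000.0 / period

print(period)                             # 5.0
print(fmax_mhz)                           # 200.0
print(setup_slack(6.0, 1.0, 3.0, 1.0))    # 1.0
print(hold_slack(1.0, 0.5, 0.5))          # 1.0
```

Note that the hold check does not depend on the clock period, so a hold violation
cannot be fixed by slowing the clock down.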
Fig. 10.3 Design with metastable state
Fig. 10.4 Timing sequence with metastable output
10.2 Metastability
