White Paper

My RTL is an Alien!
Accelerating and Automating ASIC to FPGA-Based Prototype Conversion

September 2013

Author
Angela Sutton
Staff Product Marketing Manager, Synopsys, Inc.

Introduction
FPGA-based prototyping is gaining popularity because it provides an economical way to functionally
verify an ASIC design by creating a prototype that runs close to “at speed.” FPGA-based prototypes also
provide a great platform for early system software development. However, FPGA architectures include
resources, building blocks, power circuitry, and clocks that are fundamentally different from those of
an ASIC. With over 70% of today’s ASICs and systems-on-chips (SoCs) being prototyped in an FPGA,
designers are looking for ways to ease the creation of FPGA-based prototypes directly from the ASIC
design source files.

This paper focuses on how to create an automated process that converts ASIC design source files into a
working FPGA. The techniques described will allow you to maintain one “golden” set of files that will work
in both your ASIC and FPGA design environment so that, with each new revision of your ASIC design, you
will be able to quickly create a revised FPGA-based prototype.

This white paper complements the FPGA-Based Prototyping Methodology Manual and several other
documents that are available from Synopsys and listed at the end of this paper.

You will learn how to

1. Create a reproducible scripted process for converting your ASIC design into a working
FPGA-based prototype system. This is a way to batch automate the conversion process so that
it can be re-run with each new ASIC “design drop.” It can be done using a combination of
netlist-editing functions, compiler constraints, `define and `ifdef directives, and Tcl/Find scripting.
2. Take ASIC design source files and add side files that render the design FPGA friendly to ensure
that the design is synthesizable into the FPGA architecture and system. Topics covered will
include reading the design in from the simulation verification environment, clock conversion,
memory conversion, low power circuitry, and how to accommodate designs that contain ASIC IP.
3. Obtain an initial working prototype as fast as possible.
4. Meet design performance targets in the resulting FPGA-based prototype by tuning the
initial prototype.
Creating a Scriptable, Reproducible Process
Design validation teams are faced with the task of implementing an ASIC design in an FPGA. The original
ASIC design may be specified as a directory of thousands of golden ASIC source files, making the conversion
process a challenge without the right tools in place. These ASIC files can, on face value, look completely
“alien” to the FPGA flow. The FPGA flow must accept and know what to do with RTL and constraint
specifications, as well as DesignWare® Library IP and any ASIC low power specifications. It must also
understand how ASIC source files inter-relate and reference each other, and honor design validation-specific
parts of the design that are expressed as cross-module references (XMRs) in the source files. Table 1 provides
more details.

Components of the ASIC design specification | What the FPGA flow must do

RTL
  Accept and convert the RTL
  • SystemVerilog, Verilog and VHDL language: synthesize into an FPGA
  • Assertions: parse and ignore them
Timing constraints
  Accept Synopsys Design Constraints format (the de facto ASIC constraints standard)
UPF power intent
  Validate power retention/isolation circuitry that will be used in the ASIC
DesignWare Library IP
  Implement any DesignWare Library IP referenced by the ASIC RTL files
Design file inter-relationships and cross-references
  Understand file inter-relationships (Verilog include paths, libraries, macros, defines) so that
  module definitions referenced in one file can be found in another file and libraries can be read
XMRs
  Understand XMRs (global references to nets, registers, etc.) used for design verification

Table 1: It is important that the FPGA and ASIC flow accept a common and
compatible set of golden design input files.

With this fundamental compatibility between ASIC and FPGA flow input in place, the FPGA validation
engineer needs to convert certain ASIC design structures such as clocks and memories so that they can be
implemented using the FPGA architecture. Conversion steps are described in the next section of this paper.
These steps can be automated using a combination of Synplify® Premier FPGA synthesis features that take
care of the conversion, along with ASIC conversion scripts that invoke and drive these features.

Expanded Tcl scripting capabilities, including Find/Collections functionality, provide direct access to the
design database, can traverse it, and can be used to build highly customized scripts that transform the design
and report on its status. Refer to Table 2.

Tcl script extension | Purpose

Find
  • Access objects in the RTL/netlist-level database
  • Sift the database by using Find in combination with a filter command
Expand
  • Traverse the design hierarchy
Collections
  • Access groups of objects and their attributes and use for compatibility with other ASIC tool scripts

Table 2: Tcl scripting can be used to run batch conversion of ASIC designs to FPGA-friendly designs.
Find, Expand and Collection extensions to Tcl provide custom design database access and reporting
of data in the design RTL and netlist-level database.

ASIC conversion inevitably involves removal of some circuitry that is not pertinent to the FPGA such as
power-down circuitry and large memories, and may also involve the insertion of test circuitry for testing the
prototype, once it has been implemented in the FPGA.

The FPGA design automation software can offer a way to drive conversion of your ASIC to an FPGA, while
leaving golden ASIC source files intact. Synplify Premier and Certify® Tcl-based netlist editor commands and
FPGA synthesis compiler constraints can be created within side-files that are applied to the golden ASIC
source files and transform them so that they become FPGA synthesis-ready. Refer to Table 3.

FPGA side files | Side file’s conversion purpose

Tcl-based FPGA netlist editor
  Scriptable commands that can perform major changes such as:
  • Remove or insert circuitry such as built-in self-test (BIST) circuitry or large memories that will
    be implemented externally
  • Replace very complex ASIC clocks with FPGA clocking
  • Tie nodes to a constant value
  • Insert data buffers on pins or ports for data collection
FPGA synthesis compiler constraints
  Specific adjustments to the design to make it FPGA friendly. Examples include black boxing a module,
  renaming a module, or making a module unique such that it can be treated and operated on in a different
  way compared with other instances of the same module. Compiler constraint commands include:
  • syn_black_box
  • syn_noprune
  • syn_preserve
  • syn_sharing
  • syn_rename_module
  • syn_unique_inst_module

Table 3: Conversion of ASIC designs to FPGA-friendly designs can be accomplished using side
files that include netlist editor commands and compiler constraints.

Alternatively, RTL `ifdef and `define directives can be inserted directly in the golden RTL code to specify which
RTL code should be used for the FPGA and which should be used for the ASIC, or to substitute or remove
circuitry prior to implementing the prototype. This requires that the golden ASIC RTL undergo a one-time
modification to accommodate the validation team’s FPGA prototyping flow.
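
To illustrate how a single golden file can yield both an ASIC view and an FPGA view, the following Python sketch resolves `ifdef/`ifndef/`else/`endif regions for a chosen set of defined macros. It is a simplified model of what a Verilog preprocessor does, not the behavior of any particular tool. The macro name FPGA and the module names clk_en_fpga and clk_gate_asic are hypothetical.

```python
def resolve_ifdefs(source, defines):
    """Return only the lines that sit in active conditional regions."""
    out = []
    stack = []  # one boolean per open conditional: is its current branch active?
    for line in source.splitlines():
        words = line.split()
        directive = words[0] if words else ""
        if directive == "`ifdef":
            stack.append(words[1] in defines)
        elif directive == "`ifndef":
            stack.append(words[1] not in defines)
        elif directive == "`else":
            stack[-1] = not stack[-1]
        elif directive == "`endif":
            stack.pop()
        elif all(stack):  # a line is kept only if every enclosing branch is active
            out.append(line)
    return "\n".join(out)


# Hypothetical golden RTL fragment with an FPGA-specific and an ASIC-specific branch.
GOLDEN_RTL = """\
module top(input clk, input en, output gclk);
`ifdef FPGA
  clk_en_fpga u_clk(.clk(clk), .en(en), .gclk(gclk));
`else
  clk_gate_asic u_clk(.clk(clk), .en(en), .gclk(gclk));
`endif
endmodule"""

fpga_view = resolve_ifdefs(GOLDEN_RTL, {"FPGA"})
asic_view = resolve_ifdefs(GOLDEN_RTL, set())
```

With FPGA defined, only the clk_en_fpga instance survives; with no macros defined, only the clk_gate_asic instance does. The golden file itself is never edited between the two flows.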

Rendering the ASIC Design “FPGA Friendly”


FPGA architectures are fundamentally different from ASIC architectures. These differences include
overall capacity, clocks and the availability of resources and building blocks. In order to make an ASIC design
completely FPGA friendly, or recognizable by the FPGA design tool flow used to implement the prototype, the
following items need to be considered:

• How to import the design from the verification environment. Most designs will have been RTL
  simulated before they are prototyped so the source files for the design would have been configured
  for a simulation flow. The same golden files and directories that worked for simulation need to be
  usable as input to the FPGA synthesis tool. It is key that the FPGA synthesis tool be able to deal with
  file dependency specifications, and has the ability to directly read libraries of files.
• How to automatically convert ASIC-specific design elements such as memories, IP and clocks. The
  FPGA synthesis tool itself may have the ability to convert ASIC gated clocks into the equivalent FPGA
  clock architecture. Some initial design setup steps, described next, will be required in order to take
  advantage of automated conversion capabilities built into the FPGA synthesis tool.
• (Advanced option) How to generate the source filename lists from a common repository, for
  companies that use design repositories. In some design environments, advanced tools can be used
  to extract the file-list from the repository, depending on the target FPGA technology. This item is
  beyond the scope of this paper.

The following sections explore the first two items.

Importing the design from the simulation verification environment and preserving XMRs
It is quite normal for an ASIC design to comprise thousands of source files, arranged in hierarchical
directories. Moreover, the design itself is usually hierarchical. The FPGA flow must completely understand the
source file directory structure (the location of a file that defines a given module) and how files depend upon
and reference each other. File dependency specification is not new to simulators that use compiler directives
to express each of the following concepts.

• Groupings of files that define modules of the design. These groupings are known as Verilog Libraries
  and their filenames are allowed to have special suffixes, known as Library filename extensions.
• Directory search paths, which designate the directories to be scanned in order to find the definition
  of a module that has been referenced by one file but is defined in another file. These search paths are
  specified as “include directories.”
• The hierarchical scope, which defines the scope in a design hierarchy to which an RTL definition
  (`define) applies. This is of particular importance when there are duplicate module names in different
  parts of the design hierarchy. Each part of the hierarchy should clearly reference the correct definition.

With the ability to understand the concepts of libraries and include paths, the FPGA synthesis tool can
successfully read and interpret the design, in its entirety, directly from the simulation environment.
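
The library and search-path concepts above can be sketched in a few lines of Python. This is only an illustration of the lookup rule (scan an ordered list of library directories for a file named after the module plus a library filename extension, in the style of the -y and +libext+ options that many Verilog simulators accept); it is not how any specific synthesis tool is implemented. All directory, file and module names are hypothetical.

```python
import os
import tempfile


def find_module_file(module, library_dirs, extensions=(".v", ".sv")):
    """Return the first <module><ext> file found, honoring directory order, or None."""
    for directory in library_dirs:
        for ext in extensions:
            candidate = os.path.join(directory, module + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


# Build a throwaway directory tree so the lookup can be exercised.
root = tempfile.mkdtemp()
for sub, name in (("cpu", "alu.v"), ("mem", "sram_wrap.sv"), ("mem", "alu.v")):
    os.makedirs(os.path.join(root, sub), exist_ok=True)
    with open(os.path.join(root, sub, name), "w") as handle:
        handle.write("// placeholder module definition\n")

search_order = [os.path.join(root, "cpu"), os.path.join(root, "mem")]
alu_path = find_module_file("alu", search_order)         # found in cpu/, which is searched first
sram_path = find_module_file("sram_wrap", search_order)  # found in mem/ via the .sv extension
missing = find_module_file("pll", search_order)          # not defined anywhere: None
```

Note that alu is defined in two directories. The search order decides which definition is used, which is why duplicate module names need an explicit, reproducible ordering in both the simulation and the FPGA flow.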

Cross-module references (XMRs) describe global references to nets and registers. XMRs facilitate the
debugging of a design from a testbench, without the need to bring the signal being probed up to the top level
of the design via wires/port connections. XMRs may be present in the ASIC design specification. If they are,
they must be understood and the nodes in question maintained by the FPGA synthesis tool so that the FPGA
can be tested and probed on the board.

Using FPGA synthesis to automatically convert memories


When setting up a conversion process for a given design, decisions need to be made on how best to implement
the ASIC’s memories. The designer needs to ask, up front, whether the prototype should contain:

• Exactly the same memory as in the ASIC (for example, a DDR4 or DDR3 sub-design), or
• A model of the memory as a compromise (for example, a wrapper to on-board or on-chip memory)

The method chosen will be influenced by the size of the memory, performance requirements, and the
availability of memory resources on the FPGA chip or on an existing FPGA-based prototyping board.

Implementing very large ASIC memories


When the ASIC memory is huge, it is highly likely that it will overflow available FPGA resources if the memory
is implemented on the FPGA fabric itself. In such cases, the memory should be implemented using
separate off-chip memory resources, external to the FPGA (a DDR4 or DDR3 sub-design, for example).
The external memory will need to be connected to and synchronized with the FPGA chip. This process of
extracting the memory for implementation elsewhere can be accomplished using a “wrapper.” Netlist editor
and compiler constraint side files can be used to extract the memory function and replace it with an
equivalent black-box model each time the FPGA is synthesized. More details on the use of wrappers can be found
in Chapter 7 of the FPGA-Based Prototyping Methodology Manual.

Implementing small memories on the FPGA


Smaller ASIC memories can generally be implemented on the FPGA chip itself. The memory may be
implemented either using dedicated FPGA block-RAM resources or possibly using a combination of
multiple registers, also known as distributed RAM.

In the absence of any user-specified preference, the synthesis tool will automatically infer an implementation
from the RTL code, based upon resource availability and performance/area considerations.

The synthesis tool will report any memory inference that it performs via a separate report where lines such as
this will be recorded:

@N:CL134 : ram_1.v(17) | Found RAM mem, depth=32, width=4
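
As a rough sketch of how such notes could be harvested in a batch flow, the Python fragment below extracts every "Found RAM" note from log text and computes the size of each inferred memory in bits. The regular expression is written against the single sample line shown above; the exact note format is an assumption and may differ between tool versions.

```python
import re

# Pattern modeled on the sample note above; treat the format as an assumption.
RAM_NOTE = re.compile(
    r"@N:\s*\w+\s*:\s*(?P<file>\S+)\((?P<line>\d+)\)\s*\|\s*"
    r"Found RAM (?P<name>\w+), depth=(?P<depth>\d+), width=(?P<width>\d+)"
)


def inferred_rams(log_text):
    """Return one record per inferred RAM, including its total size in bits."""
    rams = []
    for match in RAM_NOTE.finditer(log_text):
        depth, width = int(match["depth"]), int(match["width"])
        rams.append({
            "name": match["name"],
            "file": match["file"],
            "line": int(match["line"]),
            "depth": depth,
            "width": width,
            "bits": depth * width,
        })
    return rams


sample_log = "@N:CL134 : ram_1.v(17) | Found RAM mem, depth=32, width=4\n"
rams = inferred_rams(sample_log)  # one record: mem, 32 x 4 = 128 bits
```

Sorting these records by size gives a quick first view of which memories are candidates for block RAM, distributed RAM, or external memory.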

Designers can identify all the RAMs in the design from the FPGA synthesis log file. For example:

#### START OF Block RAM DETAILED REPORT ####


Total Block RAMs: 1
mem_mem_0_0
----------------------------
READ_WIDTH_A 4
WRITE_WIDTH_A 4
READ_WIDTH_B 4
WRITE_WIDTH_B 4
WRITE_MODE_A READ_FIRST
SRVAL_A 0
----------------------------
#### END OF Block RAM DETAILED REPORT ####

Because resource availability may be a concern, the FPGA synthesis Resource Utilization section of the
synthesis log file will report what percentage of memory resources are used. For example, you would see a
resource report that looks like this:

RAM/ROM usage summary


Block RAMs: 1 of 156 (0%)
Register bits not including I/Os: 250 (5%)

Which memory to use – dealing with overutilization


As previously mentioned, an FPGA has finite block RAM and register resources for implementing the
design’s memories. If too many FPGA block RAMs are used, designers will know about it right away because
the resource utilization section of the synthesis log file will report a utilization of over 100%.
For example:

RAM/ROM usage summary


Block RAMs: 160 of 156 (102%)
Register bits not including I/Os: 2500 (50%)

All is not lost! If register resources are available, the synthesis tool can automatically use those instead to
implement the memories, creating a distributed RAM. What is more, designers can force
the software to use either a block RAM or a distributed RAM using the syn_ramstyle attribute.
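
In a scripted flow, the overutilization check can be automated rather than read by eye. The following Python sketch parses the "Block RAMs: N of M" line from a usage summary like the one above and reports by how many block RAMs the design exceeds the device. The report layout is taken from the sample in this paper and should be treated as an assumption for other tool versions.

```python
import re

BLOCK_RAM_USAGE = re.compile(r"Block RAMs:\s*(\d+)\s+of\s+(\d+)")


def block_ram_overflow(report_text):
    """Return how many block RAMs exceed the device total (0 if the design fits)."""
    match = BLOCK_RAM_USAGE.search(report_text)
    if match is None:
        raise ValueError("no 'Block RAMs: N of M' line found in report")
    used, available = int(match.group(1)), int(match.group(2))
    return max(0, used - available)


report = """RAM/ROM usage summary
Block RAMs: 160 of 156 (102%)
Register bits not including I/Os: 2500 (50%)
"""
overflow = block_ram_overflow(report)  # 160 used - 156 available = 4
```

A nonzero result tells the conversion script that at least that many block RAMs' worth of memory must be moved to distributed RAM or to external memory before the design can fit.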

Using FPGA Synthesis to Convert Gated and Generated Clocks for Low Clock Skew

In an ASIC design, designers can build clock trees for low skew and it is common to see gated and generated
clocks, as well as Integrated Clock Gating (ICG) ASIC cells present. FPGAs have fundamentally different clock
architectures from ASICs. In FPGA architectures, the built-in clock trees are limited in number (32 global clock
buffers for Xilinx Virtex®-7 and Virtex-6 and 16 for Altera Stratix® V devices). Leaving the original ASIC clocks
intact would overflow the available FPGA clocking resources. Moreover, gated clocks, put in place for the
purpose of ASIC power reduction, add no value in the FPGA prototype and, in fact, just consume resources
and add clock skew. Let’s take a look at the specific types of clocks that you will find in an ASIC and how to
convert them.

Generated clocks include clocks that are created by multiplying, dividing, or time shifting an existing clock, or
clocks that were created by some clock input switching or alignment mechanism. To make these clocks FPGA
friendly, you will need to replace them with equivalent FPGA structures. You can accomplish this using the
netlist editor or by the use of `ifdef FPGA, `ifdef ASIC statements in your RTL.

Gated clocks are clocks that are predicated on a signal that “opens the gate” and allows the clock signal to
propagate. In ASICs, these gated clocks generally serve to save power or to minimize the impact of clock
glitches (latch-based clocking). There are, in fact, many different types of ASIC gated clocks – including those
that use AND, NAND, OR, NOR, XOR, MUX, or latch-based clock gating.

Integrated clock gating (ICG) ASIC cells such as the one shown in Figure 1 combine both a latch and clock
gate and serve to cut ASIC clock skew. You may even see cascading ASIC ICGs. Designers will need to
replace these gated clocks with an equivalent FPGA RTL representation, using the netlist editor to edit out the
old cell and insert an FPGA equivalent cell.

[Figure 1 diagram labels: En, Latch, Clk, Gclk]

Figure 1: An ICG cell that combines both a gated clock and a latch and may
be present in ASICs to control clock glitches. These can be replaced by an
RTL equivalent FPGA representation.

The first step in transforming these ASIC clocks into FPGA-friendly clocks is to preview the clocks by running
an early clock report that notes generated and gated clocks and that flags other clocks that were neither
gated nor generated, as shown in Figure 2.

Figure 2. Clock conversions that occur during synthesis can be observed in a Clock Conversion Report that
includes clickable links to each converted clock in the netlist-level schematic.

As previously mentioned, ICGs should be substituted out using a netlist editor script. The FPGA synthesis
software will convert all other RTL forms of gated and generated clocks automatically by
moving generated clock and gated clock logic from the clock pin of a sequential element to the enable pin.
This allows sequential elements to be tied directly to the source clock, removing skew issues and reducing
the number of clock sources required in the design. An example of an automatic gated clock transformation is
shown in Figure 3.

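[Figure 3 diagram: a register clocked through a gate driven by En and Clk, labeled “ASIC gated clock,” is
transformed by “gated clock conversion” into registers with D, En and Clk pins, labeled “FPGA clock converted.”]

Figure 3: AND, NAND, OR, XOR gated clock structures are converted by
moving the gating logic from the clock pin to the register enable pin.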
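
The equivalence behind this transformation can be checked with a small behavioral model. The Python sketch below is an illustration only, not the synthesis tool's algorithm: it compares a register clocked by a latch-based gated clock (in the style of Figure 1) against a register clocked directly with the gating signal on its enable pin. It assumes that En and D are stable around the rising clock edge and that each sequence entry represents one full clock cycle; both function names are hypothetical.

```python
import random


def gated_clock_register(en_seq, d_seq):
    """ASIC style: latch En while Clk is low, Gclk = Clk AND latched En, capture D on rising Gclk."""
    q, latched_en, prev_gclk = 0, 0, 0
    trace = []
    for en, d in zip(en_seq, d_seq):
        for clk in (0, 1):            # low phase, then high phase of one cycle
            if clk == 0:
                latched_en = en       # latch is transparent while the clock is low
            gclk = clk & latched_en
            if gclk == 1 and prev_gclk == 0:
                q = d                 # rising edge of the gated clock
            prev_gclk = gclk
        trace.append(q)
    return trace


def enable_register(en_seq, d_seq):
    """FPGA style: register clocked every cycle by Clk, with En on the enable pin."""
    q = 0
    trace = []
    for en, d in zip(en_seq, d_seq):
        if en:
            q = d
        trace.append(q)
    return trace


random.seed(0)
en_bits = [random.randint(0, 1) for _ in range(200)]
d_bits = [random.randint(0, 1) for _ in range(200)]
outputs_match = gated_clock_register(en_bits, d_bits) == enable_register(en_bits, d_bits)
```

Under the stated assumptions the two models produce the same output on every cycle, while the second one needs only the ungated source clock.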

When setting up the design to allow automatic clock conversion, care should be taken to go back and analyze
the Clock Conversion Report (Figure 2) that is available after synthesis compilation since it will show which
clocks successfully converted, and which did not. Correct setup of clock constraints is essential to the
success of the conversion, as omissions in the constraint specification are the most common cause of a
clock failing to convert. Omissions include:

• Missing clock constraints, which leave a clock that feeds a sequential element undefined
• Misplaced clock constraints, where a clock has become disassociated from its true source for some
  reason, such as the creation of a black box between the clock source and its destination

Indeed, the clock report provides a checkpoint that lets you ensure that you have correctly synchronized
clocks relative to each other.

Dealing with User-Defined Primitives (UDPs)


UDPs are truth-table-like specifications of the behavior of an element or module that you wish to implement.
Although UDPs may appear in ASIC validation source files, neither FPGA synthesis nor ASIC synthesis will
synthesize them. The FPGA synthesis compiler, for example, will identify UDPs and flag them as compiler
errors, indicating where the UDP definition appears in the source code:

Selecting top level module test


@E:CG731 : test.v(3) | Can’t synthesize UDP primitives
@END

To convert the UDP, designers may use the Synplify Premier/Certify netlist editor to replace it with equivalent
RTL for the UDP’s function.

Converting IP
Depending on size and other decisions such as performance, digital IP blocks may either be prototyped on
the FPGA chip itself or using external chipsets or daughter cards. IP such as microprocessors may even be
replaced by executable software models. For example, transaction level models for ARM® and Synopsys
DesignWare cores can be run on a host PC using a virtual prototyping solution. The PC running these
models uses a UMRbus to operate in conjunction with a Synopsys HAPS® Prototyping System that contains
the remainder of the prototype. This approach can save significant prototyping time and deliver a working
prototype sooner.

Implementing Synopsys DesignWare Building Block IP on the FPGA


Synopsys’s DesignWare Digital IP portfolio includes interface IP solutions such as controllers, and the
DesignWare Building Blocks library of over 140 datapath, FIFO, arithmetic, controller, memory and floating
point functions. Please refer to Figure 4.

[Figure 4 diagram: DesignWare Library contents]
Microcontrollers and AMBA peripherals
• AMBA® Fabric (AXI™, AHB™, APB™)
• APB advanced peripherals
• APB peripherals
• Microcontrollers (DW8051, DW6811)
Building Blocks
• Memory (FIFO/FIFO controller, synchronous and asynchronous RAM)
• Datapath (complex arithmetic, floating point, trigonometric...)
• Data Integrity (CRC, ECC, 8b10b...)
• DSP (High-speed digital FIR and IIR, sequential digital FIR)
• Test (JTAG, boundary scan, TAP controller...)

Figure 4: Synopsys FPGA synthesis tools implement DesignWare Library directly from the ASIC RTL.

When prototyping an ASIC within an FPGA, the FPGA synthesis software knows how to automatically
implement RTL source files that include instantiated DesignWare Library Building Block IP. Thus, the ASIC RTL
can be synthesized directly into an FPGA with no RTL changes required. Microcontrollers, AMBA and all other
DesignWare digital cores are configured by the Synopsys coreConsultant tool and, once configured, can also
be directly synthesized into an FPGA.

Using DesignWare Library components in Synopsys’ FPGA-based prototype is simple. If you are using
DesignWare Building Block IP in your FPGA flow, first ensure that the libraries are accessible to Synopsys’
Synplify Premier and Certify FPGA synthesis tools. On the other hand, the DesignWare microcontrollers and
AMBA peripherals can be configured using the coreConsultant tool and should then be included in your
Synplify Premier/Certify design project prior to synthesis. These flows are depicted in Figure 5.

[Figure 5 diagram: two parallel flows]
DesignWare Microcontrollers and AMBA in the FPGA design: coreConsultant and coreAssembler configure the
DesignWare IP (DesignWare Core model, encrypted RTL) and produce synthesis project files (RTL, constraints,
scripts); Synplify Premier or Certify synthesizes them into a netlist plus constraints for FPGA P&R.
DesignWare Building Blocks in FPGA design: synthesis project files (RTL with instantiated DesignWare
components, constraints, scripts) and the DesignWare Building Blocks library (encrypted RTL) are read by
Synplify Premier or Certify, which synthesizes them into a netlist plus constraints for FPGA P&R.

Figure 5: Synplify Premier and Certify FPGA synthesis read DesignWare Building Block Components and
pre-configured DesignWare Library microcontroller/AMBA IP directly, allowing you to use the same
RTL source for your ASIC and its FPGA prototype.

Including complex DesignWare cores in an FPGA prototype


Larger digital DesignWare IP cores, including PCIe, USB 3.0, MIPI, DDR, SATA and HDMI, can be prototyped
in an FPGA. The designer will need to include the ASIC RTL for the core, as generated by Synopsys
coreConsultant software, in their FPGA synthesis project directory.

Using virtual transaction-level models


The ASIC design validation and software development team will ultimately need to perform the following tasks.

1. Controller + PHY interoperability validation
2. System compliance testing
3. Subsystem integration
4. Early firmware and software development

These tasks can be eased by modeling some of the IP as transaction-level models. The Synopsys Virtualizer™
virtual prototyping tool set allows developers to start the development of software for the IP or the entire SoC
significantly earlier than traditional methods by allowing the Synopsys HAPS FPGA-Based Prototyping System
to directly interact with executable software IP, such as ARM processor models, running on a host PC.

Synopsys offers a complete prototyping system, HAPS. The risk associated with the first two tasks,
controller/PHY interoperability validation and compliance testing, is significantly reduced when using the
DesignWare cores with Synopsys’ HAPS FPGA-Based Prototyping System motherboards, since these tasks have
already been performed by Synopsys.

Incorporating FPGA vendor IP
FPGA vendor IP such as PCIe and DDR3, as well as MIG, may be used for the purpose of implementing
debug-oriented interfaces to the prototype system. The synthesis flow can be scripted to incorporate and
optionally re-optimize FPGA vendor cores. These cores may be either IEEE-P1735 encrypted or completely
unencrypted and human-readable. They are generated and configured using the FPGA vendor’s core
generator, such as the Vivado IP Catalog software from Xilinx.

Unified Power Format (UPF)


IEEE 1801-2009 Unified Power Format is a standard that allows ASIC designers to focus upon power
considerations and low power intent early in the design process. It makes sense for FPGA prototyping
flows to model the concepts of low power intent so that the prototype identifies possible ASIC power
management bugs and errors and validates ASIC low-power circuitry. This allows power-control logic and
algorithms to be validated at speed in a prototype to ensure that this circuitry will never corrupt data or the
operation of the design.

It should be noted that FPGA power domains differ greatly from those found in ASICs. For example, an FPGA’s
configuration is “volatile,” meaning the device must be reconfigured and loaded upon board power-up, and
generally there are no configurable I/O voltage cells or power switches available to an FPGA.

However, an FPGA-based prototype can and should model separate power domains, and how each power
domain is to be isolated. It should also model critical state values that are to be remembered and restored
when a powered-down circuit is powered-up again. UPF commands that define power / ground nets or
switches are not converted or implemented in the FPGA prototype since such structures do not exist at all in
an FPGA. Please refer to Table 4.

UPF power intent support in FPGA-based prototypes | Description

Power domain
  A scoped selection of modules/elements that belong to a specific power source/network and that have
  a specified output behavior when powered down. In FPGA-based prototypes, these are modeled by
  virtual power domains (since FPGAs themselves only have a single source and ground).
Isolation cells
  Each power domain has an isolation cell that sits between power domains so as to prevent unexpected
  behavior when its power domain is powered down. At power-off, isolation cells clamp the output of a
  powered-down power domain to a known value.
Retention circuitry
  Circuitry to retain states or values after a power domain is powered down. The state is saved upon
  power-down and restored upon power-up. It is used to retain values for registers, sequential shifters,
  state machines and RAMs.

Table 4: It is vital to verify the operation of the ASIC’s low-power circuitry.
The FPGA-based prototype should include implementation of power domains, isolation cells and retention
circuitry that have been specified in the UPF format.

A separate UPF file can be read by certain FPGA synthesis tools. This file defines virtual power domains,
isolation cells and retention circuitry, along with the hierarchical scope or set of objects to which each
applies. The synthesis tool will then create an optimized netlist that includes power retention and/or isolation
circuitry, ready for verification in prototypes such as the HAPS FPGA-Based Prototyping System.
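
The behavior that the prototype is expected to validate can be summarized with a small behavioral model. The Python sketch below is purely illustrative: the class VirtualPowerDomain and its methods are hypothetical and do not correspond to UPF commands or to any tool API. It models one virtual power domain holding a single register, with an isolation clamp on its output and save/restore retention.

```python
class VirtualPowerDomain:
    """Behavioral stand-in for one power domain that holds a single register."""

    def __init__(self, clamp_value=0):
        self.clamp_value = clamp_value  # value the isolation cell drives while the domain is off
        self.powered = True
        self.state = 0                  # live register value
        self._saved = None              # retention copy

    def write(self, value):
        if self.powered:                # writes to a powered-down domain have no effect
            self.state = value

    def power_down(self):
        self._saved = self.state        # retention: save state on power-down
        self.state = None               # the live value is lost
        self.powered = False

    def power_up(self):
        self.powered = True
        self.state = self._saved        # retention: restore state on power-up

    @property
    def output(self):
        # Isolation: neighbors see a known clamp value, never the lost state.
        return self.state if self.powered else self.clamp_value


domain = VirtualPowerDomain(clamp_value=0)
domain.write(0xA5)
before = domain.output      # 0xA5 while powered
domain.power_down()
domain.write(0xFF)          # ignored: domain is off
during = domain.output      # clamped to 0
domain.power_up()
after = domain.output       # 0xA5 restored by retention
```

A power-management bug of the kind the prototype should expose would show up here as a value other than the clamp value while the domain is off, or as a restored state that differs from the saved one.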

Obtaining that First Working Prototype as Quickly as Possible
In the early phase of developing the prototype, efforts inevitably focus on obtaining a system with a “pulse.”
The first implementation will not necessarily operate on the board at the target performance, but it allows for
basic clock and I/O configuration to be done, daughter card interfaces to be set up and basic register read/
write tests to be performed. At this stage, FPGA design activities will center on getting very fast design spins
so as to “pipe-clean” the design. The goal is to flush out basic compilation and setup errors as quickly as
possible. Table 5 summarizes some of the techniques that can be used to obtain a first working prototype in a
timely manner. These approaches can be combined as the designer wishes.

Challenges in gaining first working prototype | Solution

Identify and resolve what may be hundreds of design compilation errors and proceed in the presence of errors
  • Find, report and flag most errors in a single synthesis run using the “continue-on-error” flow
  • Early conversion reporting on clocking, port mismatches and constraints setup errors
Apply fixes for compilation errors and specification changes
  • Surgically apply a change or fix to the design without resynthesizing the rest of the design.
    Hierarchical “bottom-up” flows work well for this scenario
Know quickly whether design changes worked
  Up to 10x speedup in synthesis iteration runtime by applying the following techniques:
  • Fast synthesis modes: synthesize with fewer optimizations, in the interest of better runtime
  • Multiprocessing: divide the synthesis task across multiple processors
  • Incremental design: resynthesize only the parts of the design that changed
  • Hierarchical program management: leverage multiple servers; enable team members to work in parallel
Partition prototype across multiple FPGAs
  • Optimum partitioning and I/O assignment/multiplexing to meet resource, timing budgets and
    performance goals and allow system debug visibility
Board bring-up and distribution
  • HAPS: Quick Partitioning Technology implements the prototype using multiple FPGAs
  • HAPS: Hardware Scan & Assembly Validation

Table 5: An initial prototype can be achieved in a tenth of the time using a combination of faster synthesis modes,
divide and conquer incremental approaches, along with clever, automated synthesis techniques that allow you
to find and fix multiple errors. Automatic partitioning across multiple FPGAs can be performed. In addition, the
Synopsys HAPS prototyping system performs self-checks that enable rapid system bring-up.

In Table 5, continue-on-error synthesis mode allows synthesis to proceed and complete in the presence of
design errors; the tool completes what it can. At the end of the synthesis run, a comprehensive list of the
errors encountered is made available and the designer can debug and incrementally fix them. Contrast
this with traditional synthesis runs that terminate upon encountering the very first error, potentially
hours into the synthesis process, leaving the designer to fix the error and then start synthesis again from
the beginning, only to find a second error. Identifying basic design setup and referencing errors for
clocks, constraints, RTL, port mismatches and missing files or references in this manner is of key importance.
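
The difference between the two flows can be made concrete with a toy model. The Python sketch below counts how many synthesis runs are needed to reach a clean compile when each run either stops at the first error or collects every error. It is an abstraction for illustration only; the module names and error messages are hypothetical, and it assumes the designer fixes every error reported by a run before launching the next one.

```python
def stop_at_first_error(modules, problems):
    """Traditional flow: abort and report as soon as one module fails."""
    for name in modules:
        if name in problems:
            return [(name, problems[name])]
    return []


def continue_on_error(modules, problems):
    """Continue-on-error flow: process every module and report all failures."""
    return [(name, problems[name]) for name in modules if name in problems]


def runs_until_clean(modules, problems, flow):
    """Count synthesis runs until no errors remain, fixing all reported errors between runs."""
    remaining = dict(problems)
    runs = 0
    while True:
        runs += 1
        errors = flow(modules, remaining)
        if not errors:
            return runs
        for name, _message in errors:
            del remaining[name]  # model the designer fixing each reported error


MODULES = ["clk_gen", "ddr_wrap", "uart", "dma"]
PROBLEMS = {
    "clk_gen": "missing clock constraint",
    "ddr_wrap": "port mismatch",
    "uart": "missing include file",
}
traditional_runs = runs_until_clean(MODULES, PROBLEMS, stop_at_first_error)  # 3 failing runs + 1 clean
collected_runs = runs_until_clean(MODULES, PROBLEMS, continue_on_error)      # 1 failing run + 1 clean
```

With three independent errors the traditional flow needs four runs and the continue-on-error flow needs two. Since each run may take hours, the saving grows with the number of errors in the design drop.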

Synthesis runtimes in the largest of today’s designs can span half a day but there are several great options
available that will result in a greatly reduced design turnaround time, as shown in Table 5. The approach
chosen to accelerate runtime will depend upon available machine resources, the team and machine
composition and how much performance and area quality of results the designer is willing to sacrifice during
the synthesis run.

For example, a “fast synthesis mode” can be chosen at the flip of a switch in the Synplify Premier software.
Fast synthesis mode provides a 4x speedup at the cost of performance and area quality of results (QoR).
Synplify Premier FPGA synthesis software can also run on multiple processors. Multiprocessing is similarly
enabled with the flip of a switch and delivers up to 3x runtime improvement, coupled with great
performance and area QoR. You can have your cake and eat it! However, multiprocessing will consume
additional machine resources. Designers can also use hierarchical and incremental flows to apply
divide-and-conquer approaches, which save runtime and have the benefit of preserving working parts of the design.
Designers can iterate on just one small portion of the design, and the team can work productively in parallel on
individual pieces of the design and combine the results later.
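
As a rough worked example of what these speedups mean in practice, the sketch below applies the quoted factors to a half-day run. The 12-hour baseline is an assumption (one reading of "half a day"), the function name is hypothetical, and 3x is the quoted upper bound for multiprocessing, so actual runtimes will vary.

```python
BASELINE_HOURS = 12.0  # assumption: "half a day" taken as 12 hours


def accelerated_runtime(baseline_hours, speedup):
    """Runtime after applying a speedup factor to a baseline runtime."""
    return baseline_hours / speedup


# 4x is the fast synthesis mode figure quoted above.
fast_mode_hours = accelerated_runtime(BASELINE_HOURS, 4.0)
# 3x is the quoted best case ("up to") for multiprocessing.
multiprocessing_hours = accelerated_runtime(BASELINE_HOURS, 3.0)

print(fast_mode_hours, multiprocessing_hours)
```

Under these assumptions, either option turns a run that consumes a working day into one that allows several iterations per day.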

In some cases, the prototype is so large that it will span multiple FPGAs. Challenges include dealing with
inter-chip timing, assigning I/O optimally, and achieving a clock distribution that works and exhibits minimal skew.
Synopsys Certify software automates the partitioning and I/O multiplexing process.

Initial bring-up of the prototype board can be quite time-consuming and error-prone. Challenges include
ensuring that the cabling, system voltage banks, adapters, and clocks are all set up correctly, that any
daughter cards are connected at the correct locations and that SERDES banks are stable. In addition, host
interfaces to external computers must be fully operational, so that sample data can be extracted from the
prototype, and any software models that interoperate with the hardware prototype (hybrid prototyping) run
flawlessly. As shown in Figure 6, the HAPS Prototyping System performs live scanning of the board to check
clocking, voltage banks, cabling, and daughter card configurations. High-Speed Time Domain Multiplexing
(HSTDM) interfaces between FPGAs and interface performance links are also analyzed to make sure that they
match the design database. In addition, the UMRBus host interface is checked. A first-pass Verilog board file
(.vb) is then automatically generated.

[Figure 6 diagram. HAPS Aware Features: validate daughter board location, validate connector location,
validate HSTDM links, validate clock configuration, validate UMRBus link. Other labels shown: Gigabit
Ethernet, DDR2, PCI-X, DVB, PCIe, DVI, USB 2.0, UMRBus interface kit, and host system (interactive and
script-based).]

Figure 6: The Synopsys HAPS Prototyping System offers automation for quick partitioning across multiple
FPGAs, as well as assembly validation and automated hardware scan that together accelerate initial board bring-up.

Meeting FPGA Design Performance Goals
FPGA-based prototypes provide the means to create a high-performance, at-speed working prototype of the
eventual ASIC design. With the initial prototype now working, it is time to tune its performance. The cornerstone
of achieving performance targets is the combination of good timing constraints setup and best design practices.

Setting constraints to meet timing performance


Setting adequate and correct constraints for FPGA synthesis will greatly impact performance results. Avoid
over-constraining the design, which will probably lengthen synthesis and place-and-route runtimes and may
cause false critical paths to be reported. For example, make sure that you have specified multi-cycle and
false paths and that you have set proper constraints on derived clocks.

The Synplify Premier and Certify tools also include a constraints checker function that checks for the correct
application of constraints and instance names. For example, it will flag timing constraints applied to non-existent
or invalid types of arguments and objects. The tool then generates a detailed explanatory report so that the
constraints file can be corrected. The software also runs clock synchronization checks and generates a “clock
domain crossing” report that lists all paths that start in one clock domain and end in another. Note, however,
that the synthesis tool assumes that clocks are synchronous unless you specify that they are asynchronous.
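
The two kinds of check described above can be sketched conceptually as follows. This is a simplified Python model, not the tools' actual implementation or report format; the function names, object names, and clock names are all hypothetical.

```python
def check_constraint_targets(constraints, design_objects):
    """Return (command, target) pairs whose target is not in the design."""
    return [(cmd, tgt) for cmd, tgt in constraints if tgt not in design_objects]


def clock_domain_crossings(paths, async_pairs=frozenset()):
    """List paths whose start and end clocks differ.

    A crossing is labeled "assumed synchronous" unless its clock pair was
    explicitly declared asynchronous, mirroring the default described above.
    """
    report = []
    for start_reg, start_clk, end_reg, end_clk in paths:
        if start_clk == end_clk:
            continue  # same domain, not a crossing
        declared = frozenset((start_clk, end_clk)) in async_pairs
        relation = "asynchronous" if declared else "assumed synchronous"
        report.append((start_reg, end_reg, start_clk, end_clk, relation))
    return report


# Hypothetical design objects and constraints; the second target has a typo.
design_objects = {"clk_sys", "clk_div2", "u_core.data_reg"}
constraints = [
    ("create_clock", "clk_sys"),
    ("set_false_path", "u_core.dta_reg"),
]
bad_constraints = check_constraint_targets(constraints, design_objects)

# Hypothetical timing paths: (start register, start clock, end register, end clock).
paths = [
    ("u_a.q", "clk_sys", "u_b.d", "clk_sys"),
    ("u_a.q", "clk_sys", "u_c.d", "clk_usb"),
    ("u_c.q", "clk_usb", "u_d.d", "clk_div2"),
]
async_pairs = {frozenset(("clk_sys", "clk_usb"))}
cdc_report = clock_domain_crossings(paths, async_pairs)

print(bad_constraints)
print(cdc_report)
```

The third path shows why the default matters: because `clk_usb` and `clk_div2` were never declared asynchronous, that crossing would be treated as a synchronous, timed path.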

Synthesis settings and techniques for better performance QoR


Aside from setting complete and accurate constraints, there are several “best practice” guidelines to follow in
order to improve the timing performance of a prototype that is being implemented on an individual
FPGA, including

• Instructing the FPGA synthesis tools to reduce high fanout nets, increase their use of pipelining, and use
  sequential optimizations and retiming
• Removal of unneeded ASIC resets, including asynchronous resets on multipliers
• Determination of the constants that did not get propagated across hierarchical boundaries and thus
  impeded performance optimization
• Assignment of critical paths to bounded regions and to the same die of multi-die FPGA devices
• Isolation of the critical path inside a hierarchical block so that the designer can easily iterate on the
  timing-critical parts of the design

All of these techniques can be accomplished by using the Synplify Premier and Certify FPGA design tools.
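
To make the first guideline concrete, the sketch below identifies high fanout nets and estimates how many driver copies would be needed to bring each under a limit. Driver replication is one common way to reduce fanout; the paper does not state which method the tools use, so treat the replication step, the function name, the net names, and the limit of 4 as assumptions for illustration.

```python
import math


def high_fanout_nets(net_loads, limit):
    """Map each net with more than `limit` loads to the driver copies needed."""
    return {
        net: math.ceil(len(loads) / limit)
        for net, loads in net_loads.items()
        if len(loads) > limit
    }


# Hypothetical netlist fragment: net name -> list of load pins.
net_loads = {
    "rst_n": [f"ff{i}.rst" for i in range(10)],  # 10 loads
    "en": ["ff0.en", "ff1.en"],                  # 2 loads
}

replicas = high_fanout_nets(net_loads, limit=4)
print(replicas)
```

Here only `rst_n` exceeds the limit, and three copies of its driver would keep every copy at four loads or fewer.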

Conclusion
This paper describes how to set up an automated process that converts ASIC design source files into a
working, high-performance FPGA-based prototype, such as the HAPS FPGA-Based Prototyping System. The
techniques described allow designers to maintain one golden set of design files that can be shared by both
the ASIC and its FPGA-based design. Thus, with each new revision of the ASIC design code, it is possible to
quickly create a correspondingly revised FPGA-based prototype.

Synplify Premier and Certify software from Synopsys include netlist editing and compiler constraints
capabilities that automate the conversion process, allowing you to automatically remove or substitute
pertinent circuitry such as large RAMs when creating the FPGA-based prototype of the ASIC. Synplify Premier
and Certify software offer a high degree of “FPGA conversion automation,” including gated and generated
clock conversion, so as to achieve economical clock resource usage and reduce the
clock skew in the prototype. The software automatically understands how to create a prototype of a design
that includes DesignWare Library IP and can incorporate FPGA vendor IP.

This paper also offered a number of techniques that help the prototyping team achieve a fast initial
implementation of the design and then tune the design to meet performance targets.

For More Information
This paper complements the following documents, all of which are available for free from Synopsys.

• FPGA-Based Prototyping Methodology Manual, Doug Amos, Austin Lesea, and René Richter (ISBN:
  978-1-61730-004-2). Available in Japanese and English.
• Breaking the Three Laws Blog: http://blogs.synopsys.com/breakingthethreelaws/
• “10 ways to Debug your Design” White Paper. Angela Sutton. Describes how to get early insight into
  design specification, isolate errors, and correct setup issues in order to fix your prototype when the
  design fails to synthesize or fails to operate as expected on the board. It also details how to flush out
  design specification errors early and en masse using Continue-on-Error and hierarchical design
  fix-up methodologies.
• “Methods and Tools for Bring-Up and Debug of an FPGA-Based ASIC Prototype” White Paper. Troy
  Scott. Discusses prototyping setup and methodology and how to efficiently bring up and debug
  multi-FPGA-based prototyping systems.
• “FPGA Design Methods for Fast Turn Around” White Paper. Angela Sutton. Describes how to
  accelerate FPGA design flow runtimes and apply incremental and hierarchical techniques both for
  runtime acceleration and for preservation of pre-verified parts of the design.
• Inferring Xilinx RAMs Application Note. Synopsys SolvNet.
• Analyzing Conversion Issues with Gated Clocks and Generated Clocks Application Note. Synopsys
  SolvNet.
• Using Vivado IP with Synplify Application Note. Synopsys SolvNet.
• A Simple Way to Use DesignWare Libraries in FPGA-Based Design Prototypes. Synopsys
  DesignWare Technical Bulletin, Angela Sutton.

Synopsys, Inc.  700 East Middlefield Road  Mountain View, CA 94043  www.synopsys.com
©2013 Synopsys, Inc. All rights reserved. Synopsys is a trademark of Synopsys, Inc. in the United States and other countries. A list of Synopsys trademarks is
available at http://www.synopsys.com/copyright.html. All other names mentioned herein are trademarks or registered trademarks of their respective owners.
10/13.CE.CS3535.
