ASIF: Application Specific Inflexible FPGA: Husain Parvez, Zied Marrakchi, Habib Mehrez

ASIF: Application Specific Inflexible FPGA
Husain Parvez, Zied Marrakchi, Habib Mehrez

LIP6, Université Pierre et Marie Curie
4, Place Jussieu, 75005 Paris, France
{parvez.husain, zied.marrakchi, habib.mehrez}@lip6.fr
Abstract: An Application Specific Inflexible FPGA (ASIF) is an FPGA with reduced flexibility that can implement a set of application circuits which will operate at mutually exclusive times. These circuits are efficiently placed and routed on an FPGA to minimize the total routing switches required by the architecture. Later all the unused routing switches are removed from the FPGA to generate an ASIF. An ASIF for a set of 17 MCNC benchmark circuits is found to be 5.43 times (81.5%) smaller than a mesh-based unidirectional FPGA required to map any of these circuits.

I. INTRODUCTION

Low volume production of FPGA-based solutions is quite effective and economical because they are easy to design and program in the shortest possible time. The generic reconfigurable resources in an FPGA can be programmed to execute a vast variety of applications. This flexibility of an FPGA enables implementation of different circuits on it at mutually exclusive times, or even allows dynamic reconfiguration for varying requirements. However, all these advantages come at a huge cost. FPGAs are much larger, slower, and more power consuming than their counterpart ASICs (Application Specific Integrated Circuits) [1]. Consequently, FPGAs are unsuitable for applications requiring high volume production, high performance or low power consumption. However, when an FPGA-based product is in the final phase of its development cycle, and if the set of circuits to be mapped on the FPGA is known, the FPGA can be reduced for the given set of circuits. This reduced FPGA is called an Application Specific Inflexible FPGA (ASIF). An ASIF can give considerable area and performance gains to an FPGA-based product by reducing it to a much smaller multiplexed circuit that can implement a set of circuits exclusively.

An ASIC has speed, power and area advantages over an FPGA, but at the expense of higher non-recurring engineering (NRE) cost and higher time-to-market. However, the NRE cost and the time-to-market are reduced with the introduction of a new breed of ASICs called Structured-ASICs. A Structured-ASIC contains an array of optimized elements which implement a desired functionality by making changes to the few upper mask layers. Structured-ASICs are explored or manufactured by several companies [2] [3] [4] [5]. FPGA vendors have also started making provision for migrating FPGA applications to Structured-ASIC. Altera provides HardCopy [6] and Xilinx provides EasyPath [7]. The main idea is to perform prototyping, testing and even initial shipment of a design on an FPGA; later it can be migrated to Structured-ASIC for high volume production [8]. In this regard, Altera has proposed a clean migration methodology [9] that ensures an equivalence verification between FPGA and Structured-ASIC. However, migration of an FPGA application to Structured-ASIC supports only a single circuit. HardCopy or EasyPath totally lose the quality of an FPGA to use the same hardware for executing multiple applications at different times. An ASIF retains this property and can be a possible future extension for HardCopy and EasyPath.

The concept of an ASIF is similar to configurable ASIC cores (cASIC) [10]; cASIC is a reconfigurable device that can implement a limited set of circuits which will operate at mutually exclusive times. However, cASIC and ASIF have several major differences. cASIC supports full-word logic blocks, such as 16-bit wide adders, multipliers, RAMs and registers. It is intended to execute data-path circuits to accelerate a domain-specific System-on-a-Chip. However, in this work an ASIF supports only fine-grain logic blocks. The routing networks used by cASIC and ASIF are totally different. cASIC uses 1-D segmented bus interconnect, whereas ASIF uses 2-D mesh interconnect. Another major difference between cASIC and ASIF is the approach with which their routing networks are optimized. cASIC is generated using a constructive bottom-up “insertion” approach with reconfigurability inserted through the addition of multiplexers and demultiplexers. On the contrary, an ASIF is generated using an iterative top-down “removal” technique in which different circuits are mapped on an FPGA; flexibility is removed from it to support only the given set of circuits. The benefit of a “removal” approach over an “insertion” approach is that any existing FPGA architecture can be reduced to an ASIF using this “removal” technique.

This paper considers only the area optimization of an ASIF. The delay and power parameters are not explored in this work. Section II describes a reference FPGA architecture that is reduced to an ASIF. Section III discusses different ASIF generation techniques. Section IV describes an area model that will be used to compare an FPGA with an ASIF. Section V presents various experimental results using a set of MCNC benchmark circuits. The effect of Look-Up Table (LUT) size on ASIF area is also presented. Finally, Section VI presents the conclusion and future work.

II. REFERENCE FPGA ARCHITECTURE

This section describes a reference FPGA architecture used in this work. Different netlists are mapped on this architecture that is later reduced to an ASIF. The reference FPGA is a
978-1-4244-4377-2/09/$25.00 © 2009 IEEE 112 FPT 2009

mesh-based VPR-style (Versatile Place & Route) [11] architecture. It contains Configurable Logic Blocks (CLBs) arranged on a two-dimensional grid. Each CLB contains one Look-Up-Table with 4 inputs and 1 output (LUT-4), and one Flip-Flop (FF). A CLB is surrounded by a single-driver unidirectional routing network [12]. The FPGA is divided into “tiles” that are repeated horizontally and vertically to form a complete FPGA. A single FPGA tile, surrounded by its neighbouring tiles, is shown in Figure 1. The 4 inputs of a CLB are connected to the 4 adjacent routing channels. The output pin of a CLB connects with the routing channels on its top and right through the diagonal connections of the switch box (highlighted in the bottom-left switch box shown in Figure 1). A unidirectional disjoint switch box interconnects uniform-length routing tracks (or wires). The connectivity of the routing channel with the input and output pins of a CLB, abbreviated as Fc(in) and Fc(out), is set to the maximum value of 1.0. The channel width is varied according to the netlist requirement but remains a multiple of 2 [12]. This work considers only a homogeneous FPGA architecture, i.e. CLBs of the same type. Heterogeneous FPGA architectures with CLBs of different types or hard-blocks like multipliers, adders and RAMs are not considered here and are left for future work.

Fig. 1. Reference FPGA Architecture
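
The parameters of this reference architecture can be collected in a small record. The Python sketch below is only illustrative: the class, field and function names are assumptions and are not part of the authors' tool flow. The helper `smallest_square_grid` additionally assumes that the FPGA is sized as the smallest square grid that holds all CLBs of a netlist, which is consistent with the sizes listed in Table I (for example, 4575 CLBs for pdc on a 68x68 grid).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceFpga:
    """Parameters of the mesh-based reference FPGA (illustrative names)."""
    cols: int
    rows: int
    channel_width: int
    lut_inputs: int = 4      # each CLB holds one LUT-4 ...
    ffs_per_clb: int = 1     # ... and one flip-flop
    fc_in: float = 1.0       # Fc(in), set to its maximum
    fc_out: float = 1.0      # Fc(out), set to its maximum

    def __post_init__(self):
        # The text states that the channel width remains a multiple of 2.
        if self.channel_width <= 0 or self.channel_width % 2 != 0:
            raise ValueError("channel width must be a positive multiple of 2")

    @property
    def clb_count(self) -> int:
        """Number of CLB tiles on the grid."""
        return self.cols * self.rows


def smallest_square_grid(num_clbs: int) -> int:
    """Side of the smallest square grid with at least num_clbs tiles.

    Assumed sizing rule; it reproduces the FPGA sizes given in Table I.
    """
    side = math.isqrt(num_clbs)
    return side if side * side == num_clbs else side + 1


side = smallest_square_grid(4575)          # CLB count of netlist "pdc"
fpga = ReferenceFpga(cols=side, rows=side, channel_width=18)
print(side, fpga.clb_count)
```

Under these assumptions the pdc netlist lands on a 68x68 grid with 4624 CLB positions, and an odd channel width is rejected when the record is constructed.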
A circuit or netlist to be mapped on an FPGA is initially transformed in the form of CLBs (LUTs and/or FF). A netlist contains CLBs and IO instances that are interconnected through signals called “nets”. A software module named “placer” uses the simulated annealing algorithm [11] [13] to place the CLBs/IO instances on the CLB/IO blocks of the FPGA. The placer tries to achieve a placement having a minimum sum of half-perimeters of the bounding boxes of all the nets. The bounding box (BBX) of a signal or net is the minimum rectangle that contains the driver instance and all the receiving instances of a net. The placer moves an instance randomly from one block position to another. After each move operation the BBX cost is updated incrementally. Depending on the cost value and the annealing temperature the move operation is accepted or rejected. After placement, a software module named “router” routes the netlist on the architecture. The router uses the pathfinder algorithm [11] to route the nets of a netlist on the FPGA routing resources.

III. ASIF GENERATION

This section discusses various techniques for the generation of an ASIF. An ASIF is an FPGA with reduced flexibility that can implement a set of circuits exclusively. It is generated by removing unused configurable routing resources of an FPGA. First, a minimum FPGA architecture is defined to map any of the netlists belonging to the set of netlists. Next, each netlist is placed and routed on the FPGA. Finally all the routing switches that are not used by any of the given set of netlists are removed from the FPGA. When switches are removed from the FPGA, long wires are created. Buffers are added to cut these long wires. It is worth noticing here that once an ASIF is generated, the placement and routing of the given set of netlists is fixed. Even the slightest change in the placement or routing of a netlist is not guaranteed to be feasible, though it is not impossible either. The ASIF generation methods presented here discuss different techniques in which netlists can be mapped on an FPGA so that the final ASIF occupies the least possible area.

An ASIF can be generated for any set of circuits that we happen to need for a system; the circuits need not necessarily belong to a similar application domain. In this work, different ASIF generation methodologies are explained with the help of 2 example circuits shown in Figure 3. Later in Section V, the same techniques will be applied on a larger set of netlists.

Initially a target FPGA architecture is generated with the maximum number of CLBs required by these two netlists. If the maximum number of CLBs is not a perfect square number, the FPGA size is selected to be the nearest rectangle. Both the netlists are then routed on the target architecture with minimum channel width. Figure 2(a) and 2(b) show the 2 netlists placed and routed individually on the target FPGA. Wires used by netlist-1 are continuous/blue, whereas wires used by netlist-2 are dotted/black. Grey wires are the unused wires in an FPGA. For simplification, Figure 2 does not show the inner connection details of the switch box and connection box.

A. ASIF-1 (No wire sharing)

After selection of the FPGA size, all netlists are placed separately on the FPGA. The CLB/IO instances of different netlists share the CLB/IO blocks on the architecture. But routing of all netlists is done without sharing routing wires among the netlists. Each netlist uses its own minimum routing channel network; thus the target FPGA architecture contains the maximum number of CLBs required by any netlist, whereas the channel width of the target FPGA is the sum of the channel widths
required by all netlists.

Figure 2(c) shows the FPGA architecture on which both example netlists are mapped with no wire sharing. It has a channel width of 4. Two routing channels are used by netlist-1 and the other two are used by netlist-2. Finally all unused switches in the switch box and connection box are removed to generate an ASIF.

Fig. 2. Example circuits mapped on FPGA: (a) Netlist-1 on FPGA, (b) Netlist-2 on FPGA, (c) ASIF-1 (No wire sharing), (d) ASIF-2 (Wire sharing)

Since no routing wires are shared between any of the netlists, ASIF-1 does not require any wire multiplexing in the switch box. Only the required switches of the connection box are retained. However, the total number of switches of the connection box and the total number of routing wires increase as the number of netlists increases. Consequently, the layout area of ASIF-1 for a large number of netlists might eventually be dominated by routing wires.

B. ASIF-2 (Wire sharing)

In this method, the netlists are placed and routed separately on the FPGA. Both the CLB/IO blocks on the architecture and the routing channel are shared by all netlists. So the target FPGA architecture contains the maximum number of CLBs and the maximum channel width required by any of the netlists. Figure 2(d) shows the FPGA architecture on which both the example netlists are mapped with wire sharing.

In this method the remaining switches in an ASIF belong to a switch box or to a connection box. The total number of wires required is relatively small when compared to the previous method, but the number of switches increases considerably.

Fig. 3. Example Circuits

C. ASIF-3 (Efficient wire sharing)

The main motivation behind this method is to combine the benefits of the previous two methods, i.e. fewer wires and fewer switches. In this method the netlists are placed separately on the FPGA. But routing is done efficiently in order to minimize the total number of used switches and routing wires. This is done by maximizing the shared switches required for routing all the netlists on the FPGA. The efficient wire sharing encourages different netlists to route their nets on an FPGA with maximum common routing paths. After all the netlists are efficiently routed on the FPGA, the unused switches in the architecture are removed to generate an ASIF.

The pathfinder routing algorithm is modified to support efficient wire sharing. Before we describe these changes in detail, a short introduction of the routing algorithm is presented here. An FPGA routing network is represented by a graph with nodes connecting each other through edges; each routing wire of the architecture is represented by a node, and a connection between two wires is represented by an edge. When a netlist is routed on the FPGA routing graph, each net (i.e. connection of a driver instance with its receivers) is routed using a congestion-driven “Shortest Path” algorithm. Once all nets in a netlist are routed, one routing iteration is said to be completed. At the end of an iteration, there can be conflicts between different nets sharing the same node; therefore the congestion parameters are updated and the iteration is repeated until the routing converges to a feasible solution (i.e. no conflicts are found) or routing fails (i.e. the “maximum” iteration count is reached). Multiple netlists can be routed on the FPGA by allowing nodes to be shared by multiple nets belonging to different netlists.

Figure 6 explains different cases through which the efficient wire sharing mechanism can be implemented in the pathfinder algorithm. The graphical representation in Figure 6(a) shows a case in which two nodes occupied by nets of 2 different netlists drive the same node. Figure 6(b) shows a case in which nodes occupied by nets of different netlists use different edges to drive different nodes. In Figure 6(c), both netlists share the same node and edge to drive the same node. Finally, Figure 6(d) shows a node shared by both netlists that targets different nodes. In order to reduce the total number of switches and total wire requirement, the physical representation in Figure 6 suggests
that case (a) must be avoided because it increases the number of switches (here a mux-2), whereas cases (b), (c) and (d) should be favoured. Favouring these cases means that the more routing resources exist in the FPGA architecture, the more likely it is that such cases can be exploited. For this reason, in order to create more routing resources, Section V performs experiments with varying channel widths.

The routing preferences shown in Figure 6 need to be integrated in the pathfinder algorithm. For this purpose the cost function of a node is modified in a way similar to the timing-driven cost function [11]. A particular routing is avoided or favoured by increasing or decreasing the cost of a node. If a net is to be routed from the current node to the next node, the cost of the next node can be calculated with the formulas shown in Figure 4. The cost of a node depends on its congestion cost. Here, the increase or decrease in cost is controlled by a constant “Factor”. The value of this factor ranges between 0 and 0.99. If an FPGA architecture has limited routing resources, a maximum value of the factor might not allow the routing algorithm to resolve all congestion problems. So the value of the factor is gradually decreased if the routing solution does not converge after a few routing iterations.

Fig. 4. Node Cost Formula

Multiple netlists can be routed on an FPGA architecture either sequentially or in parallel. Sequential routing of netlists in a particular order is called here “Netlist by Netlist” routing. Each routed netlist saves the information about which nodes and edges it has used. Later, the next netlist uses this information to perform efficient routing by giving preference to some nodes over others. Experiments are done with netlists sequenced in different orderings (i.e. netlists ordered in ascending or descending order according to their size, channel width, wire utilization and a few random orders). An ASIF generated with netlists routed sequentially in descending order of their channel widths or of the number of routing wires used gives minimum area results. However, the area difference between ASIFs (generated using different netlist orderings) decreases as the routing resources increase.

In order to get rid of the dependence on a particular order of netlists, the netlists are routed in parallel. Two parallel techniques are tried: “Iteration by Iteration” (routing iterations of different netlists are routed in a sequence) and “Net by Net” (nets of different netlists are routed in a sequence). But both techniques give much worse results than the best “Netlist by Netlist” ordering. This is because in parallel routing techniques, the routing of all the netlists remains incomplete simultaneously. In order to avoid congestion, the nets keep on changing their path in different iterations. A path favoured by netlist-2 because it is used by netlist-1 might not eventually be used by netlist-1, but still remains in use by netlist-2. Thus due to inaccurate routing information, both parallel methods end up taking more switches and wires than the best “Netlist by Netlist” results.

All experimental results shown in Section V use the maximum value of “Factor”. The maximum routing iteration count is set to 30, and the value of “Factor” decreases if routing does not converge within the first 15 iterations.

D. ASIF-4 (Efficient placement)

In all previous ASIF generation techniques, the simulated annealing algorithm with the bounding box (BBX) cost function is used to place each netlist individually. In other words, only intra-netlist placement optimization is performed. No inter-netlist placement optimization is done for the group of netlists for which an ASIF is required. An inter-netlist placement optimization can reduce the total number of switches required in an ASIF. This optimization can be understood with an example shown in Figure 7. Figure 7(a) shows two very simple netlists; both have the minimum possible BBX placement cost. The ASIF for these two netlists requires 4 multiplexers (mux-2). Figure 7(b) shows the same two netlists that are placed efficiently with the same BBX cost as for netlists in Figure 7(a); the ASIF for these netlists requires no switches at all.

Fig. 5. Placement Cost Formula

In order to perform efficient placement, a new cost function is proposed, named here the “Driver Count” (DC) cost function. This cost function calculates the sum of driver blocks targeting the receiver blocks of the architecture over all netlists. Efficient placement tries to optimize both intra-netlist and inter-netlist placement over a set of netlists. The aim is to minimize the BBX cost of each netlist and the total number of driver blocks (CLB/IO) of the architecture driving other blocks. In Figure 7(a) the ASIF has a “Driver Count” cost of 8, where each block has 2 different drivers. In Figure 7(b) the “Driver Count” cost is 4, with each block having only one driver. Efficient placement uses both cost functions in parallel (i.e. the bounding box (BBX) cost function and the Driver Count (DC) cost function). Since the BBX cost and the DC cost are not of the same magnitude, initially both costs are made comparable by multiplying one of them with a normalization factor. This factor is determined from the initial BBX and DC costs. Weighting coefficients are attributed to them and a new weighted cost is computed as shown in Figure 5. The simulated annealing algorithm later uses this new weighted cost. It should be noted that as the
weight given to the DC function is increased, the DC cost decreases but the BBX cost increases, and vice versa. With an increase in the BBX cost, more routing switches are required to route a netlist, which in turn means that more area is required. A compromise needs to be found to obtain a good solution.

Fig. 6. Efficient Wire Sharing
Fig. 7. Efficient Placement

Placement of multiple netlists is supported in the same way as used by cASIC [10]. With multiple netlist placements, each block of the architecture can allow mapping of multiple instances belonging to different netlists. The placer is also modified to support the “Driver Count” cost function. All netlists are simultaneously placed. The placer randomly chooses an instance from any input netlist, and changes its position. The differences in the BBX cost and the DC cost are updated incrementally. A new weighted cost is calculated in accordance with the given weights; the simulated annealing algorithm uses this cost to decide if the movement is accepted or rejected. After all netlists are well placed, efficient routing is performed. All unused switches are then removed to generate an ASIF.

IV. AREA MODEL

A generic area model is used to calculate the area taken by the FPGA and different ASIFs. The area model is based on the reference FPGA architecture shown in Figure 1. The area of SRAMs, multiplexers, buffers and flip-flops is taken from a symbolic standard cell library (SXLIB [14]) which works on unit Lambda (λ). When an ASIF is generated, all unused resources are removed. With the removal of switches, wires are connected with one another to form long wires. A buffer is added for every wire of length 8 (spanning 8 CLBs) and for every 8 wires driven by an output wire of a block. The area of an FPGA or ASIF is reported as the sum of the areas taken by the switch box, connection box, buffers and CLBs. The area model also reports the total number of routing wires used for routing all netlists. In the next section the term “Routing area” is used for the area taken by the switch box, connection box and buffers. The term “Logic area” is used for the area taken by CLBs. Since an ASIF reduces only the routing area whereas the logic area remains constant, the next section shows the effectiveness of a few techniques by comparing the “Routing area” only.

V. EXPERIMENTATION AND ANALYSIS

In this work, ASIF generation techniques are applied on a set of MCNC benchmarks shown in Table I in descending order of channel width requirement. Various experiments are done on the first 17 netlists. The last 3 netlists are not included in the experimentation because the FPGA sizes of these netlists are much larger and the channel widths much smaller than those of some other benchmarks in the list. Their inclusion increases the overall area of the target FPGA, thus giving an unnecessary area advantage to an ASIF. However, for the sake of record, the area results of an ASIF for all the 20 MCNC circuits are presented at the end of this section.

Figure 8(a) shows ASIF-4 routing area variation for 17 benchmarks. The X-axis shows variations in Bounding Box/Driver Count weighting coefficients. The Y-axis shows the routing area. The results are shown for different channel widths (between 18 and 192). It can be seen that the area decreases as the channel width increases. The best results are found with a BBX/DC ratio of 80/20. This ratio is used for ASIF-4 in all other experiments.

Figure 8(b) shows routing area for different ASIFs with varying channel widths. It can be seen that the routing area difference between ASIF-2 and all the other ASIFs is very large. But on the other hand, Figure 8(c) shows that the number of wires used by ASIF-2 is very low as compared to ASIF-1. For ASIF-3 and ASIF-4, the routing area decreases and the number of wires increases as the channel width increases. This effect is mainly due to the preference for the routing cases shown in Figure 6. The routing wires can play a pivotal role in the area of an ASIF if they dominate the logic area. This can happen if repeated tiles of an ASIF are designed in full-custom, or the layout is generated in a smaller process technology. In such a case, an ASIF-4 with smaller channel widths can give a compromised
solution.

TABLE I
NETLIST TABLE

Index  Netlist name  Number of CLBs  FPGA size  Min. channel width*
1.     pdc           4575            68x68      18
2.     ex5p          1064            33x33      16
3.     spla          3690            61x61      14
4.     apex4         1262            36x36      14
5.     ex1010        4598            68x68      12
6.     frisc         3556            60x60      12
7.     apex2         1878            44x44      12
8.     seq           1750            42x42      12
9.     misex3        1397            38x38      12
10.    elliptic      3604            61x61      10
11.    alu4          1522            40x40      10
12.    des           1591            40x40      8
13.    s298          1931            44x44      8
14.    bigkey        1707            42x42      8
15.    diffeq        1497            39x39      8
16.    dsip          1370            38x38      6
17.    tseng         1047            33x33      6
18.    clma          8383            92x92      12
19.    s38584.1      6447            81x81      10
20.    s38417        6406            81x81      8

Figure 8(d) shows the percentage area distribution in FPGA and ASIF. In an FPGA only 9.3% of the area is taken by logic area, whereas the remaining area is taken by the routing area. In ASIFs, the routing area is decreased to an extent that the CLB area occupies a very important percentage of the total area; in ASIF-4 the CLBs take 41% of the area for a channel width of 192. These results suggest that significant overall gains might be achieved if the CLB is also optimized, or at least if the size of the LUT is varied.

Figure 8(e) compares FPGA and ASIF with a changing number of netlists (the order in Table I is respected). The X-axis presents the number of netlists, where 1 means only “pdc” is used, 2 means “pdc” and “ex5p” are used, and so on. The Y-axis presents the number of times an ASIF is smaller than the FPGA. Here ASIFs with maximum channel width are compared with a 68x68 LUT-4 based reference FPGA having a channel width of 18. If the area occupied by routing wires is not dominant, a LUT-4 based ASIF-4 is 4.39 times or 77.2% smaller than a LUT-4 based 68x68 FPGA for 17 MCNC benchmark circuits.
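
The two ways of stating the gain, “4.39 times” and “77.2% smaller”, are related by simple arithmetic. The short Python sketch below makes that relation explicit; the function names are assumptions introduced here for illustration. The second helper is our own consistency check and not a computation taken from the paper: it assumes, as stated in Section IV, that the logic area is the same in the FPGA and in the ASIF, so that the ratio of total areas equals the ratio of the logic-area shares reported for Figure 8(d).

```python
def percent_smaller(times_smaller: float) -> float:
    """Convert "N times smaller" into a percentage reduction in area."""
    if times_smaller <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 - 1.0 / times_smaller) * 100.0


def implied_area_ratio(logic_share_fpga: float, logic_share_asif: float) -> float:
    """FPGA/ASIF total-area ratio implied by the logic-area shares.

    Assumes the logic (CLB) area L is identical in both devices, so
    total_fpga = L / logic_share_fpga and total_asif = L / logic_share_asif.
    """
    return logic_share_asif / logic_share_fpga


print(round(percent_smaller(4.39), 1))             # 4.39 times smaller, as a percentage
print(round(implied_area_ratio(0.093, 0.41), 2))   # from the 9.3% and 41% logic shares
```

The first value reproduces the reported 77.2%. The second comes out near 4.4, close to the reported factor of 4.39; the small difference is what one would expect from the rounding of the published percentages.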
* Table I: minimum routing channel width on an FPGA sized 68x68 (except the last 3 netlists).

Figure 8(f) compares the routing area of different ASIFs (for maximum channel widths) normalized to ASIF-1. It should be noted that ASIF-3 is slightly better than ASIF-1. This is because of the efficient wire sharing, which facilitates the use of common switches and wires for several instances of different netlists that happen to be placed on the same blocks of the architecture and drive the same block. However, ASIF-1 uses separate switches and wires to connect them. Next, in ASIF-4, efficient placement and routing together give a gain of up to 12% for 17 netlists. This gain of 4% for ASIF-3 and 12% for ASIF-4 over ASIF-1 is one of the minor benefits. The major benefit of ASIF-4 is the possibility of a compromise area/wire-count solution that lies between ASIF-1 and ASIF-2. For example, the ASIF-4 area at channel width 32 in Figure 8(b) and the ASIF-4 wire count at channel width 32 in Figure 8(c) give a very good compromise over ASIF-1 at channel width of 258; i.e. with only a 6% increase in the routing area compared to ASIF-1, the number of wires is reduced by 42%.

In order to find the effect of LUT size on ASIF, experiments are repeated with LUT-2, LUT-3, LUT-5 and LUT-6. It is found that the best BBX/DC weighting coefficient ratio for the different LUT sizes is also 80/20. Figure 9(a) compares the total area of ASIF-4 with several LUT sizes for 17 netlists. Figure 9(b) shows the wire count comparison for ASIF-4 with varying LUT sizes for 17 netlists. It can be seen that a LUT-2 based ASIF requires less area compared to ASIFs with other LUT sizes. However, the total number of used wires increases linearly. Figure 9(c) presents the FPGA and ASIF-4 comparison with various LUT sizes and varying number of netlists with maximum channel width for ASIF-4. If the area occupied by wires is not dominant, then a LUT-2 based ASIF-4 is found to be 5.43 times or 81.5% smaller than a LUT-4 based 68x68 FPGA having a channel width of 18, for 17 MCNC benchmark circuits. And finally Figure 9(d) shows the overall area distribution of FPGAs and ASIF-4 with varying LUT sizes. The percentage area occupancy of CLBs in ASIFs has decreased from 42% (in case of LUT-4) to 32% (in case of LUT-2). For the sake of record, the same experiments are repeated for 20 MCNC benchmark circuits. It has been found that a LUT-2 based ASIF-4 for 20 MCNC benchmark circuits is 5.53 times smaller than a LUT-4 based 92x92 FPGA with a channel width of 16. It is to be noted here that the largest netlist “pdc” is routable with a channel width of 16 when the FPGA size increases from 68x68 to 92x92.

VI. CONCLUSION AND FUTURE WORK

This paper presented Application Specific Inflexible FPGAs (ASIFs) that can implement a set of circuits which will operate at mutually exclusive times. It has been shown that a LUT-2 based ASIF is 5.43 times smaller than a LUT-4 based FPGA for 17 MCNC netlists. The main idea is to design and test a set of circuits on an FPGA, and later reduce the FPGA to an ASIF for high volume production. The layout of an ASIF can be generated using a tile-based method as used for the layout generation of an FPGA. However, due to the irregular routing network of an ASIF, there can be a large number of tiles with different sizes and characteristics. Consequently, the width of a column of tiles can be reduced only to the maximum width of any tile in that column, and the height of a row of tiles can be reduced only to the maximum height of any tile in that row. In the future we intend to generate the layout of an ASIF and to look towards minimizing the total area losses incurred due to the irregular routing network. Besides, the hardware description of an ASIF can be integrated as an embedded module into larger designs. Such an ASIF can be
called an embedded ASIF (eASIF).

Fig. 8. Experimental Results
(a) ASIF-4 for 17 netlists with varying channel widths and BBX/DC weightage coefficients
(b) Routing area comparison for different ASIFs for 17 netlists with varying channel widths
(c) Wire count comparison for different ASIFs for 17 netlists with varying channel widths
(d) Percentage area distribution for FPGAs and ASIFs
(e) FPGA vs. ASIFs with varying number of netlists
(f) ASIF comparison (normalized to ASIF-1) with varying number of netlists

We also intend to try mapping new circuits on an ASIF that is not optimized to support them. Since an ASIF has limited routing flexibility when compared to an FPGA, mapping of new circuits on an ASIF is not guaranteed. For a particular placement, there might not be sufficient routing paths to route all the signals of a new circuit on an ASIF. Even if a few routing solutions exist, the currently used heuristic-based routing algorithms are not guaranteed to find a routing solution in a solution space that contains only one or a few routing solutions. So we would like to explore different mechanisms to find out with certitude whether a new circuit can be mapped on an ASIF or not, and with what amount of computing time.

This work considered only the area optimization of ASIF. In the future we intend to work on timing analysis as well. We would also like to compare the area of an ASIF for a given set of circuits with the area of a combined ASIC for the same set of circuits. An ASIF containing hard-blocks like multipliers and adders is also a future direction of our work. The inclusion of these hard-blocks will help reduce the area
gap between an ASIF and the sum of ASICs.

Fig. 9. Effect of LUT size on ASIF
(a) Total area comparison for ASIF-4 with varying LUT sizes and channel widths for 17 netlists
(b) Wire count comparison for ASIF-4 with varying LUT sizes and channel widths for 17 netlists
(c) FPGA vs. ASIF-4 with varying LUT sizes and number of netlists
(d) Percentage area distribution for FPGAs and ASIF-4 with varying LUT sizes

REFERENCES

[1] I. Kuon and J. Rose, “Measuring the Gap Between FPGAs and ASICs,” FPGA’06, pp. 21–30, February 2006.
[2] K. Wu and Y. Tsai, “Structured ASIC, Evolution or Revolution,” Proc. ISPD, pp. 103–106, April 2004.
[3] T. Okamoto, T. Kimoto, and N. Maeda, “Design Methodology and Tools for NEC Electronics Structured ASIC,” Proc. ISPD, pp. 90–96, April 2004.
[4] D. Sherlekar, “Design considerations for Regular Fabrics,” Proc. ISPD, pp. 97–102, April 2004.
[5] eASIC, “www.easic.com.”
[6] Altera, “www.altera.com.”
[7] Xilinx, “www.xilinx.com.”
[8] M. Hutton, R. Yuan, J. Schleicher, G. Baeckler, S. Cheung, K. Chua, and H. Phoon, “A Methodology for FPGA to Structured-ASIC Synthesis and Verification,” DATE, vol. 2, pp. 64–69, March 2006.
[9] J. Pistorius, M. Hutton, J. Schleicher, M. Iotov, E. Julias, and K. Tharmalingam, “Equivalence Verification of FPGA and Structured ASIC Implementations,” FPL’07, pp. 423–428, August 2007.
[10] K. Compton and S. Hauck, “Automatic Design of Area-Efficient Configurable ASIC Cores,” IEEE Transactions on Computers, vol. 56, no. 5, pp. 662–672, May 2007.
[11] V. Betz, A. Marquardt, and J. Rose, Architecture and CAD for Deep-Submicron FPGAs, January 1999.
[12] G. Lemieux, E. Lee, M. Tom, and A. Yu, “Directional and Single-Driver Wires in FPGA Interconnect,” ICFPT, 2004.
[13] Kirkpatrick, Gelatt, and Vecchi, “Optimization by Simulated Annealing,” Science, vol. 220, no. 4598, pp. 671–680, May 1983.
[14] A. Greiner and F. Pecheux, “Alliance: A complete set of CAD tools for teaching VLSI design,” 3rd Eurochip Workshop, pp. 230–237, September 1992.