USING FPGAs
Sanjay Attri, Electronics & Commn Engg. Deptt., Technical Teachers' Training Institute, Sector-26,
Chandigarh, India. Tel: +91-0172-794349, E-mail: sattri@yahoo.com
B.S. Sohi, Electronics & Commn Engg. Deptt., Technical Teachers' Training Institute, Sector-26,
Chandigarh, India. Tel: +91-0172-791349, E-mail: bssohi@yahoo.com
Y.C. Chopra, Electronics & Commn Engg. Deptt., BBSB Engineering College, Fatehgarh Sahib,
Punjab, India. Tel: +91-0172-791349, E-mail: ycchopra@yahoo.com
• Re-configurable systems can be realized using the re-programmability of the FPGAs.
• DSP and ASIC based systems can be fast prototyped, different design options can be emulated, and long simulations can be avoided.
• Off-chip interconnections and external components like FIFOs or RAMs can be integrated in the embedded applications.
• Hard-wired DSP cores can be simplified and optimized for a given application.

The FPGAs have the ability to implement a DSP function using one of several techniques, which depend on the performance required. These techniques can be used to optimize the implementation of many different types of data processing or MAC-based techniques. In the areas where the speed of conventional bit parallel circuits is not needed, techniques based on distributed arithmetic (DA) can be used [4]. Parallel Distributed Arithmetic (PDA) techniques are used to achieve the fastest sample rates, while lower rates can be sustained with Serial Distributed Arithmetic (SDA) techniques that use fewer FPGA resources (i.e. configurable logic blocks (CLBs)). In addition, a serial stream of data matches better with the structure of an FPGA. Thus, in an actual implementation, the speed of a full serial circuit is not N times lower than the equivalent N-bit parallel approach [5]. The primary design concern is the performance or the sample rate of the filter, and the design must work at the desired sample rate without over-consuming the resources. The designs thus obtained can be verified and built into a core for use with various DSP systems.

3. FIR filters

In this paper, the design of an FIR filter, which is a fit case for being built into a core, is taken up. A filter is used to remove unwanted portions from a stream of data. FIR filters are common components in many DSP systems and are used to perform signal pre-conditioning, anti-aliasing, band selection, interpolation, low-pass filtering etc. The advantages of the FIR filter include guaranteed stability for all realizable filter coefficient values, absence of overflow oscillations and the ability to implement filters with linear phase response. There are several basic structures of FIR filters such as canonical, pipelined and inverted form. In FIR filter applications, arithmetic elements for operations such as addition, multiplication and delay are commonly required [6], [7]. These arithmetic circuits can be designed and implemented using common sub-circuit building blocks.

3.1 16 tap, 8 bit FIR filter design

The response of a K tap FIR filter can be expressed as the following sum of products:

$$y(n) = \sum_{k=1}^{K} A_k \, x_k(n)$$

where y(n) is the response at time n, x_k(n) is the k-th prior input data at time n and the A_k are the coefficients of the filter. Each term, when expanded, involves only one bit of the input data with all the bits of the coefficients. This allows constructing a look-up table that can be addressed by the same bit of all input variables. This look-up table holds all the additive combinations. Figure 1 gives the data flow diagram for a 16-tap, 8-bit FIR filter that is based on distributed arithmetic [8]. The filter consists of the following seven major components:

• a parallel to serial converter,
• a RAM-based shift register,
• a serial adder,
• a Look-Up Table (LUT),
• a complementing register,
• an adder and
• a scaling accumulator.

Figure 1. Data Flow diagram of a 16 Tap, 8 bit FIR Filter

An 8-bit data sample is loaded into the parallel to serial converter (PSC) at the sample rate. The PSC generates a serial output stream that is supplied to the RAM-based shift registers at the bit clock rate. The bit clock rate is determined by

bit clock rate = (n + 1) × sample rate

where (n+1) represents the number of data bits per sample plus an overflow bit.

...

... adders are presented to the lookup tables. Since the coefficients are symmetric, the sums generated by the adders can be multiplied by the same coefficients. Since all possible partial products are pre-computed, the outputs of the serial adders are used to address the lookup tables to generate the appropriate multiplication results. The outputs of the registered lookup tables are summed, except for the sign result that is complemented before being summed. The registered summation is then fed into a scaling accumulator. Figure 2 suggests that the number of MACs for a 16 Tap, 8 Bit FIR Filter should be four. However, the number of MACs reduces by a factor of two if we consider the filter to be symmetrical [9].

[Figure: block diagram with S-REG blocks feeding a LOOK-UP TABLE; caption not recoverable]
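The serial distributed arithmetic data path described in this section can be illustrated with a short behavioural model in Python. This is only a sketch under stated assumptions, not the authors' VHDL design: the function names (build_lut, sda_dot, symmetric_fir_output, bit_clock_rate) are hypothetical, the taps are grouped into 4-input look-up tables, the symmetric pre-addition is done on whole words rather than bit-serially, and each partial sum is weighted with a left shift instead of the right-shifting scaling accumulator used in hardware.

```python
def build_lut(coeffs):
    """Table of all additive combinations of coeffs.

    Address bit j selects coeffs[j], so the table has 2**len(coeffs) entries.
    """
    return [
        sum(c for j, c in enumerate(coeffs) if (addr >> j) & 1)
        for addr in range(1 << len(coeffs))
    ]


def sda_dot(samples, luts, group, n_bits):
    """Bit-serial sum of products (assumed model of the SDA data path).

    samples: two's complement integers that fit in n_bits.
    luts:    one table per group of `group` samples, built by build_lut.
    """
    acc = 0
    for bit in range(n_bits):              # one pass per bit clock, LSB first
        partial = 0
        for g, lut in enumerate(luts):
            addr = 0
            for j in range(group):
                # The same bit of every sample in the group forms the address.
                addr |= ((samples[g * group + j] >> bit) & 1) << j
            partial += lut[addr]
        if bit == n_bits - 1:
            partial = -partial             # sign bit: result is complemented
        acc += partial << bit              # scaling by the bit weight
    return acc


def symmetric_fir_output(window, half_coeffs, data_bits=8, group=4):
    """One output of a symmetric FIR filter over the latest len(window) samples.

    Samples that share a coefficient are pre-added, which halves the number
    of look-up tables. Each sum needs data_bits + 1 bits (the overflow bit).
    """
    taps = len(window)
    sums = [window[k] + window[taps - 1 - k] for k in range(taps // 2)]
    luts = [
        build_lut(half_coeffs[i:i + group])
        for i in range(0, len(half_coeffs), group)
    ]
    return sda_dot(sums, luts, group, data_bits + 1)


def bit_clock_rate(sample_rate, data_bits):
    """bit clock rate = (n + 1) x sample rate, as stated in the text."""
    return (data_bits + 1) * sample_rate
```

For a 16-tap symmetric filter with 8-bit samples, this model uses two 16-entry tables and nine passes per output sample, which agrees with the (n+1) bit clocks per sample given above.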
... optimization and it gave an estimated clock speed of 33.48 MHz. These results show the trade-off between silicon area and speed of the implementation. Thus a filter design implemented in an FPGA with SDA gives a significant amount of performance in a modest number of CLBs, i.e. 68. SDA uses the smallest number of CLBs while processing all data samples (TAPS) in parallel.

Parallel Distributed Arithmetic

Parallel Distributed Arithmetic (PDA) is used to increase the overall ...

... that one stores the even-bits and the other stores the odd-bits. The 2-bit parallel data samples require twice the number of LUTs. There is also the addition of a 1-bit scaling adder, required to add the two partial sums, which result from each of the two parallel sample bits. The scaling accumulator's input bus is expanded to accommodate the larger partial sum and the final scaling accumulator is changed from a 1- to a 2-bit shift for scaling. A four product MAC with a digit-size of two is shown in Figure 3.

[Figure 3: four product MAC with a digit-size of two; diagram not recoverable]
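The 2-bit PDA arrangement described above (even and odd bits addressing separate LUTs, a 1-bit scaling adder, and a 2-bit shift per clock) can be sketched as a behavioural model in Python. The names build_lut, lut_address and pda2_dot are hypothetical, the model covers a single four product MAC, and it reuses one table for both bit streams where the hardware would hold two identical copies. It also assumes that an odd word length is sign-extended by one bit so that the bits pair up, which the text does not state.

```python
def build_lut(coeffs):
    """All additive combinations of coeffs; address bit j selects coeffs[j]."""
    return [
        sum(c for j, c in enumerate(coeffs) if (addr >> j) & 1)
        for addr in range(1 << len(coeffs))
    ]


def lut_address(samples, bit):
    """Form a LUT address from the same bit position of every sample."""
    addr = 0
    for j, x in enumerate(samples):
        addr |= ((x >> bit) & 1) << j
    return addr


def pda2_dot(samples, coeffs, n_bits):
    """Sum of products processing two bits of every sample per clock.

    samples: two's complement integers that fit in n_bits.
    Returns (result, clock_cycles).
    """
    if n_bits % 2:
        n_bits += 1                      # assumed: sign-extend to an even width
    lut = build_lut(coeffs)              # hardware would keep two copies
    acc = 0
    cycles = n_bits // 2
    for cycle in range(cycles):
        even_bit = 2 * cycle
        odd_bit = even_bit + 1
        even_sum = lut[lut_address(samples, even_bit)]
        odd_sum = lut[lut_address(samples, odd_bit)]
        if odd_bit == n_bits - 1:
            odd_sum = -odd_sum           # sign bit carries a negative weight
        digit = even_sum + (odd_sum << 1)   # 1-bit scaling adder
        acc += digit << (2 * cycle)         # 2-bit shift per clock
    return acc, cycles
```

With 8-bit samples the model finishes in four clock cycles instead of the eight that a one-bit-per-clock data path would need for the same word length, at the cost of a second look-up per cycle.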
... same function with SDA. 2-Bit PDA uses more CLBs than SDA, while still processing all data samples (TAPS) in parallel, at twice the SDA data sample rate. The number of bits being processed during each clock cycle can be increased until a PDA implementation with digit size n is reached, for n-bit data samples. When the design is an n-Bit PDA, the sample data rate is at a maximum.

With PDA, each additional parallel bit requires an additional level of scaling (by powers of 2) and summation for each partial product pair of bits. The LUTs for SDA and PDA can always be the same for any given 4-MAC block, regardless of the number of bits in the sample data. This is true for PDA only if common bit-weighted sample inputs are used to address the LUT.

It may be necessary to tune a filter in a system, or even have multiple filter settings. Here the SRAM technology of the XC4000E can be exploited by reconfiguring the parameterized part. The changes to the filter lie in the coefficients, with the actual structure of the design remaining unchanged. These coefficients are stored as partial products. If it is desired to change the filter characteristics, this can be achieved by simply altering the VHDL file that contains the coefficients for the desired filter.

Conclusion

A study of SDA and PDA FIR filters with programmable coefficients is presented. The design methodology using each of these structures is detailed and finally the results of their implementation in Xilinx XC4000E FPGA are given. The results of 2-bit PDA are more efficient in the area-time product parameter. The throughput achieved lets the filters be used in applications where the sample rate range goes up to 6 MHz using the bit-serial approach optimized for speed, and more than 50 MHz with the bit-parallel one. All filters have been automatically implemented using Synopsys Workview Office tools. Further, hand optimisation is known to yield still better results in most of these cases.

In order to tune a filter in a system, or even have multiple filter settings, the SRAM technology of the XC4000E can be exploited by reconfiguring the parameterized part.

References

1. Tiong Jiu Ding, John V. McCanny, Fellow, IEEE, and Yi Hu, "Rapid Design of Application Specific FFT cores", IEEE Signal Processing Transactions, Vol. 47, No. 5, pp. 1371-1390, May 1999.
2. Altera Corporation, USA, Conference paper, "Improving Fixed-point DSP Processor System Performance with PLDs as a DSP Processor".
3. Xilinx Inc., "The Programmable Logic Data Book", 1999.
4. S.A. White, "Applications of distributed arithmetic to digital signal processing", IEEE ASSP Magazine, Vol. 6, No. 3, pp. 4-19, July 1989.
5. Atmel Corporation, USA, Application Note, "FPGA-Based Signal Processing Using Bit Serial Digital Signal Processing", September 1999.
6. Actel Corporation, USA, Application note, "Designing FIR Filters with Actel FPGAs".
7. Mintzer, L., "FIR Filters with Xilinx FPGA", FPGA92 ACM/SIGDA Workshop on FPGAs, pp. 129-134.
8. Gregory Ray Goslin, Program Manager, Xilinx Corporation, Application notes, "Using Xilinx FPGAs to design Custom Digital Signal Processing", Nov. 2000.
9. Lucent Technologies, Application note, "Parameterized FIR Filters In ORCA Field Programmable Gate Arrays", September 1996.
10. User's Manual, Work View Office Software tool to design customized digital circuits on FPGAs.
11. Javier Valls, Marcos M. Peiro, Trini Sansaloni, Eduardo Boemo, "A Study About FPGA-Based Digital Filters", Proc. 1998 IEEE SIPS, pp. 191-201, Boston, Oct. 1998.
12. Jean-Michel Raczinski, Stephane Sladek, Luc Chevalier, "Filter Implementation on SYNTHUP", Proceedings of the 2nd COST G-6 Workshop on Digital Audio Effects (DAFx99), NTNU, Trondheim, December 9-11, 1999.