Lenart
[Figure: pipelined FFT architecture. Butterfly (BF) units in Stage 1 to Stage 4, memory, and control and twiddle factor ROM blocks; signal labels: di, addr, Hologram.]
Data scaling in pipelined architectures

The problem with a fixed point FFT is to maintain the accuracy and preserve the dynamic range at the same time. One way to achieve a high signal-to-noise ratio is to increase the internal word length.

[Figure: (a) and (b), word lengths along the data path: 10 input bits, 12 output bits, truncation (Trunc) and delay units, multiplication by W(n); labels "# output bits from the butterfly" and "# exponent bits".]

An alternative uses exponents, or scaling factors, for the internal representation to improve the SNR. The exponents are usually shared between the real and imaginary part of a complex value, or even shared among a set of complex values, unlike normal floating point representation.

Data scaling - Convergent block floating point

The basic idea is that the output from a radix-4 stage is a set of 4 independent groups that can use different scale factors. This will converge in the pipeline towards one exponent for each output sample from a pipelined FFT.

Existing approach: Large intermediate buffer

[Figure: CBFP pipeline with Stage 1 to Stage 6, a CBFP unit and a delay per stage, a final (radix-2) stage, and a final decoder.]
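As an illustration of sharing one exponent among the real and imaginary parts of a set of complex values, the following minimal Python sketch may help. It is not taken from the poster: the function names `bfp_encode` and `bfp_decode`, the 10-bit mantissa, the limit of 15 on the exponent, and the assumption that all inputs have magnitude below 1 are chosen here for illustration only.

```python
import math


def bfp_encode(block, mantissa_bits=10, max_exponent=15):
    """Encode complex values (|re|, |im| < 1 assumed) with ONE shared exponent.

    Returns (mantissas, exponent). Each mantissa is a pair of signed integers
    and value ~= mantissa / 2**(mantissa_bits - 1 + exponent).
    """
    peak = max(max(abs(z.real), abs(z.imag)) for z in block)
    if peak == 0.0:
        exponent = 0
    else:
        _, k = math.frexp(peak)  # peak = f * 2**k with 0.5 <= f < 1
        # Shift left by -k so the largest component fills the word.
        exponent = min(max(-k, 0), max_exponent)

    scale = 2 ** (mantissa_bits - 1 + exponent)
    lo = -(2 ** (mantissa_bits - 1))
    hi = 2 ** (mantissa_bits - 1) - 1

    def quantize(x):
        # Round to the nearest integer and saturate to the signed word.
        return min(max(round(x * scale), lo), hi)

    mantissas = [(quantize(z.real), quantize(z.imag)) for z in block]
    return mantissas, exponent


def bfp_decode(mantissas, exponent, mantissa_bits=10):
    """Inverse of bfp_encode, up to the quantization error."""
    scale = 2 ** (mantissa_bits - 1 + exponent)
    return [complex(re / scale, im / scale) for re, im in mantissas]


block = [0.01 + 0.02j, -0.005 + 0.001j]
mantissas, exponent = bfp_encode(block)
print(exponent, mantissas)
print(bfp_decode(mantissas, exponent))
```

With a block of length one, the same code models a single complex value whose real and imaginary parts share an exponent. Small inputs keep more significant bits than they would in a plain 10-bit fixed point word, at the cost of storing the exponent once per block.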
[Figure: delay feedback memory built from single port RAM (32x16 and 32x32, with ADDR and nWE inputs) and address counters; pipeline blocks labelled IBF, MUL, MBF, and FIFO.]
Different approaches:

1. Dual port memory connected to an address generator. This allows simultaneous reads and writes to any memory location, but requires a larger area than single port memories.
2. Two single port memories, alternating between reading and writing every clock cycle (a). The drawback is duplication of the address logic when using two memories instead of one.
3. Only one single port memory but with double word length (b). This is possible due to the consecutive addressing scheme used in the delay feedback. In addition to removing the duplicated address logic, the total number of memories for placement will be reduced.

[Figure: normal butterfly unit (BF); complex multiplier and normalizing unit; modified butterfly using equalizers to align the data; FIFOs with scale factor representation.]

The proposed solution is to rescale data on the fly and minimize the memory requirements. No changes are required of the complex adders/multipliers due to the simple scaling logic using normalizers and equalizing units. The scaling logic uses only one scale factor for representing both the real and imaginary part of a complex value.
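The idea of rescaling on the fly with equalizers and normalizers can be sketched in Python. This is only an illustrative model under stated assumptions, not the hardware described here: the `(re, im, sf)` tuple layout, the convention that a sample represents `(re + j*im) * 2**(-sf)`, the 10-bit word, the shift policy, and the names `equalize`, `normalize`, `butterfly`, and `to_complex` are all hypothetical.

```python
def to_complex(sample):
    """Value represented by a (re, im, sf) sample: (re + j*im) * 2**(-sf)."""
    re, im, sf = sample
    weight = 2.0 ** (-sf)
    return complex(re * weight, im * weight)


def equalize(a, b):
    """Align two samples to a common scale factor (the smaller one).

    The sample with the larger scale factor is shifted right, so it
    loses least significant bits; the adder then sees aligned integers.
    """
    sf = min(a[2], b[2])

    def align(sample):
        shift = sample[2] - sf
        return (sample[0] >> shift, sample[1] >> shift, sf)

    return align(a), align(b)


def normalize(re, im, sf, word_bits=10, max_sf=15):
    """Rescale so that the larger component just fits the signed word.

    One scale factor serves both the real and the imaginary part.
    """
    limit = 1 << (word_bits - 1)  # mantissas must lie in [-limit, limit)
    half = limit >> 1

    # Too large: shift right and lower the scale factor.
    while not (-limit <= re < limit and -limit <= im < limit):
        re, im, sf = re >> 1, im >> 1, sf - 1

    # Headroom left: shift left and raise the scale factor.
    while (sf < max_sf and (re or im)
           and -half <= re < half and -half <= im < half):
        re, im, sf = re << 1, im << 1, sf + 1

    return re, im, sf


def butterfly(a, b, word_bits=10):
    """Radix-2 butterfly: equalize, add/subtract, then normalize."""
    a, b = equalize(a, b)
    sf = a[2]
    total = normalize(a[0] + b[0], a[1] + b[1], sf, word_bits)
    diff = normalize(a[0] - b[0], a[1] - b[1], sf, word_bits)
    return total, diff


a = (400, -300, 3)  # represents 50 - 37.5j
b = (200, 100, 1)   # represents 100 + 50j
total, diff = butterfly(a, b)
print(total, to_complex(total))
print(diff, to_complex(diff))
```

In this model the adders operate on plain integers and are unaware of the scale factors, which is one way to read the statement that the complex adders and multipliers need no changes.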
One of the reasons that CBFP is not capable of keeping a constant SNR is that there is usually no CBFP logic in the first radix-2 stage due to the high memory requirements.

[Figure: Kbits of memory and size in mm2 for 512, 1024, 2048, and 4096 points; legend entries: Convergent block floating point, Our approach, Our approach using RDL.]

[Figure: % of resources for FIFO memory, complex MUL + ROM, butterfly, equalizer, and normalizer. Vertical axis of the SNR plot: SNR (dB).]
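To illustrate why a data path without scaling cannot keep a constant SNR, the following Python sketch quantizes a sine wave at several input amplitudes, once on a plain fixed point grid and once with a power-of-two scale factor per sample. It is a simplified numerical experiment, not a model of CBFP or of the proposed FFT: the test signal, the 9 fraction bits, the per-sample scaling, and the names `quantize`, `scaled_quantize`, `snr_db`, and `sweep` are assumptions made here.

```python
import math


def quantize(x, frac_bits):
    """Round x to a fixed point grid with the given number of fraction bits."""
    step = 2.0 ** -frac_bits
    return round(x / step) * step


def scaled_quantize(x, frac_bits, max_shift=15):
    """Shift x up by a power of two before quantizing, then shift back."""
    if x == 0.0:
        return 0.0
    _, k = math.frexp(abs(x))  # abs(x) = f * 2**k with 0.5 <= f < 1
    shift = min(max(-k, 0), max_shift)
    gain = 2.0 ** shift
    return quantize(x * gain, frac_bits) / gain


def snr_db(reference, approx):
    """Signal-to-noise ratio in dB of approx relative to reference."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return 10.0 * math.log10(signal / noise)


def sweep(amplitudes, frac_bits=9, n=1024):
    """Return (amplitude, fixed point SNR, scaled SNR) for each amplitude."""
    rows = []
    for amp in amplitudes:
        x = [amp * math.sin(2 * math.pi * 37 * i / n + 0.3) for i in range(n)]
        fixed = snr_db(x, [quantize(v, frac_bits) for v in x])
        scaled = snr_db(x, [scaled_quantize(v, frac_bits) for v in x])
        rows.append((amp, fixed, scaled))
    return rows


for amp, fixed, scaled in sweep([2.0 ** -k for k in range(7)]):
    print(f"amplitude 1/{round(1 / amp):<3d} "
          f"fixed {fixed:5.1f} dB  scaled {scaled:5.1f} dB")
```

The fixed point SNR falls by roughly 6 dB each time the amplitude is halved, because the quantization step stays the same while the signal shrinks. The scaled version stays essentially flat over the same amplitude range.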
[Figure: arbitrary input signals with different amplitudes; SNR for different amplitudes, plotted against input amplitude (1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64). Legend entries: Constant data path, Variable data path, Convergent block floating point, Our approach without VDL, Our approach with VDL.]

Conclusion

An FFT processor using a novel data scaling approach that can operate on a wide range of input signals, keeping the SNR at a constant level, has been presented. Compared to block scaling approaches, such as CBFP, the proposed design requires significantly smaller chip area due to the reduced memory requirements. At the same time it is capable of producing a higher SNR since scaling is applied in all pipeline stages.