You are on page 1of 27

Fast Fourier Transform:

Theory and Algorithms

Lecture 8
Vladimir Stojanovi

6.973 Communication System Design Spring 2006


Massachusetts Institute of Technology

Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Discrete Fourier Transform A review

Definition

{Xk} is periodic

Since {Xk} is sampled, {xn} must also be periodic

From a physical point of view, both are repeated with period N Requires O(N2) operations
6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Fast Fourier Transform History

Twiddle factor FFTs (non-coprime sub-lengths)

1805 Gauss

Predates even Fouriers work on transforms!

1903 Runge
1965 Cooley-Tukey
1984 Duhamel-Vetterli (split-radix FFT)
1960 Goods mapping

FFTs w/o twiddle factors (coprime sub-lengths)

application of Chinese Remainder Theorem ~100 A.D.

1976 Rader prime length FFT 1976 Winograds Fourier Transform (WFTA) 1977 Kolba and Parks (Prime Factor Algorithm PFA)
6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Divide and conquer

Divide and conquer always has less computations


Suppose all Il sets have same number of elements N1 so, N=N1*N2, r=N2 Each inner-most sum takes N12 multiplications The outer sum will need N2 multiplications per output point N2*N for the whole sum (for all output points)

Hence, total number of multiplications

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Generalizations

The inner-most sum has to represent a DFT

Only possible if the subset (possibly permuted)

Has the same periodicity as the initial sequence

All main classes of FFTs can be cast in the above form No permutations (i.e. twiddle factors) All the subsets have same number of elements m=N/r

Both sums have same periodicity (Goods mapping)


(m,r)=1 i.e. m and r are coprime

If not, then inner sum is one stap of radix-r FFT If r=3, subsets with N/2, N/4 and N/4 elements

Split-radix algorithm

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

The cost of mapping

The goal for divide and conquer

Different types balance mapping with subproblem cost E.g. in radix-2


subproblems are trivial (only sum and differences) Mapping requires twiddle factors (large number of multiplies)

E.g. in prime-factor algorithm


Subproblems are DFTs with coprime lengths (costly) Mapping trivial (no arithmetic operations)
6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

FFTs with twiddle factors

Reintroduced by Cooley-Tukey 65

Start from general divide and conquer Keep periodicity compatible with periodicity of the input sequence Use decimation

almost N1 DFTs of size N2

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Cooley-Tukey FFT contd.

can be taken mod N2

Yn1 ,k = Yn1 ,k2

1. N1 DFTs of length N2

2. N multplications with twiddle factors

3. N2 DFTs of length N1

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Step 1: Evaluate N1 DFTs of length N2 Step 2: N multiplications with twiddle factors Step 3: Evaluate N2 DFTs of length N1 Vector xi mapped to matrix xn1,n2 (N1xN2) Compute N1 DFTs of length N2 on each row Point-to-point multiply with twiddle factors Compute N2 DFTs of length N1 on the columns

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

2-D view of Cooley-Tukey mapping

N=15 (N1=3, N2=5)

x0 x1............... x0 x3 x6 x9 x12 x1 x4 x7 x10 x13 x2 x5 x8 x11 x14 x12 x9 x6 x3 x0 x4 x1 x5 x2 x2 x8 x7 x13 x10 x14 x11 x0 x4 x1 x5 DFT-5 DFT-5 x3 x6 x9 x12 DFT-3 DFT-3 DFT-3 x4 x9 x14 x13 x12 x11 x10 x13 x14

jm

DFT-5

DFT-3 DFT-3 x0 x5

Cannot exchange the order of DFTs


Figure by MIT OpenCourseWare.

Because of twiddle multiply Different mapping for N1=5, N2=3

x0 x1 x2 x3 x4

x5 x10 x6 x11 x7 x12 x8 x13 x9 x14

Not just transpose

Figure by MIT OpenCourseWare.

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

10

Radix 2 and radix 4 algorithms

Lengths as powers of 2 or 4 are most popular Assume N=2n

N1=2, N2=2n-1 (divides input sequence into even and odd samples decimation in time DIT)

Butterfly
(sum or difference followed or
preceeded by a twiddle factor multiply)

Xm and XN/2+m outputs of N/2 2-pt DFTs on outputs of 2, N/2-pt DFTs weighted with twiddle factors
6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

11

DIT radix-2 implementations

Several different ways

Reorder the input data

X0 DFT X1 X2 X3 X4 DFT X5 X6 X7 N=4 w1 8 w2


8

DFT N=2 N=4

X0 X4 X1 X5 X2 X6 X3 X7

Input samples for inner DFTs in subsequent locations Results in bit-reversed input, in-order output DIT

{x2i}

DFT N=2

Selectively compute DFTs on evens and odds


DFT N=2

{x2i+1}

DFT N=2

Perform in-place computation Output in bit-reversed order (X3 in position six (011->110))

w3 8

Division DFT of into even and N / 2 odd numbered sequences

Multiplication by twiddle factors

DFT of 2

Figure by MIT OpenCourseWare.

Which type is this implementation?


12

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Decimation in frequency (DIF) radix-2 implementation

If reverse the role of N1 and N2, get DIF

N1=N/2, N2=2

X0 X1 X2k1 = N / 2-1 n1=0 W N1/ 1 2(xn1+xN / 2+n1),


n k n k

DFT N=2 DFT N=2 DFT N=2 DFT N=2 w3


8

DFT N=4

X0 X2 X4 X6

X2 X3 X4 X5 X6 X7

{x2k}
w1 8 DFT w2
8

X2k1+1 =

N / 2-1

X1 X3 X5 X7

n1 W N1/ 1 2 WN (xn1-xN / 2+n1). n1=0

N=4

{x2k+1}
DFT of N/2

DFT of Multiplication 2 by twiddle factors

6.973 Communication System Design

Figure by MIT OpenCourseWare.

13

Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Duality DIT<->DIF

X0 DFT X1 X2 X3 X4 DFT X5 X6 X7 N=4 w1 8 w2


8

DFT N=2 N=4

X0 X4 X1 X5 X2 X6 X3 X7

X0 X1 X2 X3 X4 X5 X6 X7

DFT N=2

X0 DFT N=4 X2 X4 X6 X1 X3 X5 X7 DFT of N/2

{x2i}

DFT N=2

DFT N=2 w1
8

{ }
x2k DFT w2
8

DFT N=2

DFT N=2

N=4

{x2i+1}

DFT N=2

DFT N=2 w3
8

{ }
x2k+1

w3 8

DFT of Multiplication Division by twiddle into even and N / 2 factors odd numbered sequences

DFT of 2

DFT of Multiplication 2 by twiddle factors

. are. Figure by MIT OpenCourseW

Which one is DIT (DIF)? How can we get one from another?
6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

14

Complexity of radix-2 FFTs

DFT of length N replaced by two length-N/2

At the cost of N complex multiplications (twiddle)

And N complex additions (2pt DFTs)

Iterate the scheme log2N-1 times

Obtain trivial transforms (length 2) of the length-N/2 DFTs

Twiddle multiplies (WNi)


Complex multiply 3 real mult + 3 real add If i is multiple of N/4, no arithmetic operation required (why?)
4 butterflies (one general, 3 special cases)

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

15

Radix-4

N=4n, N1=4, N2=N/4


x0
DFT 2 DFT 8

X0 X14 X1
DFT 8

4 DFTs of length N/4 3N/4 twiddle multiplies N/4 DFTs of length 4 No multiplication Only 16 real additions

x8

x15 x0 x4 x8 x12 x15


DFT 4

X15 X0 X12 X1 X13 X2 X14 X3 X15

Cost of length-4 DFT


DFT 4 DFT 4 DFT 4 DFT 4

Reduces the number of stages to log4N

Figure by MIT OpenCourseWare.

Radix-8 can reduce number of operations even more


6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

16

Mixed-radix and Split-radix

Mixed-radix

Diferent radices in different stages Different radices in the same stage

x0
DFT 2 DFT 8

X0 X14 X1
DFT 8

R2

Split-radix

x8

x15 x0 x4 x8 x12 x15 x0 x4 x8 x12


DFT 2/4 DFT 8 DFT 4 DFT 4 DFT 4

X15 X0 X12 X1 X13 X2 X14 X3 X15 X0

DFT 4 DFT 4 DFT 4 DFT 4

Simultaneously on different parts of the transform

R4

Can achieve lowest number of adds and multiplies for length 2n inputs

X14 X1 X13 X3 X15

S-R

Figure by MIT OpenCourseWare.

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

17

Split-radix (DIF SRFFT)

Look at DIF radix-2

X0 X1 X2

X2k1 dont have twiddles


X2k1 = N / 2-1 n1=0
nk W N1 / 12(xn +xN / 2+n ), 1 1

DFT N=2 DFT N=2 DFT N=2 DFT N=2


3 w8

DFT N=4

X0 X2 X4 X6 X2k = 1 N / 2-1 n1=0

{x2k}
w1
8

X3 X4 X5 X6 X7

W N1/ 1 2(xn1+xN / 2+n1),


n1 W N1/ 1 4W N nk

nk

X2k +1 = 1

N / 2-1 n1=0

nk n1 W N1/ 1 2 W N (xn1-xN / 2+n1).

DFT w2
8

X1 X3 X5 X7

N=4

X4k +1 = 1

N / 4-1 n1=0

{x2k+1}
DFT of N/2

X[(xn -xN / 2+n ) 1 1 +j(xn1+N / 4-xn1+3N / 4)], X4k +3 = 1 N / 4-1 n1=0 W


n1k1 3n 1 N / 4WN

DFT of Multiplication 2 by twiddle factors

Even samples X2k in DIF should be computed separately from other samples

Figure by MIT OpenCourseWare.


X[(xn +xN / 2+n ) 1 1 -j(xn1+N / 4-xn1+3N / 4)]. X0 X4 X8 X12 DFT 2/4 DFT 8 DFT 4 DFT 4 X14 X1 X13 X3 X15 X0

With same algorithm (recursively) as the original sequence Radix-4 is more efficient than radix-2 Higher radices are inefficient

No general rule for odd samples


Figure by MIT OpenCourseWare.

Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Split-radix (DIT SRFFT)

Dual to DIF SRFFT

Considers separately subsets {x2i}, {x4i+1} and {x4i+3}

Redundancy in Xk, Xk+N/4, Xk+N/2, Xk+3N/4 computation

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

19

FFTs without twiddle factors

Divide and conquer requirements

N-long DFT computed from DFTs with lengths that are factors of N (allows the inner sum to be a DFT)

Provided that subsets Il guarantee periodic xi

When N factors into co-prime factors N=N1*N2

Starting from any xi form subset with compatible periodicity (the periodicity of the subset divides the periodicity of the set)

{xi + N1n2 | n2 = 1,..., N 2 1} or {


xi + N2 n1 | n1 = 1,..., N1 1} Both subsets have only one common point xi Allows a rearrangement of the input (periodic) vector into a matrix with a periodicity in both dimensions (rows and columns), both periodicities being comatible with the initial one
6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

20

Goods mapping

FFTs without twiddle factors all based on the same mapping

Turns original transform into a set of small DFTs with coprime lengths
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

{xi + N1n2 | n2 = 1,..., N 2 1} or {xi + N2 n1 | n1 = 1,..., N1 1}

equivalent to

Good's mapping

0 5

9 12 4 7

8 11 14 2

10 13 1

This mapping is one-to-one if N1 and N2 are coprime All congruences modulo N1 obtained

. are. Figure by MIT OpenCourseW

For a given congruence modulo N2 and vice versa


6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

21

Just another arrangement of CRT

Chinese Remainder Theorem (CRT)

If we know the residue of some number k modulo two coprime numbers N1 and N2 k N k N It is possible to reconstruct k N N Let k N = k1 k N = k2
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Then k N N = N1t1k2 + N 2t2 k1
N
2
1 2
1 2

t1 N1

N2

= 1 and t2 N 2

N1

= 1

Good's mapping
0 5 3 6 9 12 4 7 8 11 14 2 10 13 1

t1 multiplicative inverse of N1 mod N2


t2 multiplicative inverse of N2 mod N1
t1, t2 always exist since N1, N2 coprime (N1,N2)=1
What are t1, t2 for N1=3, N2=5?

CRT mapping

6 12 3

10 1 7 13 4 5 11 2 8 14
. are. Figure by MIT OpenCourseW

Reversing N1 and N2

Results in transposed mapping

6.973 22

Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Impact on DFT

Formulating the true multi-dimensional transform


k
N1 N 2

= N1t1k2 + N 2t2 k1

Xk = xi WNik , k = 0,..., N - 1,
i=0 N1 - 1 N2 - 1

N-1

XN1t1k2 + N2r2k1 = W = WN1 XN1t1k2 + N2t2k1 =


N2 N

n1 = 0 n2 = 0

xn1N2 + n2N1 W =W
(N2t2)N 1 N1

(n1N2 + N1n2)(N1t1k2+N2t2k1) N
x3 x0

x6

x9

x12 DFT 5

X9 X4 X14

N2t2 N1

= WN1
n k n k

x5 x10

N1 - 1 N2 - 1

DFT 3 X5

X2 X11

X8

n1 = 0 n 2 = 0

xn1N2 + n2N1 WN11 2 WN22 2 , x'n1.n2 = xn1N2 + n2N1


n k n k

X'k1k2 = XN1t1k2 + N2t2k1 X'k1k2 =


N1 - 1 N2 - 1

Figure by MIT OpenCourseWare.

n1 = 0 n2 = 0

1 1 x'n1n2 WN1 WN22 2

Figure by MIT OCW.

True bidimensional transform!


(no extra twiddle factors)

6.973 Communication System Design 23

Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Using convolution to compute DFTs

All sub DFTs are prime length

Rader showed that prime-length DFTs can be


computed as a result of cyclic convolution
E.g. length 5 DFT

Permute last two rows and columns

Cyclic correlation
(a convolution with a reversed sequence)

This is a general result

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

24

Example

Results in smallest number of multiplies

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

25

Prime Factor Algorithm

X = F1 x F2T.

Good's mapping
x12
DFT 5

CRT mapping
X9 X4 X14

x9 x3 x0 x5 x10 x6

DFT 3

X2 X5 X11

X8

Figure by MIT OpenCourseWare.

Efficient small DFTs are a key to the feasibility of this algorithm

6.973 Communication System Design


Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

26

Winograds Fourier Transform Algorithm

x10 x0 x3 x6 x9 x12
input additions
N=3

x5

X5 X11 X2 X8 X4 X14

X9
input additions point wise output additions output additions multiplication N=5 N=5 N=3

Figure by MIT OpenCourseWare.

B1xB2 only involves additions D diagonal (so point multiply) Winograd transform has many more additions than twiddle FFTs
6.973 Communication System Design
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

27

You might also like