
# International Journal of Science, Technology & Management; December-2010 www.ijstm.com editor@ijstm.com

## Generation of 1D and 2D FFT function in MATLAB

Parul Goyal Department of Electronics & Communication Engineering, Uttaranchal Institute of Technology, Dehradun - 248001, Uttarakhand, India Email parulgoyal1973@gmail.com

Abstract: This paper presents MATLAB code for the 1D and 2D FFT algorithms, together with methods for the recognition of faces using Joint Transform Correlation techniques. Several MATLAB programs were written for the correlation and recognition of images. The computation speeds of the DFT and the FFT are also compared. The results show that, for sample sequences of various lengths, the time taken by the direct DFT is higher than the time taken by the FFT. Thus the speed of the FFT is beneficial when large calculations are required.

## I. Introduction:

The FFT is simply an algorithm (i.e., a particular method of performing a series of computations) that can compute the discrete Fourier transform much more rapidly than other available algorithms. For this reason, our discussion of the FFT addresses only the computational aspect of the algorithm.

A DFT decomposes a sequence of values into components of different frequencies. This operation is useful in many fields, but computing it directly from the definition is often too slow to be practical. An FFT is a way to compute the same result more quickly: computing a DFT of N points in the obvious way, using the definition, takes $O(N^2)$ arithmetical operations, while an FFT can compute the same result in only $O(N \log N)$ operations.

The difference in speed can be substantial, especially for long data sets where N may be in the thousands or millions. In practice, the computation time can be reduced by several orders of magnitude in such cases, and the improvement is roughly proportional to N/log(N). This huge improvement made many DFT-based algorithms practical; FFTs are of great importance to a wide variety of applications, from digital signal processing and solving partial differential equations to algorithms for quick multiplication of large integers.

A simple matrix-factoring example is used to intuitively justify the FFT algorithm. The factored matrices are alternatively represented by signal flow graphs. From these graphs, we construct the logic of an FFT computer program.

## II. Matrix Formulation:

The discrete Fourier transform is given by

$$X(n) = \sum_{k=0}^{N-1} x(k)\, e^{-j 2\pi n k / N}, \qquad n = 0, 1, \ldots, N-1 \qquad (1.1)$$
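As a reference point for the faster algorithm developed later, Eq. (1.1) can be evaluated directly. The following is only a minimal sketch: the function name `dft_direct` is a hypothetical helper that does not appear in the paper, and the file is assumed to be run as a script in a MATLAB release that allows local functions at the end of a script (R2016b or later).

```matlab
% Direct O(N^2) evaluation of Eq. (1.1) for a short test sequence.
x = [1 2 3 4];
X = dft_direct(x);
disp(X)                      % 10, -2+2i, -2, -2-2i

function X = dft_direct(x)
% DFT_DIRECT  Hypothetical helper: evaluates Eq. (1.1) term by term.
    x = x(:).';              % force a row vector
    N = numel(x);
    k = 0:N-1;               % input sample index k
    X = zeros(1, N);
    for n = 0:N-1            % one output value X(n) per pass
        X(n+1) = sum(x .* exp(-1j*2*pi*n*k/N));
    end
end
```

Each of the N outputs needs N complex multiplications, which is where the $O(N^2)$ cost of the direct method comes from.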

Eq. (1.1) describes the computation of N equations. For example, if N = 4 and we let

$$W = e^{-j 2\pi / N} \qquad (1.2)$$

then Eq. (1.1) can be written as

$$\begin{aligned}
X(0) &= x(0)W^{0} + x(1)W^{0} + x(2)W^{0} + x(3)W^{0} \\
X(1) &= x(0)W^{0} + x(1)W^{1} + x(2)W^{2} + x(3)W^{3} \\
X(2) &= x(0)W^{0} + x(1)W^{2} + x(2)W^{4} + x(3)W^{6} \\
X(3) &= x(0)W^{0} + x(1)W^{3} + x(2)W^{6} + x(3)W^{9}
\end{aligned} \qquad (1.3)$$

The above equations can be represented in matrix form:

$$\begin{bmatrix} X(0) \\ X(1) \\ X(2) \\ X(3) \end{bmatrix} =
\begin{bmatrix} W^{0} & W^{0} & W^{0} & W^{0} \\ W^{0} & W^{1} & W^{2} & W^{3} \\ W^{0} & W^{2} & W^{4} & W^{6} \\ W^{0} & W^{3} & W^{6} & W^{9} \end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \end{bmatrix}$$

## III. Proposed Algorithm

By far the most common FFT is the Cooley-Tukey algorithm. This is a divide-and-conquer algorithm that recursively breaks down a DFT of any composite size $N = N_1 N_2$ into many smaller DFTs of sizes $N_1$ and $N_2$, along with O(N) multiplications by complex roots of unity traditionally called twiddle factors (after Gentleman and Sande, 1966).

This method (and the general idea of an FFT) was popularized by a publication of J. W. Cooley and J. W. Tukey in 1965, but it was later discovered (Heideman & Burrus, 1984) that those two authors had independently re-invented an algorithm known to Carl Friedrich Gauss around 1805 (and subsequently rediscovered several times in limited forms).

The most well-known use of the Cooley-Tukey algorithm is to divide the transform into two pieces of size N/2 at each step; this form is therefore limited to power-of-two sizes, but any factorization can be used in general (as was known to both Gauss and Cooley/Tukey). These are called the radix-2 and mixed-radix cases, respectively (and other variants such as the split-radix FFT have their own names as well). Although the basic idea is recursive, most traditional implementations rearrange the algorithm to avoid explicit recursion. Also, because the Cooley-Tukey algorithm breaks the DFT into smaller DFTs, it can be combined arbitrarily with any other algorithm for the DFT.

## FFT ALGORITHMS SPECIALIZED FOR REAL AND/OR SYMMETRIC DATA:

In many applications, the input data for the DFT are purely real, in which case the outputs satisfy the symmetry

$$X(N-n) = X^{*}(n),$$

and efficient FFT algorithms have been designed for this situation (see e.g. Sorensen, 1987). One approach consists of taking an ordinary algorithm (e.g. Cooley-Tukey) and removing the redundant parts of the computation, saving roughly a factor of two in time and memory. Alternatively, it is possible to express an even-length real-input DFT as a complex DFT of half the length (whose real and imaginary parts are the even/odd elements of the original real data), followed by O(N) post-processing operations.

It was once believed that real-input DFTs could be more efficiently computed by means of the discrete Hartley transform (DHT), but it was subsequently argued that a specialized real-input DFT algorithm (FFT) can typically be found that requires fewer operations than the corresponding DHT algorithm (FHT) for the same number of inputs. Bruun's algorithm is another method that was initially proposed to take advantage of real inputs, but it has not proved popular.

There are further FFT specializations for the cases of real data that have even/odd symmetry, in which case one can gain another factor of (roughly) two in time and memory and the DFT becomes the discrete cosine/sine transform(s) (DCT/DST). Instead of directly modifying an FFT algorithm for these cases, DCTs/DSTs can also be computed via FFTs of real data combined with O(N) pre/post-processing.

## SIGNAL FLOW GRAPH:

The signal flow graph is drawn for N = 8 samples. It shows the computation arrays, indexed by l, and marks the nodes to be skipped; nodes are skipped to reduce the number of computations. Each node has a dual node, that is, a node computed from the same pair of inputs. In array l the dual nodes are separated by $N/2^{l}$: if $x_l(k)$ is a node, then its dual node is $x_l(k + N/2^{l})$. For example, with N = 8 the nodes $x_1(0)$ and $x_1(4)$ are dual nodes and are computed from the same set of inputs. The computation of any dual node pair is given by

$$\begin{aligned}
x_l(k) &= x_{l-1}(k) + W^{p}\, x_{l-1}(k + N/2^{l}) \\
x_l(k + N/2^{l}) &= x_{l-1}(k) - W^{p}\, x_{l-1}(k + N/2^{l})
\end{aligned} \qquad (1.4)$$

Proceeding down computational array l = 1 from node k = 0, we find on reaching node k = 4 that its value has already been computed as the dual of node k = 0, so it can be skipped. The nodes k = 5, 6 and 7 can be skipped for the same reason. Thus we compute only the first $N/2^{l}$ nodes and skip the next $N/2^{l}$ nodes, and so on down the array. We stop skipping when we reach a node index greater than N - 1.
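The dual-node computation of Eq. (1.4) can be illustrated for the first array of an 8-point sequence. This is a minimal sketch rather than the paper's program: the variable names are illustrative, and it assumes $W^{p} = 1$ for every node of array l = 1, which is what the bit-reversal procedure for p yields when k < N/2.

```matlab
% Array l = 1 of an N = 8 transform, computed with the dual-node pair of Eq. (1.4).
x  = [1 2 3 4 5 6 7 8];      % x_0(k), the input array
N  = numel(x);
l  = 1;
N2 = N / 2^l;                % dual-node spacing N/2^l = 4
Wp = 1;                      % assumed W^p for array l = 1 (p = 0)
x1 = x;                      % will hold x_1(k)
for k = 0:N2-1               % first N/2^l nodes; their duals k+N2 are then skipped
    T          = Wp * x(k+1+N2);
    x1(k+1)    = x(k+1) + T; % node k
    x1(k+1+N2) = x(k+1) - T; % dual node k + N/2^l
end
disp(x1)                     % 6 8 10 12 -4 -4 -4 -4
```

Only four passes through the loop are needed to fill all eight nodes, which is the saving that the node-skipping logic exploits.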

## STEPS INVOLVED IN THE COMPUTATION OF THE FFT:

The FFT of an input sequence of N samples is given by

$$X(n) = W^{p}\, x(k) \qquad (1.5)$$

Here X(n) is the FFT, $W^{p}$ is the twiddle factor and x(k) is the value of the input sequence. To evaluate this expression, the value of the twiddle factor

$$W^{p} = e^{-j 2\pi p / N}$$

must be known. The value of p is determined by the following steps:

(a) Write the index k in binary form with γ bits, where γ is the power of 2 corresponding to the value of N, i.e. $N = 2^{\gamma}$.

(b) Scale (slide) this binary number γ - l bits to the right and fill the newly opened bit positions on the left with zeros. Here l identifies the computational array and varies from 1 to γ.

(c) Reverse the order of the bits and convert the bit-reversed number to decimal. This number is the value of p.

After the values of X(n) have been calculated they must be unscrambled, because the Cooley-Tukey algorithm produces them in bit-reversed order. To unscramble the output vector X(n), write n in binary and reverse (flip) the binary number. Fig. (a) shows the results of this bit-reversing operation: terms x(k) and x(i) have been interchanged, where i is the integer obtained by bit reversing the integer k.

Figure (b)

Note that a situation arises in which we encounter a node that has already been interchanged. For example, in Fig. (b1), which considers an input sequence of 16 elements, node k = 0 remains in its location, and nodes k = 1, 2 and 3 are interchanged with nodes 8, 4 and 12 respectively. The next node to be considered is node 4, but this node has already been interchanged with node 2. To avoid reconsidering a node that has already been interchanged, we simply check whether i (the integer obtained by bit reversing k) is less than k. If so, the node has been interchanged by a previous operation. With this check, we ensure a straightforward unscrambling procedure.

## FFT Computation Flowchart:

Using the properties of the FFT discussed above, we can easily develop a flowchart for programming the algorithm on a digital computer. We first compute array l = 1 by starting at node k = 0 and working down the array. At each node k, we compute the equation pair of Eq. (1.4), where p is determined by the procedure described above. We continue down the array, computing the equation pair of Eq. (1.4), until we reach a region of nodes that must be skipped. We skip over the appropriate nodes and continue until we have computed the entire array. We then proceed to compute the remaining arrays using the same procedure. Finally, we unscramble the final array to obtain the output.
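The unscrambling rule described above (swap node k with its bit-reversed partner only if that pair has not been swapped already) can be sketched as follows for 16 elements. The names `nbits` and `ir` are illustrative and not taken from the paper, and the data vector simply holds each entry's own index so that the result is easy to read.

```matlab
% Bit-reversal unscrambling with the "already interchanged" check.
N     = 16;
nbits = log2(N);                 % number of bits per index
X     = 0:N-1;                   % stand-in data: entry k holds the value k
for k = 0:N-1
    ir = bin2dec(fliplr(dec2bin(k, nbits)));   % bit-reversed index of k
    if ir > k                    % ir < k: pair already swapped; ir == k: nothing to do
        tmp     = X(k+1);
        X(k+1)  = X(ir+1);
        X(ir+1) = tmp;
    end
end
disp(X)   % 0 8 4 12 2 10 6 14 1 9 5 13 3 11 7 15
```

Nodes 1, 2 and 3 end up holding the values from nodes 8, 4 and 12, matching the example in the text, and node 4 is left alone because its partner (node 2) was handled earlier.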

Figure (b1)

Figure (c) represents the flowchart of the computer program. Box 1 describes the necessary input data. The data vector x(k) is assumed to be complex and is indexed as k = 0, 1, ..., N-1. If x(k) is real, then the imaginary part should be set to zero. The number of sample points must satisfy the relationship $N = 2^{\gamma}$, where γ is integer valued.

Initialization of the various program parameters is accomplished in Box 2. Parameter l is the array number being considered; we start with array l = 1. The spacing parameter N2 is initialized to N/2. Parameter NU1 is the right shift required when determining the value of p in Eq. (1.4); NU1 is initialized to γ - 1. The index k of the array is initialized to k = 0; thus we work from the top and progress down the array.

Figure (c)

Box 3 checks whether the array l to be computed is greater than γ. If so, the program branches to Box 13 to unscramble the computed results by bit inversion. If all arrays have not been computed, we proceed to Box 4. Box 4 sets a counter I = 1. This counter monitors the number of dual node pairs that have been considered. Since it is necessary to skip certain nodes in order to ensure that previously considered nodes are not encountered a second time, counter I is the control for determining when the program must skip.

Boxes 5 and 6 perform the computation of Eq. (1.4). Because k and I have been initialized to 0 and 1, respectively, the initial node considered is the first node of the first array. To determine the factor p for this node, we must first scale the binary number k to the right by γ - l bits. To accomplish this, we compute the integer value of $k/2^{NU1}$ and set the result to M, as shown in Box 5. According to the procedure for determining p, we must then bit reverse M, where M is represented by γ = NU bits. The function IBR(M) in Box 5 is a special function routine for bit inversion.

Box 6 is the computation of Eq. (1.4). We compute the product $W^{p}\, x_{l-1}(k + N2)$ and assign the result to a temporary storage location. Next, we add and subtract this term according to Eq. (1.4). The result is the dual node output. We then proceed down the array to the next node. As shown in Box 7, k is incremented by 1. To avoid recomputing a dual node that has been considered previously, we check in Box 8 whether the counter I is equal to N2. For array 1, the number of nodes that can be considered consecutively without skipping is equal to N/2 = N2. If I is not equal to N2, we proceed down the array and increment the counter I, as shown in Box 9; since k has already been incremented in Box 7, Boxes 5 and 6 are then repeated for the new value of k. If I = N2 in Box 8, we have reached a node previously considered. We then skip N2 nodes by setting k = k + N2. Because k has already been incremented by 1 in Box 7, it is sufficient to skip the previously considered nodes by incrementing k by N2.

Before we perform the required computations indicated by Boxes 5 and 6, we must first check that we have not exceeded the array size. As shown in Box 11, if k is less than N - 1, we reset the counter I to 1 in Box 4 and repeat Boxes 5 and 6. If k > N - 1 in Box 11, we must proceed to the next array. Hence, as shown in Box 12, l is incremented by 1. The new spacing N2 is simply N2/2. NU1 is decremented by 1 and k is reset to zero. We then check Box 3 to see if all arrays have been computed. If so, we proceed to unscramble the final results. This operation is performed by Box 13.
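The Box 5 procedure (shift k right by NU1 bits, then bit reverse the result) can be tabulated for every node of an 8-point transform. This is a hedged sketch: `nbits` stands for the number of bits per index, and base-MATLAB string functions are used here in place of the IBR(M) routine named in the flowchart, which is not reproduced in the text.

```matlab
% Twiddle exponent p for every node k of each array l, with N = 8.
N     = 8;
nbits = log2(N);                     % bits per index
for l = 1:nbits
    NU1 = nbits - l;                 % right shift used for array l
    p   = zeros(1, N);
    for k = 0:N-1
        M      = fix(k / 2^NU1);                        % shift k right by NU1 bits
        p(k+1) = bin2dec(fliplr(dec2bin(M, nbits)));    % bit reverse M
    end
    fprintf('l = %d: p = %s\n', l, mat2str(p));
end
% l = 1: p = [0 0 0 0 4 4 4 4]
% l = 2: p = [0 0 4 4 2 2 6 6]
% l = 3: p = [0 4 2 6 1 5 3 7]
```

Only the first node of each dual pair uses its p in Eq. (1.4); the value listed for the dual node differs by N/2, which corresponds to the sign change in the second equation of the pair.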


## Code for 1-D FFT function:

The code for the 1-D FFT is written in MATLAB as shown below.

    function y = FFT_func(x)
    % 1-D radix-2 FFT of a row vector x whose length N is a power of 2
    a   = size(x);
    N   = a(2);                      % number of samples
    n   = log2(N);                   % number of arrays (bits per index)
    l   = 1;                         % array being computed
    N2  = N/2;                       % dual-node spacing
    NU1 = n-1;                       % right shift used to find p
    k   = 0;                         % node index
    while (l <= n)
        while (k <= N-1)
            I = 1;                   % counts dual-node pairs before a skip
            while (I <= N2)
                M  = fix(k/(2^NU1));         % shift k right by NU1 bits
                b  = dec2bin(M, n);          % n-bit binary string
                q  = fliplr(b);              % bit reversal (base MATLAB, replaces seqreverse)
                P  = bin2dec(q);             % twiddle exponent p
                W  = exp(-1i*2*pi*P/N);      % twiddle factor W^p
                T1 = W*x(k+1+N2);
                x(k+1+N2) = x(k+1) - T1;     % dual node
                x(k+1)    = x(k+1) + T1;     % node
                k = k+1;
                I = I+1;
            end
            k = k+N2;                % skip the nodes already computed
        end
        l   = l+1;
        N2  = N2/2;
        NU1 = NU1-1;
        k   = 0;
    end
    y = bitrevorder(x);              % unscramble (Signal Processing Toolbox)

Consider an input sequence of N = 8 samples. The FFT of the sequence can be calculated using the above function in MATLAB as shown:

    x = [1 2 3 4 5 6 7 8];   % Generates an input sequence
    y = FFT_func(x);         % Takes the FFT of the sequence
    disp(y)                  % Displays the output

Then the output is displayed as

    Columns 1 through 6
      36.0000   -4.0000 + 9.6569i   -4.0000 + 4.0000i   -4.0000 + 1.6569i   -4.0000   -4.0000 - 1.6569i
    Columns 7 through 8
      -4.0000 - 4.0000i   -4.0000 - 9.6569i

## Multidimensional FFTs:

The multidimensional DFT

$$X_{\mathbf{k}} = \sum_{\mathbf{n}=0}^{\mathbf{N}-1} e^{-2\pi i\, \mathbf{k}\cdot(\mathbf{n}/\mathbf{N})}\, x_{\mathbf{n}}$$

(with $i$ the imaginary unit) transforms an array $x_{\mathbf{n}}$ with a d-dimensional vector of indices $\mathbf{n} = (n_1, \ldots, n_d)$ by a set of d nested summations (over $n_j = 0, \ldots, N_j - 1$ for each j), where the division $\mathbf{n}/\mathbf{N}$, defined as $\mathbf{n}/\mathbf{N} = (n_1/N_1, \ldots, n_d/N_d)$, is performed element-wise. Equivalently, it is simply the composition of a sequence of d sets of one-dimensional DFTs, performed along one dimension at a time (in any order).

This compositional viewpoint immediately provides the simplest and most common multidimensional DFT algorithm, known as the row-column algorithm. That is, one simply performs a sequence of d one-dimensional FFTs (by any of the above algorithms): first one transforms along the $n_1$ dimension, then along the $n_2$ dimension, and so on (any ordering will work). This method is easily shown to have the usual $O(N \log N)$ complexity, where $N = N_1 N_2 \cdots N_d$ is the total number of data points transformed. In particular, there are $N/N_1$ transforms of size $N_1$, etc., so the complexity of the sequence of FFTs is:

$$\frac{N}{N_1}\, O(N_1 \log N_1) + \cdots + \frac{N}{N_d}\, O(N_d \log N_d) = O\big(N\,[\log N_1 + \cdots + \log N_d]\big) = O(N \log N)$$

In two dimensions, $x_{\mathbf{k}}$ can be viewed as an $n_1 \times n_2$ matrix, and this algorithm corresponds to first performing the FFT of all the rows and then of all the columns (or vice versa), hence the name.

In more than two dimensions, it is often advantageous for cache locality to group the dimensions recursively. For example, a three-dimensional FFT might first perform two-dimensional FFTs of each planar "slice" for each fixed $n_1$, and then perform the one-dimensional FFTs along the $n_1$ direction. More generally, an asymptotically optimal cache-oblivious algorithm consists of recursively dividing the dimensions into two groups $(n_1, \ldots, n_{d/2})$ and $(n_{d/2+1}, \ldots, n_d)$ that are transformed recursively (rounding if d is not even). Still, this remains a straightforward variation of the row-column algorithm that ultimately requires only a one-dimensional FFT algorithm as the base case, and still has $O(N \log N)$ complexity. Yet another variation is to perform matrix transpositions in between transforming subsequent dimensions, so that the transforms operate on contiguous data; this is especially important for out-of-core and distributed memory situations where accessing non-contiguous data is extremely time-consuming.

There are other multidimensional FFT algorithms that are distinct from the row-column algorithm, although all of them have $O(N \log N)$ complexity. Perhaps the simplest non-row-column FFT is the vector-radix FFT algorithm, which is a generalization of the ordinary Cooley-Tukey algorithm where one divides the transform dimensions by a vector of radices at each step. The simplest case of vector-radix is where all of the radices are equal (e.g. vector-radix-2 divides all of the dimensions by two), but this is not necessary. Vector radix with only a single non-unit radix at a time, i.e. $\mathbf{r} = (1, \ldots, 1, r, 1, \ldots, 1)$, is essentially a row-column algorithm.

## Algorithm for 2D-FFT:

The steps involved in the computation of the FFT of a 2-dimensional sequence of dimension N x N (where $N = 2^{\gamma}$) are as follows:

(a) First the row-wise FFT is computed and the output is arranged row wise.

(b) Then the column-wise FFT is computed and the elements are arranged column wise. This gives the output.

Hence, for the computation of the 2D-FFT we take the 1D-FFT twice.

## Code For 2D-FFT Function:

The code for a 2D FFT function written in MATLAB is shown below.

    function x = FFT2_func(x)
    % 2-D FFT by the row-column method, built on FFT_func
    a = size(x);
    R = a(1);                        % number of rows
    C = a(2);                        % number of columns
    for k = 0:R-1                    % Row FFT
        y = x(k+1,:);
        x(k+1,:) = FFT_func(y);
    end
    for m = 0:C-1                    % Column FFT
        y = x(:,m+1).';              % pass the column as a row vector
        x(:,m+1) = FFT_func(y).';
    end
    disp(x);                         % Display the output

## V. Results:

Let us consider an input sequence of dimension 4 x 4. Then the 2D-FFT of the sequence can be calculated by:

    x = [1 2 3 4; 1 2 3 4; 1 2 3 4; 1 2 3 4];   % Generation of a 4x4 matrix
    y = FFT2_func(x);                            % Computation of the 2D FFT
    disp(y)                                      % Display the output

Then the output will be shown as:

    40.00   -8.00 + 8.00i   -8.00   -8.00 - 8.00i
        0       0               0       0
        0       0               0       0
        0       0               0       0

## VI. Conclusion:

Comparison between computation speeds of the DFT and the FFT:

| N | Time taken in DFT | Time taken in FFT |
|---|---|---|
| 128 | 0.2187 s | 0.0312 s |
| 256 | 0.5781 s | 0.0625 s |
| 512 | 2.1094 s | 0.09375 s |

Table 1. Here N is the number of input samples.

From Table 1 we can see that, for sample sequences of various lengths, the time taken by the direct DFT is higher than the time taken by the FFT. Thus the speed of the FFT is beneficial when large calculations are required.
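A timing comparison of this kind can be sketched with `tic` and `toc`. The sketch below is an assumption-laden illustration, not the measurement procedure used for Table 1: it times a direct evaluation of Eq. (1.1) against MATLAB's built-in `fft` (not the `FFT_func` of this paper), uses random test data, and its timings depend on the machine, so the figures in Table 1 should not be expected to reappear.

```matlab
% Time a direct DFT against the built-in fft for the lengths used in Table 1.
for N = [128 256 512]
    x = rand(1, N);                        % random real test sequence

    tic;                                   % direct DFT, O(N^2)
    k  = 0:N-1;
    Xd = zeros(1, N);
    for n = 0:N-1
        Xd(n+1) = sum(x .* exp(-1j*2*pi*n*k/N));
    end
    tDFT = toc;

    tic;                                   % built-in FFT, O(N log N)
    Xf = fft(x);
    tFFT = toc;

    fprintf('N = %4d: DFT %.6f s, FFT %.6f s, max difference %.2e\n', ...
            N, tDFT, tFFT, max(abs(Xd - Xf)));
end
```

The printed maximum difference confirms that both methods compute the same transform, so any gap in the timings reflects the algorithm rather than the result.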