Audio/Image Processing Using 2D FFT

CALIFORNIA STATE UNIVERSITY NORTHRIDGE
Audio/Image Processing in Frequency Domain Using 2D FFT
A graduate project submitted in partial fulfillment of the requirements
For the degree of Master of Science
in Electrical Engineering
By
Ameneh Mousavi
December 2014
The graduate project of Ameneh Mousavi is approved:
__________________________________ ______________________
Dr. Xiyi Hang Date
__________________________________ ______________________
Dr. Ramin Roosta Date
__________________________________ ______________________
Dr. Shahnam Mirzaei, Chair Date
California State University, Northridge
Acknowledgement
I would never have been able to finish my master's project without the guidance of
my advisor, help from my committee members, and support from my family and husband.
I would like to express my deepest gratitude to my advisor, Dr. Shahnam Mirzaei, for
his guidance, caring, patience, motivation, and enthusiasm. I would like to thank
Professor Roosta, who has always believed in me and supported me throughout my
master's studies. His advice, support, and friendship have been invaluable on both
academic and personal levels. I would also like to thank Dr. Hang for his support as
a member of my project committee. I really appreciate his time and consideration
in helping me.
I would like to thank my parents, who have always supported and encouraged me
with their love and dedication. I would never have been able to get here without their
support.
Last but not least, I would like to thank my husband, Roozbeh, who was always there
cheering me up and stood by me through all the good and bad times.
I dedicate this graduate project to
my family, and my beloved husband, Roozbeh
for their constant support and unconditional love.
I love you all dearly.
Table of Contents
Signature page
Acknowledgement
List of Figures
Abstract
Fourier Transform, Fast Fourier Transform and their applications
Introduction to Fourier Transform
Discrete Fourier Transform (DFT) and Fast Fourier transform (FFT)
Cooley-Tukey FFT Algorithm
FFT in Image Processing
FFT and Spectrogram
FPGA design process
Introduction to FPGA
FPGA vs. ASIC
FPGA Architecture
FPGA design process
Design entry
Test development
Behavioral simulation
Design synthesis
Place and route
Timing analysis
Post-synthesis simulation (timing simulation)
FPGA programming
Hardware debug and verification
Chipscope Xilinx test and debug tool
An introduction to chipscope
Chipscope structure
ILA core
ICON core
VIO core
How to connect the chipscope cores and setup the test system
Spectrogram implementation using Matlab and FPGA
spectrogram system implementation using Matlab
FPGA based spectrogram system
FFT Core Implementation in FPGA
Audio processing using implemented spectrogram
Spectrogram Audio result analysis
Image processing using 2D FFT Matlab and FPGA
2D FFT implementation using Matlab
2D FFT implementation on FPGA
Conclusion
References
Appendix A
Appendix B
List of Figures
Figure 1. Fourier transform of cosine function which oscillates 3 cycles per second
Figure 2. Fourier Transform of a step function
Figure 3. Splitting N point DFT to two N/2 point DFTs
Figure 4. Cooley Tukey splitting for 8 point DFT
Figure 5. FFT of an image that has all frequencies
Figure 6. FFT of an image with Vertical wide stripes
Figure 7. FFT of an image with diagonal stripes
Figure 8. Spectrograms of a Wyle’s scream call
Figure 9. Spectrogram of a Wyle’s Moan Call
Figure 10. 3D surface spectrogram of a piece of music
Figure 11. FPGA architecture
Figure 12. Programmable Interconnect details
Figure 13. A basic CLB structure
Figure 14. Logic Cell structure
Figure 15. FPGA design process
Figure 16. Chipscope debug cores connection to core under the test
Figure 17. Matlab code written to compute FFT on a combined Sine wave
Figure 18. FFT computed using Matlab function and self-implemented one, no noise at input
Figure 19. FFT computed using Matlab function and self-implemented one, with noise added to input signal
Figure 20. Matlab code modification to read from a text file to load input of FFT
Figure 21. Matlab code that generates the input file for Matlab and FPGA spectrogram system
Figure 22. Spectrogram of Blueatlx Whale sound
Figure 23. Spectrogram of BluePacx Whale sound
Figure 24. Spectrogram of Eaglet bird sound
Figure 25. Spectrogram of Falcon bird sound
Figure 26. Spectrogram of Mallard Duck quacking sound
Figure 27. Block diagram of the spectrogram system implemented in FPGA
Figure 28. Timing diagram of the control signals of the FFT core
Figure 29. FFT Timing input signals simulation result, generated by the VHDL code (part 1)
Figure 30. FFT Timing input signals simulation result, generated by the VHDL code (part 2)
Figure 31. FFT Timing input signals simulation result, generated by the VHDL code (part 3)
Figure 32. FFT Timing input signals simulation result, generated by the VHDL code (part 4)
Figure 33. pipelined streaming IO FFT core implementation in Xilinx FPGA family
Figure 34. Block diagram of the system with chipscope cores connection
Figure 35. chipscope spectrogram result for sin(50Hz) + sin(120Hz)
Figure 36. Matlab systems spectrogram result for sin(50Hz) + sin(120Hz)
Figure 37. chipscope spectrogram result for Eaglet Bird sound
Figure 38. chipscope spectrogram result for Falcon Bird sound
Figure 39. chipscope spectrogram result for Mallard Duck quacking
Figure 40. chipscope spectrogram result for horned Owl sound
Figure 41. Matlab module to read, process, and save the result automatically (part 1)
Figure 42. Matlab module to read, process, and save the result automatically (part 2)
Figure 43. Spectrogram result for Solo Piano (sample 1)
Figure 44. Spectrogram result for Solo Piano (sample 2)
Figure 45. Spectrogram result for Solo Guitar (sample 1)
Figure 46. Spectrogram result for Solo Guitar (sample 2)
Figure 47. Spectrogram result for Solo Saxophone (sample 1)
Figure 48. Spectrogram result for Solo Saxophone (sample 2)
Figure 49. Spectrogram result for Solo Violin (sample 1)
Figure 50. Spectrogram result for Solo Violin (sample 2)
Figure 51. Spectrogram result for Solo Drum (sample 1)
Figure 52. Spectrogram result for Solo Drum (sample 2)
Figure 53. Spectrogram result for Solo Flute (sample 1)
Figure 54. Spectrogram result for Solo Flute (sample 2)
Figure 55. Spectrogram result for Heavy Metal music (sample 1)
Figure 56. Spectrogram result for Heavy Metal music (sample 2)
Figure 57. Spectrogram result for RAP music (sample 1)
Figure 58. Spectrogram result for RAP music (sample 2)
Figure 59. Spectrogram result for Country Music (sample 1)
Figure 60. Spectrogram result for Country Music (sample 2)
Figure 61. Spectrogram result for ROCK music (sample 1)
Figure 62. Spectrogram result for ROCK music (sample 2)
Figure 63. Spectrogram result for JAZZ music (sample 1)
Figure 64. Spectrogram result for JAZZ music (sample 2)
Figure 65. Spectrogram result for Techno music (sample 1)
Figure 66. Spectrogram result for Techno music (sample 2)
Figure 67. Spectrogram result for classical music from Beethoven (sample 1)
Figure 68. Spectrogram result for classical music from Beethoven (sample 2)
Figure 69. Spectrogram result for classical music from Tchaikovsky (sample 1)
Figure 70. Spectrogram result for classical music from Tchaikovsky (sample 2)
Figure 71. Spectrogram result for classical music from Mozart (sample 1)
Figure 72. Spectrogram result for classical music from Mozart (sample 2)
Figure 73. Spectrogram result for classical music from Vivaldi (sample 1)
Figure 74. Spectrogram result for classical music from Vivaldi (sample 2)
Figure 75. Spectrogram result for classical music from Bach (sample 1)
Figure 76. Spectrogram result for classical music from Bach (sample 2)
Figure 77. Spectrogram result for classical music from Schubert (sample 1)
Figure 78. Spectrogram result for classical music from Schubert (sample 2)
Figure 79. Two different plays of Adoration of the earth
Figure 80. Two different plays of Kiss of the earth
Figure 81. Matlab module for 2D FFT on image (part 1)
Figure 82. Matlab module for 2D FFT on image (part 2)
Figure 83. Image and its 2D FFT result in
Figure 84. Image and its 2D FFT result in Matlab
Figure 85. Image and its 2D FFT result in Matlab
Figure 86. 2D FFT Matlab code (part 1)
Figure 87. 2D FFT Matlab code (part 2)
Figure 88. fft implementation using VHDL (part 1)
Figure 89. 2D fft implementation using VHDL (part 2)
Abstract
Audio/Image Processing in Frequency Domain Using 2D FFT
By
Ameneh Mousavi
Master of Science in Electrical Engineering
The fast Fourier transform (FFT) and similar frequency transforms have many
applications in today's advanced technology. The FFT is a mathematical method for
converting a signal from the time domain to the frequency domain. Frequency-domain
transforms such as the FFT are widely used in image processing and enhancement
techniques. Medical devices such as magnetic resonance imaging (MRI) and computerized
tomography (CT) scanners use FFT-based image processing on images of the patient's body.
The FFT is also used in audio and speech processing. The objective of this project is to
use the FFT to process audio signals and create their spectrograms in order to
differentiate between music styles and instruments.
The work includes processing different animal voices to find their frequency
content and the differences between their voices in different situations. Another part
is to process different music styles and instruments using the spectrogram, to see
whether it can be used to distinguish between the instruments in a performance, music
styles, or musicians without listening to the music itself.
The other area of concentration is using a field-programmable gate array (FPGA) to
implement the spectrogram and adding ChipScope intellectual property (IP) cores to the
hardware, in order to test and debug the implemented spectrogram device and to show the
results on a computer screen. FPGAs were introduced mostly for testing and debugging,
but nowadays, because of their short time to market and ease of use, they are frequently
used to design digital systems in many applications. The hardware modules were designed
in VHDL (VHSIC, Very High Speed Integrated Circuit, Hardware Description Language) and
implemented on an Atlys™ Spartan-6 Xilinx FPGA evaluation board.
Fourier Transform, Fast Fourier Transform and their applications
Introduction to Fourier Transform
The Fourier transform was introduced by Joseph Fourier, building on his well-known
Fourier series, to transform a signal from the time domain to the frequency domain.
There is also an inverse Fourier transform, which performs the transform in the reverse
direction, from the frequency domain to the time domain. Using this method we can find
out which frequency components exist in the input signal. [1]
As mentioned, this essential method originates from the Fourier series,
which rewrites a complicated signal as a sum of sine and cosine components. The
formula used to transform an analog signal is
$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$$
The complex exponential is the main part of the transform and comes from
Euler's formula for the sine and cosine functions. Based on this formula:
$$e^{j\theta} = \cos\theta + j\sin\theta$$
Complex exponentials are periodic and a set of them is complete,
so the Fourier transform can represent a continuous function with little error compared
to the original one. This property has led this transform to be one of the most useful
and functional transforms [2].
In addition to indicating which frequencies exist in a signal, the Fourier
transform also tells us how much of each frequency component is present. [4] A
plain-English metaphor answers some questions about the concept of the Fourier
transform. Here is the metaphor [3]:
- What does the Fourier transform do? Given a smoothie, it finds the recipe.
- How? Run the smoothie through filters to extract each ingredient.
- Why? Recipes are easier to analyze, compare, and modify than the smoothie itself.
- How do we get the smoothie back? Blend the ingredients.
In fact, the Fourier transform finds the frequencies present in any function. Figures 1 [2]
and 2 [4] show two examples of how the Fourier transform works.
Figure 1. Fourier transform of cosine function which oscillates 3 cycles per second
Figure 2. Fourier Transform of a step function
Discrete Fourier Transform (DFT) and Fast Fourier transform (FFT)
The Fourier transform applies to the analog world, so to use this
fascinating tool in the digital world we need a discrete version of it. The DFT is
the discrete version of the Fourier transform. It converts a series of signal
samples into their frequency components. The input and output samples are both complex
numbers. The DFT is one of the most important transforms in digital signal processing.
The goal of having a discrete transform was to perform the Fourier transform on computer
data, and the finite number of samples in the DFT made this possible. The frequency
resolution of the transform depends on the number of input samples, so with more signal
samples we can resolve more frequency components of the input signal. The definition of
the DFT for a series of N complex numbers $x_0, x_1, \ldots, x_{N-1}$ is given by the
equation below [5]:
$$X_k = \sum_{n=0}^{N-1} x_n\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1$$
Efficiency is an important factor in computer processing systems. Computing the DFT
directly from the above formula is not efficient, so a better algorithm was needed to
compute the DFT on discrete data. The FFT is an algorithm that computes the DFT quickly
and efficiently. It does so by factorizing the DFT matrix into a product of sparse
factors; as a result, the FFT is used as the main transform algorithm in many
applications. The direct DFT formula has complexity O(N^2), while the FFT reduces the
complexity to O(N log N), which is fundamental to digital processing speed. There are
several algorithms implementing the FFT; the most common one (which is also used in this
project) is the Cooley-Tukey algorithm. Other FFT algorithms include the prime-factor,
Bruun's, Rader's, and Bluestein's FFT algorithms [6].
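As a minimal sketch of the direct DFT definition discussed in this section (not the Matlab or VHDL code used in the project), the sum can be evaluated in plain Python. The function name `dft` and the 8-sample test signal are illustrative assumptions. The nested loop over k and n is what gives the direct method its O(N^2) cost.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) DFT: X_k = sum over n of x_n * exp(-j*2*pi*k*n/N)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]

# Hypothetical test signal: one full cosine cycle across 8 samples.
signal = [math.cos(2 * math.pi * n / 8) for n in range(8)]
magnitudes = [round(abs(value), 6) for value in dft(signal)]
print(magnitudes)  # energy appears only in bins 1 and 7 (the mirrored bin)
```

For a real-valued input the upper half of the output mirrors the lower half, which is why a single cosine produces two equal peaks.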
Cooley-Tukey FFT Algorithm
The Cooley-Tukey algorithm computes the DFT efficiently by reducing its
computational complexity. The idea was originally devised by Gauss, but it was not
recognized at that time. In 1965 Cooley and Tukey published a paper on this algorithm
and explained how to perform it on a computer. Because digital computers were spreading
at that time and there was a need to compute the DFT quickly, the algorithm gained
recognition. It uses the butterfly method to compute the FFT[7][8][9][10].
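The idea comes from splitting an N-point DFT into two N/2-point DFTs. One is
performed over the odd samples and the other over the even samples. The expressions
below show how splitting the DFT into two N/2-point DFTs reduces the complexity of the
computation[7][8][9][10].
$$X_k = \sum_{n=0}^{N-1} x_n\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1$$
This is the traditional DFT formula. We need N complex multiplications and N-1
complex additions to compute it for each k, so for all N output samples the complexity
is O(N^2). With the FFT we get O(N log N) complexity, which is a big difference
for large N. To start, we define $W_N$ as $e^{-j 2\pi / N}$, split the N points into
two sets of N/2 points, and calculate the DFT on each one. So we have[7][8][9][10]:
$$X_k = \sum_{n\ \mathrm{even}} x_n W_N^{kn} + \sum_{n\ \mathrm{odd}} x_n W_N^{kn}$$
Then we replace the even indices with n = 2r and the odd ones with n = 2r + 1,
r = 0, 1, …, N/2 - 1, and we have:
$$X_k = \sum_{r=0}^{N/2-1} x_{2r} W_N^{2rk} + \sum_{r=0}^{N/2-1} x_{2r+1} W_N^{(2r+1)k}$$
We factor out the terms of W that do not depend on r:
$$X_k = \sum_{r=0}^{N/2-1} x_{2r} \left(W_N^{2}\right)^{rk} + W_N^{k} \sum_{r=0}^{N/2-1} x_{2r+1} \left(W_N^{2}\right)^{rk}$$
Based on the properties of W, we know that
$W_N^{2} = e^{-j 2\pi \cdot 2 / N} = e^{-j 2\pi / (N/2)} = W_{N/2}$; using that, we have:
$$X_k = \sum_{r=0}^{N/2-1} x_{2r} W_{N/2}^{rk} + W_N^{k} \sum_{r=0}^{N/2-1} x_{2r+1} W_{N/2}^{rk}$$
So $X_k = G_k + W_N^{k} H_k$, where $G_k$ and $H_k$ denote the N/2-point DFTs of the even
and odd samples, and the complexity is going to be O(N^2/2 + N). Figure 3
shows how to split the N samples into two groups and then combine them back to generate
the whole N-point DFT[7][8][9][10].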
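The even/odd split described above can be sketched as a short recursive routine. This is an illustrative Python version under stated assumptions, not the FFT core used in the project: the names `fft` and `dft` are hypothetical, the input length is assumed to be a power of two, and the direct `dft` is included only to check the result.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # N/2-point DFT of the even-indexed samples
    odd = fft(x[1::2])   # N/2-point DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # W_N^k times odd term
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle  # since W_N^(k + N/2) = -W_N^k
    return out

def dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

samples = [1, 2, 3, 4, 0, -1, -2, -3]  # arbitrary test data
max_error = max(abs(a - b) for a, b in zip(fft(samples), dft(samples)))
print(max_error < 1e-9)
```

Each pair of assignments inside the loop is one butterfly: two outputs are produced from one even term, one odd term, and a single twiddle-factor multiplication.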
Figure 3. Splitting N point DFT to two N/2 point DFTs
To get the most efficient method, we keep splitting the sample points until we reach
2-point FFTs. To get there we need to split $\log_2 N - 1$ times, which gives
$\log_2 N$ butterfly stages in total. Figure 4 shows the Cooley-Tukey
splitting algorithm for an 8-point input. The order of the input samples after the
splitting is bit-reversed order: when we write the index of an input sample in binary
and then reverse the bits, we get the location of that sample. For example, with the
3-bit indices of the 8-point case, 4 is 100, which reversed gives 001, or 1, and 6 is
110, which reversed gives 011, or 3 [7][8][9][10].
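The bit-reversed input ordering can be illustrated with a few lines of Python. This is a sketch for the 8-point case of Figure 4, which uses 3-bit indices; the helper name `bit_reverse` is an assumption made for illustration.

```python
def bit_reverse(index, bits):
    """Return index with its lowest `bits` binary digits in reversed order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result

# Input ordering for an 8-point FFT (3-bit indices).
order = [bit_reverse(i, 3) for i in range(8)]
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Reading the list position by position, the sample with index 4 (binary 100) lands at position 1 (binary 001), and the sample with index 6 (binary 110) lands at position 3 (binary 011).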
Figure 4. Cooley Tukey splitting for 8 point DFT
Fast Fourier Transform Applications
The FFT is a very useful algorithm that is used across a wide range of engineering,
mathematics, and science. Its applications include integer multiplication
(where, for large numbers, it is more efficient than the “left-shifting-and-adding” or
“Russian Peasant” methods), signal processing (such as capturing the human voice close
to a microphone based on air pressure, or tracing the pattern of the stars at night),
image processing (medical imaging devices such as MRI and CT scanners), and filtering
(because it is fast and efficient, it plays a major role in most filtering processes
and in complex matrix multiplication)[11][12]. All in all, whenever we are looking for
a fast, efficient method of processing a large amount of data, the FFT is the solution.
FFT in Image Processing
As mentioned above, image processing is one of the most important areas of FFT
application. It is used in image analysis, image filtering, image reconstruction, image
recognition (to find particular objects in a picture), image enhancement (improving or
changing the image), and image compression (reducing the size of the picture in order
to make transmission faster or to need less storage space). The FFT of an image, as
displayed here, shows the frequencies whose magnitude is at least 5% of the main peak.
Figures 5, 6, and 7 show some pictures and their FFT results. In these pictures the
other frequencies in the image are less than 1/100 of the DC value, so they do not have
a significant effect on the image[13][14].
The image in Figure 5 contains almost all frequencies, and the magnitude of each
frequency is much less than the DC value, so its FFT is almost entirely black. Figure 6
shows the Fourier transform of an image of vertical stripes 2 pixels wide. If we look at
the result, we see that it contains the DC value and two points corresponding to the
frequency of the stripes in the original image. These two points lie on the horizontal
line through the center because the intensity in the spatial domain changes
horizontally in this picture. Figure 7 shows the Fourier transform of an image of
diagonal stripes. In this figure there are also two frequency points and the DC value
[13][14].
Figure 5. FFT of an image that has all frequencies
Figure 6. FFT of an image with Vertical wide stripes
Figure 7. FFT of an image with diagonal stripes
To process images using the FFT, a two-dimensional FFT is used. One method is to
build the two-dimensional FFT from one-dimensional FFTs, and this method is used in
this project. The picture's pixels are saved in a 2D matrix; a one-dimensional FFT is
applied to every row, with the result overwriting the same row, and then the FFT is
applied to every column. The resulting matrix is the 2D FFT of the original image.
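The row-then-column procedure can be sketched in plain Python as follows. This is an illustrative version under stated assumptions rather than the project's Matlab or VHDL code: `fft` is a simple recursive radix-2 routine, `fft2` is a hypothetical helper name, and the 8x8 test image of vertical stripes 2 pixels wide is made up for the example.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def fft2(image):
    """2D FFT built from 1D FFTs: first along every row, then along every column."""
    rows = [fft(row) for row in image]
    cols = [fft(list(col)) for col in zip(*rows)]  # zip(*rows) walks the columns
    return [list(r) for r in zip(*cols)]           # transpose back to row-major

# Hypothetical 8x8 test image: vertical stripes, 2 pixels wide.
image = [[1, 1, 0, 0, 1, 1, 0, 0] for _ in range(8)]
spectrum = fft2(image)
peaks = [(u, v) for u in range(8) for v in range(8) if abs(spectrum[u][v]) > 1e-9]
print(peaks)  # (row, column) positions of the non-zero coefficients
```

All non-zero coefficients fall in row 0: the DC term at column 0 and one pair of mirrored points at columns 2 and 6. Row 0 of this unshifted output is the zero vertical frequency, which corresponds to the horizontal line through the center of a centered spectrum display.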
FFT and Spectrogram
A spectrogram is a representation of the spectrum of frequencies of a signal that
varies with time or some other variable. This tool is used to analyze audio
signals such as animal calls, music, and human speech. The X-axis of the spectrogram
graph shows the frequencies and the Y-axis shows the amplitude of each frequency [19].
Using the spectrogram in speech processing, we can detect every phoneme by its own
unique frequency. The phonemes also combine in a detectable way to create vowels and
words. A spectrogram of human speech is therefore very useful, and it is widely used in
the study of phonetics and in speech synthesis [15].
Animal calls, like those of the Wyle, have a characteristic frequency which can be
detected using a spectrogram. Spectrograms can also be used to recognize, identify, and
interpret bird calls. They are used in correcting speech defects and in speech training
for people who are deaf. They are also used in speech filtering and in the development
of radio-frequency (RF) and microwave systems [16]. Figure 8 shows the spectrogram of a
Wyle’s scream call, Figure 9 shows a Wyle’s moan call, and Figure 10 shows a 3D surface
spectrogram of part of a piece of music. As you can see, each scream starts high and
drops fast, while the moan lasts longer than the scream [17].
Figure 8. Spectrograms of a Wyle’s scream call
Figure 9. Spectrogram of a Wyle’s Moan Call
Figure 10. 3D surface spectrogram of a piece of music
One way to implement a spectrogram is to use the FFT to get the frequency
components and then calculate the amplitude of the FFT result. This amplitude is
the spectrogram result, which shows how much of each frequency is present in the
input signal. The spectrogram in this project is implemented using the FFT.
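The amplitude computation can be sketched for a single frame of samples. The example below is an illustrative Python version, not the project's Matlab or FPGA implementation. It uses a test signal made of 50 Hz and 120 Hz sine waves, and it assumes a hypothetical sample rate of 1024 Hz and a frame of 1024 samples so that each FFT bin is exactly 1 Hz wide.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

sample_rate = 1024  # Hz, assumed for illustration
n_samples = 1024    # one frame, giving 1 Hz per FFT bin

# Test signal: sum of a 50 Hz and a 120 Hz sine wave.
frame = [math.sin(2 * math.pi * 50 * n / sample_rate)
         + math.sin(2 * math.pi * 120 * n / sample_rate)
         for n in range(n_samples)]

# Amplitude of each frequency component (the spectrogram result for this frame).
magnitude = [abs(value) for value in fft(frame)]
half = magnitude[: n_samples // 2]  # keep 0 Hz up to just below sample_rate / 2
peaks = [k * sample_rate // n_samples
         for k, m in enumerate(half) if m > 0.5 * max(half)]
print(peaks)  # frequencies in Hz with large amplitude
```

Only the first half of the FFT output is kept because, for a real-valued input, the second half mirrors the first.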
FPGA design process
Introduction to FPGA
The first kinds of programmable devices were programmable read-only memories (PROMs)
and programmable logic devices (PLDs). Both could be programmed at the factory. Over
time, companies such as Altera and Xilinx started working on programmable devices and
introduced complex programmable logic devices (CPLDs) and FPGAs to the electronics
market. The CPLD was the generation before the FPGA, and its complexity lies somewhere
between programmable array logic devices (PALs) and FPGAs. FPGAs belong to the
class of integrated circuits that are designed to be configured by customers to build
their desired systems. The original purpose of programmable gate arrays was to test
and debug ASICs before manufacturing them. But because of their features and
capabilities, they have come to be used to implement custom digital systems in their
own right [18].
One of the most important features of the FPGA, which has led to its wider use by
electronics development companies, is that it is reprogrammable.
Designers can design, test, and debug their final product on an FPGA several times, and
even after manufacturing they can add new features and modify their design just by
reprogramming the FPGA used in the product. The other essential property of the FPGA is
its short time to market. In today's electronics world, where thousands of new products
are introduced to the market every day, the ability to build your product as early as
possible is vital.
In addition to many logic arrays, block RAMs, and I/O pins, new FPGA devices contain
soft or hard processor cores. Some of them even include a hard analog-to-digital
converter, which enables them to implement mixed-signal systems. For instance, Altera's
SoC FPGA families contain a dual-core ARM Cortex-A9 processor, and its Cyclone II
family supports the Nios II soft-core CPU. The Xilinx Zynq-7000 also has an ARM-based
processing system.
FPGA vs. ASIC13
ASICs are customized ICs14 designed for one very specific use rather than for
general-purpose systems. Modern ASICs include microprocessors, RAMs15, ROMs16, Flash
memories, EEPROM17 and other large blocks. An ASIC is mostly used to design a very
large, specific system with a lot of logic that must consume little power or run too fast
to be implemented in an FPGA. In fact, ASICs are designed to be fully optimized in
terms of logic, gates, power, and area [19] [20].
FPGAs and ASICs both implement complex designs at a high level of
performance. Both use HDLs18 such as VHDL and Verilog to implement the
logic. Each has advantages and disadvantages compared to the
other, and designers should choose between them based on their needs.
13 Application-Specific Integrated Circuits
14 Integrated Circuits
15 Random Access Memories
16 Read-Only Memories
17 Electrically Erasable Programmable Read-Only Memories
18 Hardware Description Language
FPGA design has no layout, masks or other manufacturing steps, so it has a faster
time to market. Designers do not need to worry about NRE19, the one-time cost of
development and manufacturing setup. The design cycle is simpler because
the software handles most of the routing, timing, and placement.
These are the steps that take most of the design and development time in an ASIC flow,
and the FPGA tools largely automate them. Reprogrammability is another feature of
FPGA design: a new bitstream can be uploaded immediately at no extra time
or cost, whereas in ASIC design the same change can cost $50,000 or more and take about
a month. Reusability is an essential advantage of the FPGA: a prototype can be built
on an FPGA and, if there is a bug, reprogrammed and retested. On the other hand, FPGAs
are best suited to small-volume designs, their power consumption is higher than that of
an ASIC, and the design is limited to the resources available inside the FPGA [21][22][23].
ASIC design has a lower unit cost for very high volume production. For very
large, high-volume designs, for example more than 250 K in logic density, an ASIC costs
less than an FPGA-based development. ASICs offer the full capability of custom design.
For designs in which low power and high speed are crucial, an ASIC can be a good
choice. The amount of logic is also not limited, so the design can be as big as
needed. The design flexibility of an ASIC allows more speed
optimization. Another feature of ASIC design is the ability to implement
analog and mixed-signal designs inside the ASIC, although FPGAs are also moving
toward mixed-signal capability [24].
19 Non-Recurring Engineering
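The cost trade-off described above can be illustrated with a small break-even calculation. The following is only a sketch in Python (not part of the project code), and every number in it is hypothetical: the ASIC is assumed to carry a one-time NRE cost and a low unit cost, while the FPGA is assumed to have no NRE and a higher unit cost.

```python
import math

def total_cost(nre, unit_cost, volume):
    """Total project cost: one-time NRE plus per-unit cost times volume."""
    return nre + unit_cost * volume

def break_even_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Smallest volume at which the ASIC total cost is no higher than the FPGA's."""
    if fpga_unit <= asic_unit:
        raise ValueError("the ASIC never breaks even unless its unit cost is lower")
    return math.ceil((asic_nre - fpga_nre) / (fpga_unit - asic_unit))

# Hypothetical figures, chosen only for illustration (not taken from the references).
ASIC_NRE, ASIC_UNIT = 1_000_000, 5
FPGA_NRE, FPGA_UNIT = 0, 45

volume = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print("break-even volume:", volume)
```

Below the break-even volume the FPGA is cheaper in total; above it, the lower unit cost of the ASIC outweighs its NRE.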
FPGA Architecture
FPGAs are programmable logic devices whose CLBs20 are connected through programmable
interconnects to build the designer's desired system. Some OTP21 FPGAs are available, but
most are SRAM22 based and can be programmed many times during
design development. This programmability lets engineers change and modify
their design easily during the design process; even after the product has been
manufactured, modifications and new features are possible, at no extra cost
compared to an ASIC design. The architecture of an FPGA consists of
CLBs, programmable interconnects, I/O blocks, and block RAMs. The CLB is one of the
basic units of an FPGA, and each FPGA has a certain number of CLBs based on its size
[25][26][27].
A basic CLB has configurable switches and some logic cells. Each logic cell
has some LUTs, shift registers, multiplexers and flip-flops, so that a
combinational or sequential design can be created using the configurable switches.
Programmable interconnects provide the routing between CLBs, between CLBs and I/O
blocks, and for clocks inside the system. I/O blocks connect the outside of the
FPGA to the inside. There are different I/O banks throughout the FPGA, and each supports
certain I/O standards. The number of I/Os in new FPGAs has increased considerably. Block
RAMs are used to build memory elements inside the design, so on-chip memory is
available to the design. Figures 11 to 14 illustrate the FPGA architecture and some
details of the interconnects, CLBs and logic cells [25][26][27].
20 Configurable Logic Blocks
21 One Time Programmable
22 Static Random Access Memory
Figure 11. FPGA architecture
Figure 12. Programmable Interconnect details
Figure 13. A basic CLB structure
Figure 14. Logic Cell structure
Using the interconnects, CLBs, switches, and block RAMs, the logic implemented
inside the FPGA can be as simple as a counter or as complicated as a
processor. Whenever a bit file is loaded into the FPGA, each CLB implements a
particular logic function using its LUTs, and the interconnect switches then connect the
CLBs together to form the whole system. FPGAs use different programming technologies;
SRAM, anti-fuse, and EPROM are among those currently in use. In the SRAM model, the
switches are controlled by SRAM bits, and the FPGA loses its configuration when the
system is powered off. In the anti-fuse model, programming an anti-fuse creates a
low-resistance path; the advantage of the anti-fuse is its small size.
Finally, in EPROM-based programming the switch is a floating gate, which can be
turned off by injecting charge into it. An important feature of the EPROM is that the
FPGA does not lose its programmed data after power off; when the system is turned on
again, the configuration is still there [25][27].
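The way a LUT implements "a particular logic" can be illustrated with a small software model. The sketch below is written in Python purely for illustration (it is not vendor code, and the names make_lut and half_adder are hypothetical): a k-input LUT is modelled as a truth table of 2^k bits that is addressed by the input bits, so changing the stored bits changes the logic function without changing the hardware.

```python
def make_lut(truth_table):
    """Return a function modelling a k-input LUT configured with `truth_table`.

    truth_table[i] is the output bit for the input pattern whose binary value
    is i (the first input is the most significant address bit).
    """
    size = len(truth_table)
    if size < 1 or size & (size - 1):
        raise ValueError("truth table length must be a power of two")
    k = size.bit_length() - 1

    def lut(*inputs):
        if len(inputs) != k:
            raise ValueError(f"expected {k} inputs, got {len(inputs)}")
        address = 0
        for bit in inputs:
            address = (address << 1) | (1 if bit else 0)
        return truth_table[address]

    return lut

# Two different "configurations" of the same 2-input LUT structure.
xor2 = make_lut([0, 1, 1, 0])
and2 = make_lut([0, 0, 0, 1])

def half_adder(a, b):
    """Two LUTs connected together, as interconnects would join logic cells."""
    return xor2(a, b), and2(a, b)  # (sum, carry)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

In a real device the truth-table bits come from the loaded bit file, and the connection between the two LUTs is made by the programmable interconnect rather than by a function call.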
FPGA design process
The process of implementing a design on an FPGA can be divided into several
steps: design entry, test development, behavioral simulation, design
synthesis, functional simulation, place and route, timing analysis, post-synthesis
simulation (timing simulation), FPGA programming, and on-hardware debug and verification.
Figure 15 shows the process diagram [28].
[Figure 15 flowchart: Design Entry, Test Development, Behavioral Simulation, Design
Synthesis, Functional Simulation, Place/Route/Implementation, Timing Analysis, Timing
Simulation, FPGA Programming, Hardware Debug, Done. A "No" result at a check returns the
flow to an earlier step.]
Figure 15. FPGA design process
Design entry
The first step in FPGA design is design entry. In this step designers convert the
design ideas into a state machine, HDL code, or a schematic design. The designer
chooses one of these methods based on the needs of the design. Hardware
description languages such as VHDL and Verilog are the most commonly used. They
can describe anything from a very simple system to a very complex one. Because they
describe hardware, they are inherently parallel, unlike
regular software programs. HDLs are among the best methods for design entry
because they let designers port their designs to other environments
easily, whereas schematic design entry is not flexible and makes it hard to port the
design to other platforms [29][30].
Test development
Once the design exists as HDL code or a schematic, it is time to test whether it works
as intended. To verify the functionality of the design we need test cases that
cover the different parts of the system. The test cases should be comprehensive, so that
most problems are found and resolved before the hardware test. The Tcl23 scripting
language or similar tools can be used to write test cases for the design.
23 Tool Command Language
Behavioral simulation
With the test cases ready, simulation tools such as ModelSim or
Aldec Riviera can be used to simulate the behavior of the design and check that it
works properly. At this stage we perform RTL24 simulation, the first of several levels
of simulation in the FPGA design flow. Behavioral simulation
is a high-level simulation and does not consider any actual gate delays.
It lets us find as many bugs as possible; we go back to the code and modify it until the
simulation passes all test cases, and once we are confident that the design works, we
continue to the synthesis step [29].
Design synthesis
One of the main steps in FPGA design is synthesis. In this step the synthesis
tool converts the high-level behavioral HDL code into a netlist of the real logic
primitives offered by the vendor. The synthesis tool applies different optimization
methods to make the netlist as efficient as possible [80]. The HDL code may contain
unsynthesizable constructs that cause the tool to report errors, in which case the
designer must go back to the code and modify it until synthesis completes successfully.
Place and route
After synthesis, a tool is needed for place and route. Most
current tool suites combine the synthesizer and the place-and-route tool in the same
software. In this stage the tool takes the netlist generated by the synthesis tool and a
constraint file, and tries to fit the design inside the target FPGA device while meeting
the constraints. It interconnects all the primitives so as to meet the timing
requirements. The most important constraints are speed and delay. If the tool cannot
reach the required performance it reports constraint violations, and the designers need
to go back to the design and make it more efficient until place and route passes with no
errors [29][30].
24 Register Transfer Level
Timing analysis
The timing analysis tool checks the design after place and route to verify that all
timing requirements are met. It reports any paths that do not meet the timing
constraints, and we then have to go back to the synthesizer, or even to the code, and
modify the design until it passes timing analysis.
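What "meeting timing" means for a single path can be shown with a short worked calculation. The sketch below is in Python and is only an illustration under simplifying assumptions: clock skew and jitter are ignored, and the path names and delay values are made up rather than taken from the project's timing report.

```python
def setup_slack_ns(clock_period_ns, clk_to_q_ns, logic_delay_ns,
                   routing_delay_ns, setup_ns):
    """Setup slack of one register-to-register path; negative means a violation."""
    arrival = clk_to_q_ns + logic_delay_ns + routing_delay_ns
    required = clock_period_ns - setup_ns
    return required - arrival

# Hypothetical 100 MHz constraint (10 ns period) and made-up path delays:
# (clock-to-output, logic delay, routing delay, setup time), all in ns.
paths = {
    "path_a": (0.5, 3.0, 2.0, 0.4),
    "path_b": (0.5, 6.5, 3.0, 0.4),
}

for name, (tcq, tlogic, troute, tsu) in paths.items():
    slack = setup_slack_ns(10.0, tcq, tlogic, troute, tsu)
    status = "MET" if slack >= 0 else "VIOLATED"
    print(f"{name}: slack = {slack:+.2f} ns ({status})")
```

A path with negative slack is the kind of result that sends the designer back to the constraints, the synthesis options, or the code.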
Post-synthesis simulation (timing simulation)
After timing analysis we have a netlist consisting of primitives
with their real timing specifications, together with all the delays and paths in the
system. To verify that the timing does not cause any functional issue, we use this
netlist to run a post-synthesis simulation, or timing simulation, which takes all timing
into account. If the simulation does not pass, designers should go back to the code and
modify the system to resolve the issue [29].
FPGA programming
After the timing simulation, the last step is to test the real hardware to make sure
no issues remain. To debug on the hardware, we first use the
programmer tool to load the generated bit file into the FPGA. Modern FPGAs have a
JTAG25 port that can be used to program and test the FPGA.
Hardware debug and verification
This is the last step in the FPGA design process. After programming the FPGA through the
JTAG port, tools such as a logic analyzer can be used to debug and test the real
hardware while it is running. Software such as Quartus II and ISE has its own
internal logic analyzer, to which designers can add internal signals and check their
values while the hardware is running. Debugging on hardware is hard and time consuming
because of the memory and pin limitations of the logic analyzer, so the best approach is
to do most of the testing in simulation first, where all the signals are
available and tracing a problem to its root is much easier than on hardware.
25 Joint Test Action Group
ChipScope: Xilinx test and debug tool
An introduction to ChipScope
As mentioned in the previous section, FPGA design tools offer the option of adding an
internal logic analyzer to the design and testing the hardware with it.
ChipScope is the internal logic analyzer of the Xilinx tools.
ChipScope adds integrated logic analyzer (ILA), integrated controller (ICON), and
virtual I/O (VIO) cores to the design, allowing internal signals to be observed. The
signals are captured using the system clock and displayed in the tool. The key features
of this tool include [31][32]:
- Fast and easy setup
- Uses JTAG to interface with the hardware; no other pins are required
- Debug ports can be added directly in the HDL code
- Analyzes all internal signals, even signals of an embedded processor
- ChipScope core insertion is part of the tool flow
- Provides full internal visibility
- Minimizes the number of external pins required for debug purposes
ChipScope also has limitations, mainly because it is embedded
inside the FPGA. These limitations are [33]:
- Limited sample memory. The tool is embedded in the design
and uses the logic and memory left over in the FPGA to capture
signals, so the resources available to ChipScope depend on the size of the design and of
the FPGA itself. In a design that uses most of the memory, there might not be
enough left to add ChipScope.
- ChipScope cannot sample as fast as a real logic analyzer, because its sampling rate
is limited to the design clock rate, so it cannot show
glitches in the design.
ChipScope structure
Adding ChipScope to a design requires three different cores: ILA, ICON, and VIO.
ILA core
The first and main core to add is the ILA core, which links the project
to ChipScope. The ILA is instantiated beside the design, and all trigger
signals (the signals that we want to monitor) are connected to it. The ILA is the
capture core: it captures signal values and sends them to be displayed. It is a
customizable logic analyzer core that monitors signals within the design, and it has
many features of a real logic analyzer, such as storage and trigger conditions. The user
can select the trigger width, depth and data. It has multiple trigger ports, and the
core stores samples only when the trigger condition is met [34][35].
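The capture behavior described above (store samples only once a trigger condition is met, up to a fixed depth) can be modelled in a few lines. The sketch below is a behavioral illustration in Python, not the ILA core itself; the function name capture, the pre_trigger parameter, and the sample fields are assumptions made for the example.

```python
from collections import deque

def capture(samples, trigger, depth, pre_trigger=0):
    """Return up to `depth` samples around the first sample satisfying `trigger`.

    `pre_trigger` samples seen just before the trigger are kept; the rest of
    the window is filled with the trigger sample and the samples after it.
    """
    history = deque(maxlen=pre_trigger)  # rolling storage before the trigger
    window = None
    for sample in samples:
        if window is None:
            if trigger(sample):
                window = list(history) + [sample]
            else:
                history.append(sample)
        else:
            window.append(sample)
        if window is not None and len(window) >= depth:
            break
    return window or []

# Hypothetical signal trace: out_valid goes high at cycle 5.
trace = [{"cycle": i, "out_valid": int(i >= 5)} for i in range(20)]
window = capture(trace, lambda s: s["out_valid"] == 1, depth=4, pre_trigger=1)
print([s["cycle"] for s in window])
```

The fixed `depth` plays the role of the sample memory: a deeper window shows a longer time range but needs more on-chip memory.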
ICON core
The ICON core is the interface between the JTAG port of the FPGA and the other ChipScope
cores such as ILA and VIO. It provides a communication path over JTAG between the
ChipScope software and the ILA and VIO cores, and it supports up to 15 core connections
[36].
VIO core
VIO is the other core that must be connected to make ChipScope ready for testing the
design. It is a customizable core that can both monitor and drive FPGA signals. It also
has detectors for rising and falling edges of the samples. It provides virtual LEDs
and other indicators through its input ports, and virtual buttons and controls through
its output ports [37].
How to connect the ChipScope cores and set up the test system
The first step before using ChipScope is to have a compiled project ready in the ISE
Design Suite. Then the ILA core is instantiated beside the top module so that
triggers can be connected to it. Triggers are the signals that we would like to monitor
using ChipScope. The top module may need to be modified to bring out the internal
signals to be monitored, because every monitored signal must come out of the top
module. Based on the number of signals to monitor, we generate an ILA
debug core with exactly that number of trigger ports, and each trigger must be given the
same width as its corresponding signal [34][38][39].
After that we generate the VIO and ICON cores and connect them together. Figure 16
shows the connection between these three debug cores and the main top design core.
Once everything is connected, the whole project has to be compiled, synthesized, and
implemented again. When the new bit file is ready, we
reprogram the FPGA with it and then use the option “analyze design using
chipscope” to bring up the ChipScope window. On the trigger page, the signal tab
lists all existing triggers. They can be renamed to match the signal names, and
trigger conditions can then be added on the trigger setup page. After adding the trigger
conditions to check, we run ChipScope; it triggers and shows the signal
values when the condition is met [34][38][39].
Figure 16. ChipScope debug cores connection to the core under test
Another feature of ChipScope is the ability to show signals in analog
form. For instance, if an input or output is a sine wave, the
bus plot tab shows the signal as an analog waveform instead of digital values that
are hard to interpret. To make the signal look like a real waveform, change
the bus radix to decimal and then run ChipScope. As
mentioned before, when ChipScope is triggered the result goes to the ChipScope
memory; the size of this memory can be increased to
capture a longer time range of triggered data [33][34].
Spectrogram implementation using Matlab and FPGA
Spectrogram system implementation using Matlab
As mentioned in previous chapters, one way to implement a spectrogram is
to use the FFT. The base core is an FFT module, and the spectrogram output is obtained
by calculating the amplitude of the FFT result. As
the first step, I used Matlab to build a reference system against which the
FPGA result can be compared, to make sure the system implemented on the FPGA works as
intended. Matlab has its own FFT function, but to become more familiar with the FFT
method, I also wrote Matlab code that implements the FFT from its formula.
In the end there were two different Matlab spectrogram systems: one that uses the
Matlab FFT function, and one that first implements the FFT and then uses it to produce
the spectrogram output.
Because audio files do not contain very high frequencies, the decision was to use a
1024-point FFT core. Figure 17 shows the Matlab spectrogram system using both the Matlab
FFT function and the self-implemented FFT function, and how their results are compared
to make sure the implemented formula behaves exactly like the Matlab function.
The sum of a 50 Hz and a 120 Hz sinusoid was given to both as
input to check their functionality. For this input, the result should contain
just two frequency peaks and nothing else. Figure 18 shows the result
from both modules; as can be seen, the results are almost the same. As another test, I
added noise to the same input and obtained the result shown in figure 19.
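The comparison described above can be reproduced in outline. The following is a minimal sketch in Python rather than the Matlab code of figure 17: it evaluates the DFT directly from its formula, evaluates it again with a recursive radix-2 FFT (standing in for the built-in function), and checks that the amplitude spectrum of the 50 Hz + 120 Hz test signal has its two largest peaks where expected. The sampling rate of 1024 Hz is an assumption chosen so that each of the 1024 bins is exactly 1 Hz wide.

```python
import cmath
import math

def dft(x):
    """Direct evaluation of the DFT definition, O(N^2)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

N = 1024      # frame length used in the project
FS = 1024.0   # assumed sampling rate in Hz (1 Hz per bin)
signal = [math.sin(2 * math.pi * 50 * t / FS) + math.sin(2 * math.pi * 120 * t / FS)
          for t in range(N)]

amp_fast = [abs(v) for v in fft(signal)]
amp_ref = [abs(v) for v in dft(signal)]

# Two largest bins in the first half of the spectrum (the second half mirrors it).
peaks = sorted(sorted(range(N // 2), key=lambda k: amp_fast[k])[-2:])
max_diff = max(abs(a - b) for a, b in zip(amp_fast, amp_ref))
print("peak bins:", peaks, "largest difference:", max_diff)
```

Taking the absolute value of each complex output is the amplitude step that turns the FFT result into one column of the spectrogram.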
Figure 17. Matlab code written to compute FFT on a combined sine wave
Figure 18. FFT computed using the Matlab function and the self-implemented one, no noise at input
Figure 19. FFT computed using the Matlab function and the self-implemented one, with noise added to the input signal
The final plan was to run different audio files through the
spectrogram and compare the results, so the Matlab code had to be
flexible enough to accept different inputs easily. For this purpose I added reading
from a text file to the input processing part of the code, so that different
input sample files can be supplied easily. Figure 20 shows this modification to the
code.
Figure 20. Matlab code modification to read from a text file to load input of FFT
The plan is to process different music files, so a module is needed
that can read different music files and convert them to binary to be processed
by the Matlab spectrogram and the FPGA. For the FPGA, it has to generate the file
in COE format, which is the RAM initialization file format. Another Matlab script
was therefore written to read a wave or au file, convert it to binary, and write it to a
text file. Figure 21 shows the Matlab code for this module.
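The conversion step can be sketched as follows. This is an illustration in Python, not the Matlab script of figure 21; the 16-bit sample width, the scaling, and the function names are assumptions. It quantizes normalized samples to signed integers, writes each one as a two's-complement binary word, and wraps the words in the two header lines of a Xilinx COE file.

```python
def quantize(x, width=16):
    """Scale a float in [-1.0, 1.0] to a signed integer of `width` bits."""
    full_scale = (1 << (width - 1)) - 1
    return max(-full_scale, min(full_scale, round(x * full_scale)))

def to_twos_complement(value, width):
    """Return `value` as a `width`-bit two's-complement binary string."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value & ((1 << width) - 1), f"0{width}b")

def coe_text(samples, width=16):
    """Build the contents of a radix-2 COE file from integer samples."""
    words = ",\n".join(to_twos_complement(s, width) for s in samples)
    return ("memory_initialization_radix=2;\n"
            "memory_initialization_vector=\n"
            f"{words};\n")

# Small example with 4-bit words so the output is easy to read.
print(coe_text([1, -2, 7, -8], width=4))
```

The resulting text would be saved with a .coe extension and selected as the initialization file when the ROM core is generated.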
Figure 21. Matlab code that generates the input file for Matlab and FPGA spectrogram system
According to research, animals produce sounds with different frequencies
in different situations. For example, the frequency of a whale's scream is
different from that of its moan call, so a spectrogram can be used to detect these
different sounds. For one part of my experiment I applied the spectrogram to different
animal and bird sounds. Figures 22 and 23 show the spectrogram results for two
different whales. As can be seen, they have frequencies only in some regions, not across
the whole spectrum. Figures 24, 25, and 26 show the spectrograms of the sounds of
several birds. Now that we have the reference system, the next step is to implement the
FPGA system and see how closely it matches the Matlab code.
Figure 22. Spectrogram of Blueatlx Whale sound
Figure 23. Spectrogram of BluePacx Whale sound
Figure 24. Spectrogram of Eaglet bird sound
Figure 25. Spectrogram of Falcon bird sound
Figure 26. Spectrogram of Mallard Duck quacking sound
FPGA based spectrogram system
The target FPGA is a Xilinx Spartan-6 XC6SLX45 on the Digilent Atlys
evaluation board, so the software used to compile and implement the system is ISE from
Xilinx. VHDL is the language used to implement the system at RTL.
Figure 27 shows the structure of the system to be implemented.
Figure 27. Block diagram of the spectrogram system implemented in FPGA
There is a 1024-point FFT core that gets its real and imaginary inputs from two
ROMs; once its output is ready, an amplitude calculator generates the final
output. Several input and control signals are needed to make the FFT core work; figure 28
illustrates the timing diagram of the control signals in relation to the main real and
imaginary inputs [40]. All the control signals are generated in the main top module.
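The amplitude calculation applied to each complex FFT output can be sketched as below. This is a Python illustration of the arithmetic only; it does not describe how the VHDL amplitude calculator is built. The second function shows a multiplier-free approximation that is sometimes used in hardware, included here purely as an assumption for comparison.

```python
import math

def amplitude(re, im):
    """Integer magnitude of a complex sample: floor(sqrt(re^2 + im^2))."""
    return math.isqrt(re * re + im * im)

def amplitude_approx(re, im):
    """Estimate using max + min/2; cheaper than a square root, but not exact."""
    a, b = abs(re), abs(im)
    return max(a, b) + min(a, b) // 2

for re, im in [(3, 4), (-300, 400), (100, 0)]:
    print((re, im), amplitude(re, im), amplitude_approx(re, im))
```

For the pair (-300, 400) the exact amplitude is 500 while the approximation gives 550, which shows the kind of error such a shortcut introduces.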
Figure 28. Timing diagram of the control signals of the FFT core
The start control signal must be pulsed before the first input sample is valid.
After the pulse on start, all 1024 input samples are sent one by one at
the rising edge of the clock. After start, the FFT core counts the input index until it
reaches 1024, which is the last sample. Then the busy signal goes high, which shows that
the system is working and is not ready to receive new input yet. When the FFT is done,
the out_valid signal goes high and output_index shows the index of the output sample
until it reaches 1024, which is the last output sample. After out_valid goes low the
transform is finished, and another one can be started by sending another start pulse to
the core. Figures 29, 30, 31, and 32 show the simulation results for the system: how
the timing signals were generated and how the system worked and produced the output
signals. They also show the output control signals generated by the simulated FFT in
response to the input control signals. These figures only demonstrate that the timing
signals are generated correctly; for the output check, some waveforms
captured by ChipScope are shown later.
Figure 29. FFT Timing input signals simulation result, generated by the VHDL code (part 1)
FFT Core Implementation in FPGA
There are different implementation options for the FFT core on Xilinx FPGAs.
Pipelined Streaming I/O is the one used in our design. This structure offers
continuous processing by using several butterfly processing engines. Each butterfly
engine has its own memory to store the input and intermediate data. Because of
this structure, the core can load input data for the next frame, process
the current frame, and unload the result of the previous frame at the same time. Users
can load data continuously and, after the output latency, receive data from the core
continuously. This is one of the advantages of a pipelined processing system. The core
can also process data frame by frame with a gap between frames. Figure 33 shows the
structure of this FFT core. This architecture covers FFT point sizes up to 65536 [40].
Figure 33. Pipelined streaming I/O FFT core implementation in Xilinx FPGA family
Now that the timing signals are generated correctly, we need to check whether the
result matches Matlab (our reference system). To check the FPGA system,
ChipScope has to be set up so that the output amplitude can be seen in analog form.
The ILA, ICON, and VIO cores have all been added and connected to the main core. For the
ILA core we need to determine how many trigger signals to monitor. A trigger
signal here means a signal that we want to debug or monitor to see whether it is working
correctly. When a trigger is added to the ILA, its width must also be defined. I
added about 10 trigger signals, such as start, out_valid, busy, input_real,
input_imaginary, output_amplitude, out_index, and input_index. Figure 34 shows the block
diagram of the FPGA system with the ChipScope cores added for debug purposes.
Now the same input samples that were given to the Matlab systems can be given to the
FPGA and the results compared. As mentioned, ROMs store the input samples and
send them to the FFT core. To initialize a ROM we need a COE file that contains
all the samples; the output of the Matlab code mentioned earlier is used for this. The
first input given to the FPGA was the combined sine wave that was
first given to Matlab. Figures 35 and 36 show the results from the FPGA (captured
with ChipScope) and from the Matlab code, respectively. This is the FFT of the sum of
the 50 Hz and 120 Hz sinusoids used before. Figures 37, 38, 39, and 40 are further
examples. After running several different input samples, we can say that the FPGA system
works like our Matlab systems.
Figure 34. Block diagram of the system with ChipScope cores connection
Figure 35. ChipScope spectrogram result for sin(50Hz) + sin(120Hz)
Figure 36. Matlab systems spectrogram result for sin(50Hz) + sin(120Hz)
Figure 37. ChipScope spectrogram result for Eaglet Bird sound
Figure 38. ChipScope spectrogram result for Falcon Bird sound
Figure 39. ChipScope spectrogram result for Mallard Duck quacking
Figure 40. ChipScope spectrogram result for horned Owl sound
The experiment planned with the spectrogram is to take audio
samples from different music styles and instruments and compare their results, to see
whether there is an obvious difference between them that would let the spectrogram
identify a particular music type or instrument without listening to it. Another
experiment is to take two different performances of the same piece to see whether there
is any frequency difference between them. One more experiment is to take music samples
in the same style but from two different musicians, to see whether they can be
distinguished using the spectrogram without listening to them.
Audio processing using implemented spectrogram
The purpose of implementing the spectrogram is to process music files in
different categories and find out whether they can be differentiated by looking at the
spectrogram result without listening to them. Because processing all the
files with the FPGA-based system would take a lot of time, the Matlab version was used
for the experiment.
There are many audio files to process, and for each one the code has
to be run 7-8 times, because the spectrogram is applied to at least 7-8 parts of the file
to obtain a more precise result. Even though Matlab runs fast, it takes time to change
the sample numbers, change the file name, and run and save the plot each time. Another
script was therefore written that reads all files automatically, applies the process to
different parts of each file, plots the results, and saves them in the workspace folder,
so that they can be compared when all of them are done. Figures 41 and 42 show the Matlab
module written for this purpose.
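The segment selection performed by such a batch script can be sketched as follows. This is a Python illustration, not the Matlab module of figures 41 and 42; the choice of evenly spaced segments, the default of 8 segments, and the function names are assumptions, since the text only states that 7-8 parts of each file are processed.

```python
def segment_starts(total_samples, frame_len=1024, segments=8):
    """Start indices of `segments` evenly spaced frames of `frame_len` samples."""
    if total_samples < frame_len:
        return []
    if segments == 1:
        return [0]
    last_start = total_samples - frame_len
    return [round(i * last_start / (segments - 1)) for i in range(segments)]

def frames(samples, frame_len=1024, segments=8):
    """Yield (start, frame) pairs ready to be passed to an FFT routine."""
    for start in segment_starts(len(samples), frame_len, segments):
        yield start, samples[start:start + frame_len]

# Hypothetical file of 8192 samples: eight back-to-back 1024-sample frames.
audio = list(range(8192))
for start, frame in frames(audio):
    print(start, len(frame))
```

A full batch run would loop this over every audio file, compute the amplitude spectrum of each frame, and save one plot per frame.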
Figure 41. Matlab module to read, process, and save the result automatically (part 1)
Figure 42. Matlab module to read, process, and save the result automatically (part 2)
The music styles tried include Rock, Jazz, Heavy Metal, Rap,
Country, Techno, and Classical. The second category is solo music, such as Solo
Drum, Solo Guitar, Solo Piano, and Solo Violin. By processing the second category we
wanted to see whether there is any obvious difference between instruments. The final
category is the same piece in different performances, to see whether there is any
difference between two performances. Figures 43-78 show samples of the
spectrogram result for each music style or instrument.
Figure 43. Spectrogram result for Solo Piano (sample 1)
Figure 44. Spectrogram result for Solo Piano (sample 2)
Figure 45. Spectrogram result for Solo Guitar (sample 1)
Figure 46. Spectrogram result for Solo Guitar (sample 2)
Figure 47. Spectrogram result for Solo Saxophone (sample 1)
Figure 48. Spectrogram result for Solo Saxophone (sample 2)
Figure 49. Spectrogram result for Solo Violin (sample 1)
Figure 50. Spectrogram result for Solo Violin (sample 2)
Figure 51. Spectrogram result for Solo Drum (sample 1)
Figure 52. Spectrogram result for Solo Drum (sample 2)
Figure 53. Spectrogram result for Solo Flute (sample 1)
Figure 54. Spectrogram result for Solo Flute (sample 2)
Figure 55. Spectrogram result for Heavy Metal music (sample 1)
Figure 56. Spectrogram result for Heavy Metal music (sample 2)
Figure 57. Spectrogram result for Rap music (sample 1)
Figure 58. Spectrogram result for Rap music (sample 2)
Figure 59. Spectrogram result for Country music (sample 1)
Figure 60. Spectrogram result for Country music (sample 2)
Figure 61. Spectrogram result for Rock music (sample 1)
Figure 62. Spectrogram result for Rock music (sample 2)
Figure 63. Spectrogram result for Jazz music (sample 1)
Figure 64. Spectrogram result for Jazz music (sample 2)
Figure 65. Spectrogram result for Techno music (sample 1)
Figure 66. Spectrogram result for Techno music (sample 2)
Figure 67. Spectrogram result for classical music from Beethoven (sample 1)
Figure 68. Spectrogram result for classical music from Beethoven (sample 2)
Figure 69. Spectrogram result for classical music from Tchaikovsky (sample 1)
Figure 70. Spectrogram result for classical music from Tchaikovsky (sample 2)
Figure 71. Spectrogram result for classical music from Mozart (sample 1)
Figure 72. Spectrogram result for classical music from Mozart (sample 2)
Figure 73. Spectrogram result for classical music from Vivaldi (sample 1)
Figure 74. Spectrogram result for classical music from Vivaldi (sample 2)
Figure 75. Spectrogram result for classical music from Bach (sample 1)
Figure 76. Spectrogram result for classical music from Bach (sample 2)
Figure 77. Spectrogram result for classical music from Schubert (sample 1)
Figure 78. Spectrogram result for classical music from Schubert (sample 2)
Spectrogram Audio result analysis
Based on the experiment that has been done, it seems like each instrument has just some
frequencies and in a mixed music the frequency result depends on what kind of
instrument are being played at that time. If you look at the spectrogram result for solo
music, it’s obvious that Drum is one of the instruments which cover a wider range of
frequencies, so in Jazz, Rock, Heavy Metal, Country or other music types that use drum
we’ll see this wide range of frequency in the output of spectrogram. Piano is one of the
instruments which has low range of frequencies, violin also has wider range of
frequencies than piano or other classical instruments. So in a classical mixed music if the
spectrogram has wider range of outputs we could conclude that Violin is also being
played.
In classical music, the instruments ordered from low to high frequency coverage are piano,
saxophone, flute, and violin. Even the violin's coverage is much lower than the drum's, so
this difference could be used to differentiate between classical and other music types. In
every music style we could tell approximately which instruments have been used from the
spectrogram output. Because classical instruments occupy lower frequencies, the classical
music spectrograms do not show the high frequencies that appear in Rock, Jazz, or other
music types that use drums.
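One way to make the idea of a "range of frequencies" concrete is to find the highest frequency whose FFT magnitude exceeds a fraction of the spectrum's peak. The following is a minimal Matlab sketch on two synthetic signals, not the procedure used for the figures. The sampling rate, the test tones, and the 10% threshold are assumptions chosen for illustration.

```matlab
fs = 8000;                          % assumed sampling rate in Hz
t = (0:fs-1)/fs;                    % one second of samples, so bins are 1 Hz apart
low  = sin(2*pi*200*t);                          % narrow-range test signal
wide = sin(2*pi*200*t) + 0.5*sin(2*pi*3000*t);   % same tone plus a high component
f = 0:fs/2;                         % frequencies of bins 1 .. fs/2+1 in Hz

% highest frequency whose magnitude is above 10% of the peak (assumed threshold)
top_freq = @(x) f(find(abs(x(1:numel(f))) > 0.1*max(abs(x)), 1, 'last'));

top_low  = top_freq(fft(low));      % upper edge of the narrow signal
top_wide = top_freq(fft(wide));     % upper edge of the wide signal
fprintf('low: %d Hz, wide: %d Hz\n', top_low, top_wide);
```

A real recording would need a window and some averaging over time before such a threshold becomes meaningful, so the numbers here only illustrate the comparison.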
Another experiment compares two different performances of the same piece to see whether
the performance makes any difference. Figure 79 shows the result for Adoration of the Earth
and Figure 80 shows the result for Kiss of the Earth in Matlab. The results look almost the
same, and the parts that differ have very small amplitudes that can be ignored in
comparison to the other frequencies. So the conclusion is that different performances do
not change the result noticeably unless they use different instruments.
Figure 79. Two different plays of Adoration of the earth
Figure 80. Two different plays of Kiss of the earth
Image processing using 2D FFT Matlab and FPGA
2D FFT implementation using Matlab
The other part of the project is to implement a 2D FFT so that we can process images and
see the effect of the FFT on them. Our base core is the one-dimensional FFT that we have
implemented in both Matlab and VHDL, and the goal is to build the 2D FFT from this core.
The algorithm first applies the one-dimensional FFT to every row of the image and writes
the results into a matrix, then applies the FFT to every column of that matrix. The matrix
that results from the second pass is the FFT of the whole image. Figures 81 and 82 show the
Matlab code written for our 2D FFT, which also generates the initialization file for the
RAM in the FPGA implementation. Figures 83, 84, and 85 show the result of our 2D FFT on
some sample pictures.
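The row-then-column procedure can be checked on a small matrix against Matlab's built-in fft2. This is a minimal sketch rather than the project code: the 8 x 8 test matrix from magic(8) stands in for an image and is an assumption made for illustration.

```matlab
img = magic(8);                     % stand-in for a small grayscale image

rows_done = zeros(size(img));
for r = 1:size(img, 1)
    rows_done(r, :) = fft(img(r, :));        % 1D FFT of every row
end

result = zeros(size(img));
for c = 1:size(img, 2)
    result(:, c) = fft(rows_done(:, c));     % 1D FFT of every column
end

% largest deviation from the built-in 2D FFT
max_err = max(max(abs(result - fft2(img))));
fprintf('largest difference from fft2: %g\n', max_err);
```

The difference should be at the level of floating-point rounding, which confirms that two passes of a 1D core are enough to build the 2D transform.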
Figure 81. Matlab module for 2D FFT on image (part1)
Figure 82. Matlab module for 2D FFT on image (part2)
Figure 83. Image and its 2D FFT result in Matlab
Figure 84. Image and its 2D FFT result in Matlab
Figure 85. Image and its 2D FFT result in Matlab
As Figure 83 shows, the result has just two white spots on an otherwise black background.
The reason is that the DC component of this picture is much larger than its other
frequency components, so only the few dominant frequencies are visible. The number of
frequency points in the result corresponds to the number of pixels in the image.
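The effect can be reproduced with a synthetic striped image. The sketch below is only an illustration: it assumes a smooth cosine stripe pattern with 8 cycles across a 64 x 64 image, whereas a black-and-white pattern with sharp edges would also show weaker harmonics. With the unshifted fft2 output, the DC term sits at index (1,1) and the stripe frequency appears as one symmetric pair of points.

```matlab
N = 64;                                   % image size used in this project
col = repmat(0:N-1, N, 1);                % column index of every pixel
stripes = 0.5 + 0.5*cos(2*pi*8*col/N);    % assumed pattern: 8 stripe cycles across the width
F = abs(fft2(stripes));                   % magnitude of the 2D FFT
[pr, pc] = find(F > 1e-6*max(F(:)));      % bins that are not negligible
disp([pr pc]);                            % row and column index of each remaining bin
```

Out of 4096 frequency points only the DC term and the two stripe points remain, which is why almost the whole result looks black.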
2D FFT implementation on FPGA
With the Matlab code implemented as the reference system, we can now design the same
system on the FPGA. One important change on the FPGA is to replace our ROMs with RAMs.
Because we need to run separate FFT passes on the rows and on the columns, we have to be
able to overwrite the stored values so that the final result ends up in the same RAMs that
held the inputs. FPGA resources are limited, so we cannot afford two additional RAMs for
the output. Reusing the same RAM components is more efficient, and efficiency is one of the
major factors that we always have to consider when we work with FPGAs. The other thing to
consider is the width of each RAM word. Our input width is 12 bits, but the FFT result is
wider, so we have to instantiate the RAMs with a larger word width so that the output fits.
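The amount of growth can be estimated from the port widths of the two xfft components in Appendix B. The sketch below assumes a full-precision (unscaled) FFT in which the output is log2(N) + 1 bits wider than the input. This rule is an assumption inferred from those port widths, not a statement taken from the core's configuration.

```matlab
in_bits = [12 10];                  % input widths of the two xfft components in Appendix B
npoints = [1024 64];                % transform lengths implied by the 10-bit and 6-bit indices
out_bits = in_bits + log2(npoints) + 1;   % assumed full-precision growth rule
disp(out_bits);

% worst case for the 12-bit, 1024-point transform: a constant full-scale input
worst_dc = (2^(in_bits(1)-1) - 1) * npoints(1);
fits = worst_dc < 2^(out_bits(1)-1);      % true if it fits a signed word of out_bits(1) bits
```

Under this assumption the two cases come out at 23 and 17 bits, matching the xk_re and xk_im widths declared in the two listings.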
Because the FFT has to be applied to all rows and then to all columns of the image, a
control unit is needed to generate the RAM addresses and the FFT control signals and to
write the results to the correct RAM locations until the final result is ready. First we
need a binary file that holds all the image information. To produce it, a Matlab file has
been written that reads a picture and generates the image binary file for Matlab and the
RAM initialization file for the FPGA. Figures 86 and 87 show the Matlab code.
Figure 86. 2D FFT Matlab code (part 1)
Figure 87. 2D FFT Matlab code (part 2)
All image information is stored in two RAMs. Because a RAM is not organized like a matrix,
we have to store the image serially and use a formula for the row and column addresses.
The relation between each row or column and its RAM addresses is as follows:
row address = (i*64) + j (i = 0 to 63, and for each i, j changes from 0 to 63)
column address = (j*64) + i (i = 0 to 63, and for each i, j changes from 0 to 63)
For row 0 the components are at addresses 0 to 63, and for row 1 they are at addresses 64
to 127. For column 0 the components are at addresses 0, 64, 128, 192, 256, ..., 4032, and
for the next column they are at addresses 1, 65, 129, 193, 257, ..., 4033. The main job of
the control unit is to generate the start signal, generate the RAM read addresses for the
current row and read the whole row, wait until the FFT on that row is done, and then
generate the same addresses again to overwrite the row with the result. It repeats this for
each row until the last one. When the rows are done, it generates the addresses for each
column and writes the results back to the same columns, and when the columns are done it
generates a signal that indicates the end of the conversion.
Because of the limited resources on the FPGA used, 64 x 64 images have been used, so the
control unit has to run the 64-point transform on each of the 64 rows and then on each of
the 64 columns before the result is ready. Figures 88, 89, 90, and 91 show some details of
the simulation results for the image processing using our FPGA-based system and the way the
system carries out the process. After running the simulation, two text files are generated,
which we pass to Matlab to display as an image so that we can check the result.
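The addressing scheme and the in-place overwrite can be modelled in Matlab with a single linear array standing in for the RAM. This is a behavioural sketch only, not the VHDL control unit. It assumes a small 8 x 8 image instead of 64 x 64, and one complex array instead of separate real and imaginary RAMs.

```matlab
N = 8;                                    % small size for illustration (the design uses 64)
img = magic(N);                           % stand-in image
ram = reshape(img.', 1, []);              % serial storage: address = i*N + j (zero-based)

for i = 0:N-1                             % row pass
    addr = i*N + (0:N-1);                 % addresses of row i
    ram(addr + 1) = fft(ram(addr + 1));   % overwrite in place (+1: Matlab indices start at 1)
end

for j = 0:N-1                             % column pass
    addr = (0:N-1)*N + j;                 % addresses of column j, N apart
    ram(addr + 1) = fft(ram(addr + 1));   % overwrite in place
end

result = reshape(ram, N, N).';            % back to matrix form for checking
max_err = max(max(abs(result - fft2(img))));
```

In total the loop bodies run 2*N times, once per row and once per column, and the same memory holds the input, the intermediate row results, and the final output.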
Figure 88. FFT implementation using VHDL (part1)
Figure 89. 2D FFT implementation using VHDL (part2)
Conclusion
Because of its efficiency, the FFT is one of the most important transform algorithms. It
has applications such as image processing, noise cancellation, image quality improvement,
and audio processing. The FFT of an image shows the frequencies that are significant in
comparison to the DC component of the image. The number of frequency points in the result
corresponds to the number of pixels in the picture. For instance, an image of just parallel
white and black lines gives two white dots on an otherwise black background. Because the
image has just two significant frequency components besides the DC component, almost all
the points are black except those two frequencies.
Another application of the FFT is audio processing. The spectrogram is a tool that uses the
FFT as its core to show the amplitude of the frequencies present in the input waveform. By
applying the spectrogram to recordings of animal voices, we can differentiate between their
calls, such as a scream or a moan, because each call has its own frequency pattern.
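The idea of a spectrogram as repeated FFTs over short frames can be sketched in a few lines of Matlab. This is a simplified illustration, not the tool used for the figures: it assumes non-overlapping frames with no window function, and a synthetic signal that switches from 50 Hz to 200 Hz after one second.

```matlab
fs = 1000;                                % assumed sampling rate in Hz
t = (0:2*fs-1)/fs;                        % two seconds of samples
x = [sin(2*pi*50*t(1:fs)), sin(2*pi*200*t(fs+1:end))];   % 50 Hz, then 200 Hz

frame = 100;                              % samples per frame, so bins are 10 Hz apart
nframes = floor(numel(x)/frame);
S = zeros(frame/2 + 1, nframes);          % one column of magnitudes per frame
for m = 1:nframes
    seg = x((m-1)*frame + (1:frame));     % samples of frame m
    X = abs(fft(seg));
    S(:, m) = X(1:frame/2 + 1).';         % keep the non-negative frequencies
end

[~, peak_bin] = max(S);                   % strongest bin in every frame
peak_hz = (peak_bin - 1) * fs/frame;      % convert bin index to Hz
```

Each column of S is the spectrum of one time slice, so the change of the dominant frequency over time shows up as a change in peak_hz from frame to frame.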
The spectrogram has also been applied to audio files of solo musical instruments such as
guitar, piano, violin, and drum. Comparing the results shows that each instrument has a
characteristic frequency range that can be used to detect that instrument without listening
to the audio file. A further benefit is that, by taking the spectrogram of different parts
of a mixed audio file, we could detect some of the instruments that have been played.
The drum is one of the instruments with a very high frequency range. The violin has the
highest frequency range among the classical instruments, but its range is much lower than
the drum's. So whenever a mixed signal shows a very high frequency range, we could say that
a drum is being played there, and if the result falls only into the lower frequency ranges
we can attribute it to the piano. Another use could be as a detector for classical music:
because drums are not used in classical music, its spectrogram stays in a lower range than
that of other styles, even when a violin (one of the high frequency range instruments of
classical music) is used.
References
1. http://en.wikipedia.org/wiki/Fourier_transform , November 2014
2. http://www.cv.nrao.edu/course/astr534/FourierTransforms.html , September 2010
3. http://betterexplained.com/articles/an-interactive-guide-to-the-fourier-transform/ , December
2012
4. http://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf , August 2007
5. http://en.wikipedia.org/wiki/Discrete_Fourier_transform , November 2014
6. http://en.wikipedia.org/wiki/Fast_Fourier_transform , November 2014
7. http://en.wikipedia.org/wiki/Cooley%E2%80%93Tukey_FFT_algorithm , November 2014
8. http://sip.cua.edu/res/docs/courses/ee515/chapter08/ch8-2.pdf , July 2012
9. https://jakevdp.github.io/blog/2013/08/28/understanding-the-fft/ , August 2013
10. http://www.wisdom.weizmann.ac.il/~naor/COURSE/fft-lecture.pdf , August 2005
11. http://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf , August 2007
12. http://perso.limsi.fr/vezien/PAPIERS_ACS/cse-fft.pdf , October 1999
13. http://www.cs.princeton.edu/courses/archive/fall99/cs323/assign/ass5/ass5.pdf , August 1999
14. http://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm , October 2003
15. https://www.projectrhea.org/rhea/index.php/Speech_Spectrogram , September 2009
16. http://en.wikipedia.org/wiki/Spectrogram , November 2014
17. http://www.listenforwhales.org/page.aspx?pid=444 , April 2013
18. http://en.wikipedia.org/wiki/Field-programmable_gate_array , November 2014
19. http://en.wikipedia.org/wiki/Application-specific_integrated_circuit , August 2014
20. http://community.brocade.com/t5/Service-Providers/FPGA-or-ASIC-Pro-s-amp-Con-s-of-Each-Technology/ba-p/709 , March 2013
21. http://asic-soc.blogspot.com/2007/11/what-is-difference-between-fpga-and_06.html ,
November 2007
22. http://www.xilinx.com/fpga/asic.htm , February 2014
23. http://only-vlsi.blogspot.com/2008/05/fpga-vs-asic.html , May 2008
24. https://www.doc.ic.ac.uk/~wl/teachlocal/arch2/killasic.pdf , January 2001
25. http://www.cis.upenn.edu/~lee/06cse480/lec-fpga.pdf , July 2006
26. http://www.xilinx.com/fpga/ , August 2014
27. http://isl.stanford.edu/groups/elgamal/abbas_publications/J029.pdf , February 2995
28. http://www.xilinx.com/itp/xilinx10/isehelp/ise_c_fpga_design_flow_overview.htm , June 2008
29. http://amber.feld.cvut.cz/fpga/stazene_materialy/basics_of_fpga_design.pdf , December 2003
30. http://cds.cern.ch/record/1100537/files/p231.pdf , June 2007
31. http://www.xilinx.com/tools/cspro.htm , April 2014
32. http://www.arl.wustl.edu/projects/fpx/chipscope-6-rev1.pdf , July 2004
33. http://www-mtl.mit.edu/Courses/6.111/labkit/chipscope.shtml , February 2007
34. http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_3/ug750.pdf , November
2010
35. http://www.xilinx.com/products/intellectual-property/chipscope_ila.htm , August 2014
36. http://www.xilinx.com/products/intellectual-property/chipscope_icon.htm , August 2014
37. http://www.xilinx.com/products/intellectual-property/chipscope_vio.htm , August 2014
38. http://www-inst.eecs.berkeley.edu/~cs150/fa13/resources/ChipScope.pdf , February 2009
39. http://www.ee.ryerson.ca/~lkirisch/ele758/handouts/Tutorial3_ChipScope_Pro_VIO_BlockRAM.pdf , October 2012
40. XILINX LogiCORE IP Fast Fourier Transform v7.1 datasheet, March 2011
Appendix A
Matlab code for reading a WAV audio file and generating the ROM initialization file and the
Matlab binary input file used to calculate the FFT
% wavread gives us a double array of 1024 * 2 for left and right channel samples
[x, fs, nbits] = wavread('Rudy_rooster_crowing-Shelley-1948282641.wav', [16385, 17408]);
% get the integer value of samples (scaled to the 12-bit input width)
x2 = round(x * 2^11); %( x * 2^nbits/2) + (2^nbits/2);
left_x = x(:,1); % save left channel samples in double
left_x2 = x2(:,1); % save left channel samples in integer
% make the coe file to initialize the ROM
fileID = fopen('rom_input_Rudy_rooster_crowing-Shelley-1948282641.txt', 'w');
i = 1;
fprintf(fileID,'memory_initialization_radix= 10; \n');
fprintf(fileID,'memory_initialization_vector= \n');
% every entry ends with a comma except the last one, which ends with a semicolon
while i <= 1023
    fprintf(fileID,'%d,\n',left_x2(i));
    i = i + 1;
end
fprintf(fileID,'%d;\n',left_x2(1024));
fclose(fileID);
% make the input text file used by matlab to compare the result with FPGA
fileID1 = fopen('matlab_input_Rudy_rooster_crowing-Shelley-1948282641.txt', 'w');
i = 1;
fprintf(fileID1,'%d \n', fs); %sampling frequency
fprintf(fileID1,'%d \n', 1024); %number of sample inputs
while i <= 1024
    fprintf(fileID1,'%d \n',left_x2(i));
    i = i + 1;
end
fclose(fileID1);
Matlab code to apply FFT on the combined sinusoid waves
Fs = 1000; % Sampling frequency
T = 1/Fs; % Sample time .001
L = 1000; % Length of signal
L1 = 1024; % Padded FFT length
t = (0:L-1)*T; % Time vector
t1 = (0:L1-1)*1; % Sample index vector
% Sum of a 50 Hz sinusoid and a 120 Hz sinusoid
x = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
y = x + 2*randn(size(t)); % Sinusoids plus noise
infile = fopen ('input_sig.txt', 'w');
%Read the FFT result files generated by FPGA
real_file = fopen('real_out.txt');
img_file = fopen('img_out.txt');
A = fscanf(real_file,'%d');
B = fscanf(img_file,'%d');
A = A / 2048;
B = B / 2048;
d = A + 1i * B; % 1i is used because i is reused as a loop counter below
c = sqrt(A.^2 + B.^2);
% write the noisy input signal to a text file
i = 1;
while (i <= 1000 )
    fprintf (infile, '%f \n', y(i));
    i = i + 1;
end
fclose(infile);
NFFT = 2^nextpow2(L); % Next power of 2 from length of y
Y = fft(y,NFFT)/L;
% direct DFT calculation, zero-padded to NFFT points like the fft call above
k=1;
while (k <= NFFT)
    l=1;
    fft_r (k) = 0;
    while ( l <= L)
        % Matlab indices start at 1, so bin k-1 and sample l-1 enter the exponent
        e(l) = y(l) * exp((-2*1i*pi*(k-1)*(l-1))/NFFT);
        fft_r (k) = fft_r(k) + e(l);
        l = l + 1;
    end
    k = k + 1;
end
Y_fft_r = fft_r/L; % same scaling as Y
f = Fs/2*linspace(0,1,NFFT/2+1);
% compare all three
plot(f,2*abs(Y (1:NFFT/2+1))); % FFT result using FFT matlab function
hold on;
plot(f,2*abs(Y_fft_r(1:NFFT/2+1))); % self generated FFT matlab code
hold on;
plot(f,c(1:NFFT/2+1)); % FFT result from FPGA
title('Single-Sided Amplitude Spectrum of y(t)')
xlabel('Frequency (Hz)')
ylabel('|Y(f)|')
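A direct DFT loop of this kind can be checked against Matlab's fft on a short vector. The sketch below is a minimal check, with an arbitrary 8-sample test vector assumed for illustration. It shows the index convention that matters when translating the DFT formula to Matlab: because arrays start at 1, the exponent uses (k-1) and (l-1).

```matlab
y = [1 2 0 -1 3 0.5 -2 4];                % arbitrary test samples
N = numel(y);
dft = zeros(1, N);
for k = 1:N                               % output bin k-1
    for l = 1:N                           % input sample l-1
        dft(k) = dft(k) + y(l) * exp(-2i*pi*(k-1)*(l-1)/N);
    end
end
max_err = max(abs(dft - fft(y)));         % should be at rounding level
```

Using k*l in the exponent instead would shift every bin by one position relative to fft, which is easy to miss in a magnitude plot.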
Matlab code for reading the binary input files and plot the result for Matlab and FPGA
clear;
matlab_in = fopen('matlab_input_mallard_duck_quacking.txt');
Fs= fscanf(matlab_in,'%d', 1); % Sampling frequency
T = 1/Fs; % Sample time .001
Length = fscanf(matlab_in,'%d', 1); % Length of signal
t = (0:Length-1)*T; % Time vector
% reading the waveform input samples
y = fscanf(matlab_in,'%d');
NFFT = 2^nextpow2(Length); % Next power of 2 from length of y

Y = fft(y,NFFT)/Length;
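% direct DFT calculation
k=1;
while (k <= Length)
    l=1;
    fft_r (k) = 0;
    while ( l <= Length)
        % Matlab indices start at 1, so bin k-1 and sample l-1 enter the exponent
        e(l) = y(l) * exp((-2*1i*pi*(k-1)*(l-1))/Length);
        fft_r (k) = fft_r(k) + e(l);
        l = l + 1;
    end
    k = k + 1;
end
Y_fft_r = fft_r (1:Length)/Length; % same scaling as Y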
f = Fs/2*linspace(0,1,NFFT/2+1);
% read the rtl sim result and plot to compare with matlab
real_file = fopen('real_out.txt');
img_file = fopen('img_out.txt');
A = fscanf(real_file,'%d');
B = fscanf(img_file,'%d');
A = A / 2048;
B = B / 2048;
d = A + 1i * B;
c = sqrt(A.^2 + B.^2);
c1 = A + 1i* B;
plot(f,2*abs(c1(1:NFFT/2+1)), 'b')
hold on
plot(f,2*abs(Y_fft_r(1:NFFT/2+1)), 'r')
hold on
plot(f,2*abs(Y(1:NFFT/2+1)), 'g')
title('Single-Sided Amplitude Spectrum of y(t)')
xlabel('Frequency (Hz)')
ylabel('|Y(f)|')
Matlab code to automatically read the wave files, process them, and save the results
% number of existing files
numfiles = 8;
for k = 1:numfiles
myfilename = sprintf('ROCK_SOLO_%d.wav', k);
n1 = 1;
n2 = 100000;
for i = 1:7
[x, fs, nbits]= wavread(myfilename, [n1,n2]);
x1 = x(:,1);
y = fft(x1);
y1 = abs(y);
f = fs/2*linspace(0,1,100000/2+1);
h= figure(((k-1)*7)+i);
amplitude = abs(y(1:100000/2+1));
avg = 0;
n= 50001;
for j = 1:50001
if amplitude(j) <= 10
n = n -1;
else
avg = avg +amplitude(j);
end
end
avg = avg /n;
threshold = avg;
for j = 1:50001
if amplitude(j) < threshold
amplitude1(j)= 0;
else
amplitude1(j)= amplitude(j);
end
end
plot(f, amplitude1(1:100000/2+1));
% will create ROCK_SOLO_RESULT_1_1.png
saveas(h,sprintf('ROCK_SOLO_RESULT_%d_%d.png',k, i));
% will create ROCK_SOLO_RESULT_1_1.fig
saveas(h,sprintf('ROCK_SOLO_RESULT_%d_%d.fig',k, i));
n1 = n1 + 100000;
n2 = n2 + 100000;
end
end
Matlab code for 2D FFT on image
% read the gray scale image

I = imread('1-1.jpg');
% getting an approximate of the threshold for the image

level = graythresh(I);
BW = im2bw(I,level);
i = 1;
while ( i <= 64)
j= 1;
while ( j <= 64)
Y(i,j) = 0;
x1(i,j) = 0;
j= j + 1;
end
i = i + 1;
end
i = 1;
while ( i <= 64)
j= 1;
while ( j <= 64)
x(i,j) = BW(i,j);
j = j + 1;
end
y = fft(x(i, 1:64));
j= 1;
while ( j <= 64)
Y(i,j) = y(1,j);
j = j + 1;
end
i = i + 1;
end
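% apply the FFT to every column of the row-transformed matrix Y;
% the whole matrix is copied first so that each column is complete
x1 = Y;
i = 1;
while ( i <= 64)
    y = fft(x1(1:64, i));
    Y1(1:64,i) = y(1:64,1);
    i = i + 1;
end
% show the magnitude, because the FFT result is complex (uint8 saturates above 255)
grayImage = uint8(abs(Y1));
imshow(grayImage);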
Appendix B
FFT VHDL code
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.all;
use IEEE.std_logic_unsigned.all;
LIBRARY UNISIM;
USE UNISIM.VComponents.ALL;
ENTITY spectogram IS
PORT (
clk : IN STD_LOGIC ;
rst : IN STD_LOGIC;
rfd : OUT STD_LOGIC;
busy : OUT STD_LOGIC;
edone : OUT STD_LOGIC;
done : OUT STD_LOGIC;
dv : OUT STD_LOGIC;
xn_index : OUT STD_LOGIC_VECTOR ( 9 DOWNTO 0 );
xk_index : OUT STD_LOGIC_VECTOR ( 9 DOWNTO 0 );
xk_re : OUT STD_LOGIC_VECTOR ( 22 DOWNTO 0 );
xk_im : OUT STD_LOGIC_VECTOR ( 22 DOWNTO 0 );
amplitude: OUT STD_LOGIC_VECTOR (22 DOWNTO 0)
);
END spectogram;
ARCHITECTURE spectogram_arch OF spectogram IS
COMPONENT xfft
port (
clk : in STD_LOGIC := 'X';
start : in STD_LOGIC := 'X';
fwd_inv : in STD_LOGIC := 'X';
fwd_inv_we : in STD_LOGIC := 'X';
rfd : out STD_LOGIC;
busy : out STD_LOGIC;
edone : out STD_LOGIC;
done : out STD_LOGIC;
dv : out STD_LOGIC;
xn_re : in STD_LOGIC_VECTOR ( 11 downto 0 );
xn_im : in STD_LOGIC_VECTOR ( 11 downto 0 );
xn_index : out STD_LOGIC_VECTOR ( 9 downto 0 );
xk_index : out STD_LOGIC_VECTOR ( 9 downto 0 );
xk_re : out STD_LOGIC_VECTOR ( 22 downto 0 );
xk_im : out STD_LOGIC_VECTOR ( 22 downto 0 )
);
END COMPONENT;
COMPONENT ILA_CORE
PORT (
CONTROL: INOUT STD_LOGIC_VECTOR(35 DOWNTO 0);
CLK: IN STD_LOGIC;
TRIG0: IN STD_LOGIC_VECTOR(11 DOWNTO 0);
TRIG2: IN STD_LOGIC_VECTOR(0 TO 0);
TRIG9: IN STD_LOGIC_VECTOR(0 TO 0));
END COMPONENT;
COMPONENT ICON_CORE
PORT (
CONTROL0: INOUT std_logic_vector(35 DOWNTO 0);
CONTROL1: inout std_logic_vector(35 downto 0));
END COMPONENT;
COMPONENT VIO_core
port (
CONTROL: inout std_logic_vector(35 downto 0);
CLK: in std_logic;
SYNC_IN: in std_logic_vector(7 downto 0);
SYNC_OUT: out std_logic_vector(7 downto 0));
END COMPONENT;
COMPONENT r2p_corproc
GENERIC(DATA_WIDTH : INTEGER := 27;
PIPE_DEPTH : INTEGER := 15;
PRECISION : INTEGER := 27);
PORT( clk : IN STD_LOGIC;

ce: IN STD_LOGIC;
Xin: IN SIGNED(DATA_WIDTH-1 DOWNTO 0);
Yin: IN SIGNED(DATA_WIDTH-1 DOWNTO 0);
Rout : OUT unsigned(DATA_WIDTH-1 DOWNTO 0));
END COMPONENT;
COMPONENT mem
PORT (
clka : IN STD_LOGIC;
addra : IN STD_LOGIC_VECTOR(9 DOWNTO 0);
douta : OUT STD_LOGIC_VECTOR(11 DOWNTO 0)
);
END COMPONENT;
COMPONENT clock_divider_DCM
PORT
(-- Clock in ports
CLK_IN1 : in std_logic;
-- Clock out ports
CLK_OUT1 : out std_logic;
-- Status and control signals
RESET : in std_logic;
LOCKED : out std_logic
);
END COMPONENT;
SIGNAL control_word : STD_LOGIC_VECTOR(35 DOWNTO 0);

SIGNAL contro2_word : STD_LOGIC_VECTOR(35 DOWNTO 0);
SIGNAL dv_temp : STD_LOGIC;
SIGNAL xk_re_temp : STD_LOGIC_VECTOR(22 DOWNTO 0);
SIGNAL xk_im_temp : STD_LOGIC_VECTOR(22 DOWNTO 0);
SIGNAL dv_temp_vetor : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL amplitude_out : unsigned ( 22 DOWNTO 0);
SIGNAL start : STD_LOGIC := '0';
SIGNAL adrs : STD_LOGIC_VECTOR (9 DOWNTO 0):=
(others => '0');
SIGNAL xn_re : STD_LOGIC_VECTOR ( 11 DOWNTO 0 );
SIGNAL xn_im : STD_LOGIC_VECTOR ( 11 DOWNTO 0 );
SIGNAL start_sent : STD_LOGIC := '0';
SIGNAL counter : INTEGER RANGE 0 to 15;
SIGNAL fwd_inv : STD_LOGIC;
SIGNAL fwd_inv_we : STD_LOGIC;
SIGNAL locked : STD_LOGIC;
SIGNAL clk100 : STD_LOGIC;
SIGNAL clk12_5 : STD_LOGIC;
SIGNAL clk_fft : STD_LOGIC;
SIGNAL DCM_reset : STD_LOGIC;
SIGNAL fft_reset : STD_LOGIC;
SIGNAL busy_vector : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL rst_vector : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL clk_fft_vector: STD_LOGIC_VECTOR (0 TO 0);
SIGNAL busy_temp : STD_LOGIC;
SIGNAL clk_in : STD_LOGIC;
SIGNAL done_temp : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL start_vector : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL xn_index_tmp : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );
SIGNAL xk_index_tmp : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );
begin
DCM_reset <= rst;

fft_reset <= rst;
clk_fft <= clk12_5;
clk_in <= clk;
DCM_inst: clock_divider_DCM
PORT MAP
(-- Clock in ports
CLK_IN1 => clk_in,
-- Clock out ports
CLK_OUT1 => clk100,
CLK_OUT2 => clk50,
CLK_OUT3 => clk25,
CLK_OUT4 => clk12_5,
RESET => '0',--DCM_reset,
LOCKED => locked
);
-- Stimulus process
PROCESS (fft_reset, clk_fft)
BEGIN
if (fft_reset = '1') then
fwd_inv <= '1';
fwd_inv_we <= '1';
counter <= 0;
elsif (rising_edge (clk_fft)) then
-- assumed structure: fwd_inv_we is pulsed for one clock when counter = 1
if ( counter = 1) then
fwd_inv_we <= '1';
else
fwd_inv_we <= '0';
end if;
if ( counter <10 ) then
counter <= counter + 1;
end if;
end if;
END PROCESS;
start_vector(0) <= start;

process (clk_fft, fft_reset)
begin
if ( fft_reset = '1') then
start <= '0';
adrs <= (others => '0');
start_sent <= '0';
elsif ( rising_edge (clk_fft) ) then
if ( start /= '1' and start_sent /= '1') then
start_sent <= '1';
start <= '1';
else
start <= '0';
end if;
if ( adrs < 1023) and (start_sent = '1') then
adrs <= adrs + 1;
end if;
end if;
end process;
ROM: mem
PORT MAP(
clka => clk_fft,
addra => adrs,
douta => xn_re
);
xn_im <= (others => '0');
fft_inst: xfft
PORT MAP (
clk => clk_fft,
start => start,
fwd_inv => fwd_inv,
fwd_inv_we => fwd_inv_we,
rfd => rfd,
busy => busy_temp,
edone => edone,
done => done_temp(0),
dv => dv_temp,
xn_re => xn_re,
xn_im => xn_im,
xn_index => xn_index,
xk_index => xk_index,
xk_re => xk_re_temp,
xk_im => xk_im_temp
);
amplitude_inst: r2p_corproc
GENERIC MAP(
DATA_WIDTH => 23,
PIPE_DEPTH => 15,
PRECISION => 23)
PORT MAP(
clk => clk_fft,
ce => '1',
Xin => SIGNED(xk_re_temp),
Yin => SIGNED(xk_im_temp),
Rout => amplitude_out
);
ILA_inst: ILA_CORE
port map(
CONTROL => control_word,
CLK => clk100,
TRIG0 => xn_re,
TRIG1 => xn_im,
TRIG2 => dv_temp_vetor,
TRIG3 => xk_re_temp,
TRIG4 => xk_im_temp,
TRIG5 => STD_LOGIC_VECTOR(amplitude_out),
TRIG6 => rst_vector,
TRIG7 => busy_vector,
TRIG8 => done_temp,
TRIG9 => start_vector );
amplitude <= STD_LOGIC_VECTOR(amplitude_out);
ICON_inst: ICON_CORE
port map(
CONTROL0 => control_word,
CONTROL1 => contro2_word
);
VIO_inst: VIO_core
port map(
CONTROL => contro2_word,
CLK => clk100,
SYNC_IN => ("0000000"&dv_temp),
SYNC_OUT => open);
dv_temp_vetor(0) <= dv_temp;

busy_vector(0) <= busy_temp;
rst_vector(0) <= rst;
clk_fft_vector(0) <= clk_fft;
busy <= busy_temp;
dv <= dv_temp;
xk_re <= xk_re_temp;
xk_im <= xk_im_temp;
done <= done_temp(0);
end spectogram_arch;
2D FFT VHDL code for image processing
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.all;
use IEEE.std_logic_unsigned.all;
LIBRARY UNISIM;
USE UNISIM.VComponents.ALL;
ENTITY spectogram IS
PORT (
clk : IN STD_LOGIC ;
rst : IN STD_LOGIC;
--start : IN STD_LOGIC := 'X';
rfd : OUT STD_LOGIC;
busy : OUT STD_LOGIC;
edone : OUT STD_LOGIC;
done : OUT STD_LOGIC;
dv : OUT STD_LOGIC;
--xn_re : IN STD_LOGIC_VECTOR ( 11 DOWNTO 0 );
--xn_im : IN STD_LOGIC_VECTOR ( 11 DOWNTO 0 );
xn_index : OUT STD_LOGIC_VECTOR ( 5 DOWNTO 0 );
xk_index : OUT STD_LOGIC_VECTOR ( 5 DOWNTO 0 );
xk_re : OUT STD_LOGIC_VECTOR ( 16 DOWNTO 0 );
xk_im : OUT STD_LOGIC_VECTOR ( 16 DOWNTO 0 );
amplitude: OUT STD_LOGIC_VECTOR (16 DOWNTO 0)
);
END spectogram;
ARCHITECTURE spectogram_arch OF spectogram IS
COMPONENT xfft
port (
clk : in STD_LOGIC := 'X';
start : in STD_LOGIC := 'X';
fwd_inv : in STD_LOGIC := 'X';
fwd_inv_we : in STD_LOGIC := 'X';
rfd : out STD_LOGIC;
busy : out STD_LOGIC;
edone : out STD_LOGIC;
done : out STD_LOGIC;
dv : out STD_LOGIC;
xn_re : in STD_LOGIC_VECTOR ( 9 downto 0 );
xn_im : in STD_LOGIC_VECTOR ( 9 downto 0 );
xn_index : out STD_LOGIC_VECTOR ( 5 downto 0 );
xk_index : out STD_LOGIC_VECTOR ( 5 downto 0 );
xk_re : out STD_LOGIC_VECTOR ( 16 downto 0 );
xk_im : out STD_LOGIC_VECTOR ( 16 downto 0 )
);
END COMPONENT;
COMPONENT ILA_CORE
PORT (
CONTROL: INOUT STD_LOGIC_VECTOR(35 DOWNTO 0);
CLK: IN STD_LOGIC;
TRIG9: IN STD_LOGIC_VECTOR(0 TO 0));
END COMPONENT;
COMPONENT ICON_CORE
PORT (
CONTROL0: INOUT std_logic_vector(35 DOWNTO 0);
CONTROL1: inout std_logic_vector(35 downto 0));
END COMPONENT;
COMPONENT VIO_core
port (
CONTROL: inout std_logic_vector(35 downto 0);
CLK: in std_logic;
SYNC_IN: in std_logic_vector(7 downto 0);
SYNC_OUT: out std_logic_vector(7 downto 0));
END COMPONENT;
COMPONENT r2p_corproc
GENERIC(DATA_WIDTH : INTEGER := 27;
PIPE_DEPTH : INTEGER := 15;
PRECISION : INTEGER := 27);
PORT( clk : IN STD_LOGIC;
ce : IN STD_LOGIC;
Xin : IN SIGNED(DATA_WIDTH-1 DOWNTO 0);
Yin : IN SIGNED(DATA_WIDTH-1 DOWNTO 0);
Rout : OUT unsigned(DATA_WIDTH-1 DOWNTO 0));
END COMPONENT;
COMPONENT mem
PORT (
clka : IN STD_LOGIC;
wea : IN STD_LOGIC_VECTOR(0 DOWNTO 0);
addra : IN STD_LOGIC_VECTOR(11 DOWNTO 0);
dina : IN STD_LOGIC_VECTOR(16 DOWNTO 0);
douta : OUT STD_LOGIC_VECTOR(16 DOWNTO 0)
);
END COMPONENT;
COMPONENT clock_divider_DCM
PORT
(-- Clock in ports
CLK_IN1 : in std_logic;
-- Clock out ports
RESET : in std_logic;
LOCKED : out std_logic
);
END COMPONENT;
SIGNAL control_word : STD_LOGIC_VECTOR(35 DOWNTO 0);

SIGNAL contro2_word : STD_LOGIC_VECTOR(35 DOWNTO 0);
SIGNAL dv_temp : STD_LOGIC;
SIGNAL xk_re_temp : STD_LOGIC_VECTOR(16 DOWNTO 0);
SIGNAL xk_im_temp : STD_LOGIC_VECTOR(16 DOWNTO 0);
SIGNAL xn_re : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );
SIGNAL xn_im : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );
SIGNAL adrs : STD_LOGIC_VECTOR (11 DOWNTO 0):= (others => '0');
SIGNAL start_sent : STD_LOGIC := '0';
SIGNAL start_trans : STD_LOGIC;
SIGNAL wr_en_vec : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL re_mem_out : STD_LOGIC_VECTOR(16 DOWNTO 0);
SIGNAL im_mem_out : STD_LOGIC_VECTOR(16 DOWNTO 0);
SIGNAL first_trans : STD_LOGIC;
SIGNAL row_offset : STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');
SIGNAL col_offset : STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');
SIGNAL adrs_plus_1: STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');
SIGNAL fft_on_row : STD_LOGIC;
SIGNAL adrs_plus_1_by_64: STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others =>
'0');
SIGNAL ram_adrs : STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');
SIGNAL first_transaction: STD_LOGIC;
SIGNAL image_fft_done : STD_LOGIC;
SIGNAL width_counter : INTEGER RANGE 0 to 100;
SIGNAL dv_temp_vetor : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL amplitude_out : unsigned ( 16 DOWNTO 0);
SIGNAL counter : INTEGER RANGE 0 to 15;
SIGNAL fwd_inv : STD_LOGIC;
SIGNAL fwd_inv_we : STD_LOGIC;
SIGNAL locked : STD_LOGIC;
SIGNAL clk_fft : STD_LOGIC;
SIGNAL DCM_reset : STD_LOGIC;
SIGNAL fft_reset : STD_LOGIC;
SIGNAL busy_vector : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL rst_vector : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL clk_fft_vector: STD_LOGIC_VECTOR (0 TO 0);
SIGNAL busy_temp : STD_LOGIC;
SIGNAL clk_in : STD_LOGIC;
SIGNAL done_temp : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL start_vector : STD_LOGIC_VECTOR (0 TO 0);
SIGNAL after_rst : STD_LOGIC;
SIGNAL dv_NE : STD_LOGIC;
SIGNAL start_sent_RE : STD_LOGIC;
SIGNAL start_sent_q : STD_LOGIC;
SIGNAL dv_temp_q : STD_LOGIC;
SIGNAL start_sent_q_q : STD_LOGIC;
begin
DCM_reset <= rst;

fft_reset <= rst;
clk_fft <= clk12_5;
clk_in <= clk;
wr_en_vec(0) <= dv_temp;
DCM_inst: clock_divider_DCM
PORT MAP
(-- Clock in ports
CLK_IN1 => clk_in,
-- Clock out ports
CLK_OUT1 => clk100,
CLK_OUT2 => clk50,
CLK_OUT3 => clk25,
RESET => '0',--DCM_reset,
LOCKED => locked
);
-- Stimulus process
PROCESS (fft_reset, clk_fft)
BEGIN
if (fft_reset = '1') then
fwd_inv <= '1';
fwd_inv_we <= '1';
counter <= 0;
elsif (rising_edge (clk_fft)) then
-- assumed structure: fwd_inv_we is pulsed for one clock when counter = 1
if ( counter = 1) then
fwd_inv_we <= '1';
else
fwd_inv_we <= '0';
end if;
if ( counter <10 ) then
counter <= counter + 1;
end if;
end if;
END PROCESS;
start_vector(0) <= start_trans;

start_trans <= start_sent_RE;
first_transaction <= first_trans and start_sent_q_q;
process (clk_fft, rst)

begin
if ( rst = '1') then

start_sent <= '0';
first_trans <= '1';
row_offset <= (others => '0');
col_offset <= (others => '0');
adrs_plus_1 <= "000000000001";
fft_on_row <= '1';
image_fft_done <= '0';
after_rst <= '1';
width_counter <= 0;
dv_NE <= '0';
start_sent_RE <= '0';
elsif ( rising_edge (clk_fft)) then
if (image_fft_done = '0') then
after_rst <= '0';
start_sent_q <= start_sent;
start_sent_q_q <= start_sent_q;
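if (start_sent = '1' and start_sent_q = '0') then
-- assumed: one-clock pulse on the rising edge of start_sent
start_sent_RE <= '1';
else
start_sent_RE <= '0';
end if;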
dv_temp_q <= dv_temp;

if ( dv_temp = '1') then
first_trans <= '0';
end if;
if (dv_temp = '0' and dv_temp_q = '1') then

dv_NE <= '1';
else
dv_NE <= '0';
end if;
if ( after_rst = '1' or dv_NE = '1' ) then

start_sent <= '1';
elsif (width_counter = 63) then
start_sent <= '0';
end if;
if (start_sent = '1') then
width_counter <= width_counter + 1;
else
width_counter <= 0;
end if;
if (dv_NE = '1') then
if (fft_on_row = '1' ) then
-- go to next row
row_offset <= row_offset + 64;
else
col_offset <= col_offset + 1;
end if;
end if;
if ( fft_on_row = '1') then

if ( adrs <= 63 and start_sent = '1') then
adrs <= adrs + 1;
adrs_plus_1 <= adrs_plus_1 + 1;
ram_adrs <= adrs + row_offset ;
end if;
if (adrs <= 63 and dv_temp = '1') then
adrs <= adrs + 1;
ram_adrs <= adrs_plus_1 + row_offset ;
end if;
else
if ( adrs <= 63 and start_sent = '1') then
adrs <= adrs + 1;
adrs_plus_1_by_64 <= adrs_plus_1 (5
downto 0) &"000000";
-- offset + ( adrs + 1) * 64
ram_adrs <= col_offset +
((adrs (5 downto 0) &"000000"));
end if;
if ( adrs <= 63 and dv_temp = '1')then

adrs <= adrs + 1;
adrs_plus_1_by_64 <= adrs (5 downto 0)
&"000000";
-- offset + ( adrs + 1) * 64
ram_adrs <= col_offset + ((adrs_plus_1 (5
downto 0) &"000000"));
end if;
end if;
if (row_offset = 4032 and dv_NE = '1') then

fft_on_row <= '0'; -- start fft on the column
end if;
if (col_offset = 63 and ram_adrs = 4095) then

image_fft_done <= '1';
end if;
if (adrs = 64 ) then
adrs_plus_1 <= "000000000001";
if (fft_on_row = '1' ) then
ram_adrs <= row_offset;
else
ram_adrs <= col_offset;
end if;
end if;
end if;
end if;
end process;
real_part_mem: mem
PORT MAP(
clka => clk_fft,
wea => wr_en_vec,
addra => ram_adrs,
dina => xk_re_temp,
douta => re_mem_out
);
xn_re <= re_mem_out(9 downto 0) when (first_transaction = '1')
else re_mem_out(16 downto 7);
imaginary_part_mem: mem
PORT MAP(
clka => clk_fft,
wea => wr_en_vec,
addra => ram_adrs,
dina => xk_im_temp,
douta => im_mem_out
);
xn_im <= (others => '0') when (first_transaction = '1')

else im_mem_out(16 downto 7);
fft_inst: xfft
PORT MAP (
clk => clk_fft,
start => start_trans,--start,
fwd_inv => fwd_inv,
fwd_inv_we => fwd_inv_we,
rfd => rfd,
busy => busy_temp,
edone => edone,
done => done_temp(0),
dv => dv_temp,
xn_re => xn_re,
xn_im => xn_im,
xn_index => xn_index,
xk_index => xk_index,
xk_re => xk_re_temp,
xk_im => xk_im_temp);
amplitude_inst: r2p_corproc
GENERIC MAP(
DATA_WIDTH => 17,
PIPE_DEPTH => 15,
PRECISION => 17)
PORT MAP(
clk => clk_fft,
ce => '1',
Xin => SIGNED(xk_re_temp),
Yin => SIGNED(xk_im_temp),
Rout => amplitude_out);
ILA_inst: ILA_CORE
port map(
CONTROL => control_word,
CLK => clk100,
TRIG0 => xn_re,
TRIG1 => xn_im,
TRIG2 => dv_temp_vetor,
TRIG3 => xk_re_temp,
TRIG4 => xk_im_temp,
TRIG5 => STD_LOGIC_VECTOR(amplitude_out),
TRIG6 => rst_vector,
TRIG7 => busy_vector,
TRIG8 => done_temp,
TRIG9 => start_vector );
amplitude <= STD_LOGIC_VECTOR(amplitude_out);
ICON_inst: ICON_CORE
port map(
CONTROL0 => control_word,
CONTROL1 => contro2_word);
VIO_inst: VIO_core
port map(
CONTROL => contro2_word,
CLK => clk100,
SYNC_IN => ("0000000"&dv_temp),
SYNC_OUT => open);
dv_temp_vetor(0) <= dv_temp;

busy_vector(0) <= busy_temp;
rst_vector(0) <= rst;
clk_fft_vector(0) <= clk_fft;
busy <= busy_temp;
dv <= dv_temp;
xk_re <= xk_re_temp;
xk_im <= xk_im_temp;
done <= done_temp(0);
end spectogram_arch;

Place and route .............................................................................................................. 23

Timing analysis ............................................................................................................. 24

Post-synthesis simulation (timing simulation) .............................................................. 24

Hardware debug and verification .................................................................................. 25

Chipscope Xilinx test and debug tool ............................................................................... 26

An introduction to chipscope ........................................................................................ 26

Chipscope structure ....................................................................................................... 27

ILA core ........................................................................................................................ 27

ICON core ..................................................................................................................... 28

Spectrogram implementation using Matlab and FPGA .................................................... 31

spectrogram system implementation using Matlab....................................................... 31

FPGA based spectrogram system ................................................................................. 40

FFT Core Implementation in FPGA ............................................................................. 44

Audio processing using implemented spectrogram .......................................................... 51

Spectrogram Audio result analysis ............................................................................... 73

Image processing using 2D FFT Matlab and FPGA ......................................................... 76

2D FFT implementation on FPGA ............................................................................... 80

Figure 2. Fourier Transform of a step function................................................................... 2

Figure 3. Splitting N point DFT to two N/2 point DFTs .................................................... 6

Figure 4. Cooley Tukey splitting for 8 point DFT .............................................................. 7

Figure 5. FFT of an image that has all frequencies............................................................. 9

Figure 6. FFT of an image with Vertical wide stripes ........................................................ 9

Figure 7. FFT of an image with diagonals stripes .............................................................. 9

Figure 8. Spectrograms of a Wyle’s scream call .............................................................. 11

Figure 9. Spectrogram of a Wyle’s Moan Call ................................................................. 11

Figure 10. 3D surface spectrogram of a piece of music ................................................... 12

Figure 11. FPGA architecture ........................................................................................... 17

Figure 12. Programmable Interconnect details ................................................................. 18

Figure 13. A basic CLB structure ..................................................................................... 18

Figure 14. Logic Cell structure ......................................................................................... 19

Figure 15. FPGA design process ...................................................................................... 21

to input signal .................................................................................................................... 34

Figure 22. Spectrogram of Blueatlx Wale sound .............................................................. 37