Report

Abstract
The sigma-delta modulator based closed loop systems make high resolution, high SNR, low frequency systems. The sigma-delta analog to digital converter consists of the modulator followed by the decimation filter. In this project the design of decimation filter, which performs the role of filtering the shaped quantization noise and converting 1-bit data stream into 20 bit high-resolution output is reported. An efficient multi-stage decimation methodology is adapted where decimation is performed in several stages due to high order of the decimation filter which is almost impossible to implement in hardware. The multi-stage structure consists of Cascaded Integrator Comb (CIC) filter followed by FIR. The specifications of decimation filter are derived from the specifications of a third-order single bit sigma-delta modulator. Use of cascaded integrated comb filter for the first stage has made the implementation easy by requiring no multiplication. Furthermore, it can be used to decimate the data by a large factor, allowing easier implementation of the following stages. Distributed arithmetic algorithm is used to design FIR filters. Software model for the decimation filter is developed using MATLAB /Simulink, and integrated with sigma-delta modulator model to analyze the responses. The hardware model for the filter is developed using Verilog HDL. The design is implemented and tested using SPARTAN 3E FPGA. Filter has got pass band of 100Hz and stopband of 200Hz .The test environment has the feature of taking external input as well as the internally stored bit stream in LUT. The designed system exhibits good linearity and the design consumes a power of 40.4mW.
Table of Contents
Title Page..........i Certificate.ii Abstract................................................................................................................................i Table of Contents...............................................................................................................ii List of Tables.....................................................................................................................iv List of Figures....................................................................................................................iv ............................................................v Nomenclature.....................................................................................................................v Abbreviations.............................................................................................................v 1 Introduction .................................................................................................................1 1.1 Introduction to sigma-delta data converters ......................................................1 1.2 Overview of over-sampling and decimation .....................................................4 1.3 Application of sigma-delta data converter ........................................................4 1.4 Advantages and disadvantages ..........................................................................5 1.5 Summary ...........................................................................................................5 2 Literature Review.........................................................................................................6 2.1 Introduction ......................................................................................................6 2.2 Literature review on decimation filter and distributed arithmetic ....................6 2.2.1 Understanding of cascaded integrator-comb filter ..................................7 2.2.2 Understanding of finite impulse response filter ....................................9 2.2.3 Understanding of distributed arithmetic ...............................................10 2.2.4 FIR Filter with large number of coefficients.........................................11 2.2.5 Signed distributed arithmetic example ..................................................12 2.2.6 Journals/transactions referred................................................................14 2.3 Summary of literature review ..........................................................................17
ii
2.4 Design specification and block diagram ........................................................18 2.5 First stage cascaded comb filter ......................................................................20 2.6 FIR filtering .....................................................................................................20 2.7 Summary .........................................................................................................20 3 Problem Definition.....................................................................................................21 3.1 Problem definition..........................................................................................21 3.3 Problem statement..........................................................................................22 3.3 Project objectives...........................................................................................22 3.4 Methodology adopted to meet the objectives.................................................22 4 Hardware Modelling and FPGA Implementation of FIR Filter. . .23 4.1 Design of hardware model.............................................................................24 4.2 Simulation of the hardware model.................................................................26 5FPGA Implementation of FIR Filter..........................................................................29 6 Conclusions and Recommendations for future work.............................................38 Conclusion..............................................................................................................38 Recommendations for further work.......................................................................40 References.........................................................................................................................41
iii
List of Tables
Table 2.1: Distributed arithmetic table..........................................................................12 Table 2.2: Distributed arithmetic calculation................................................................13 Table 2.3: Literature review a table summary..............................................................14 Table 2.4: Specification of sigma-delta data converter.................................................18 Table 2.5: Over all characteristics of the decimation filter..........................................18 ...........................................................................................................................................19 Table 2.6: Cascaded integrator comb filter..............................................................20 Table 2.7: Requirements of FIR filtering.......................................................................20 .......................................................................................................................................20
List of Figures
iv
Nomenclature
Symbol MHz KHz mA V A mW W nW s ns ms Unit 106 Hertz 103 Hertz 10-3 Ampere v 10-6 Ampere 10-3 Watts 10-6 Watts 10-9 Watts s 10-9 seconds 10-3 seconds Description Frequency Frequency Current Voltage Micron Power Power Power Second Second Second
Abbreviations
ADC ASIC BIST CIC CMOS CSD Analog to Digital Converter Application Specific Integrated Circuit Built In Self Test Cascaded-integrated comb Complimentary Metal Oxide Semiconductor Canonic sign digit v
DA DAC DSP FIR FPGA HDL ISE IC LUT MSB MEMS NTF OSR SNR STD VLSI
Distributed Arithmetic Digital to analog converter Digital signal processing Finite Impulse Response Field Programmable Gate Array Hardware Description Language Integrated Software Environment Integrated circuit Look Up Table Most Significant Bit Micro Electro Mechanical Systems Noise Transfer Function Over Sampling Ratio Signal to Noise Ratio Standard Very Large Scale Integration
vi
1 Introduction
The performance of digital signal processing and communication systems is generally limited by the precision of the digital input signal, which is achieved at the interface between analog and digital information. As the speed and capability of the DSP cores increases, must the speed and accuracy of the converters associated with them also should increase. SigmaDelta modulation based analog-to digital conversion technology is a cost effective alternative for high-resolution converters (more than 12 bits), which can be ultimately integrated on digital signal processor ICs. 1.1 Introduction to sigma-delta data converters The basic concepts of Delta modulators and Sigma-delta converters are the use of feedback for improving the effective resolution of a quantizer. The concept came up in 1954 and the patent was granted in 1960 to cutler. His system was based on generating and subtracting from the input signal the quantization error of the low-resolution quantizer placed in the forward path of a feedback loop. In 1962, it was proposed by the Inose yasuda and Murakami to add the loop filter to front end of the delta modulator and then move it inside the loop. For a Simple case of the integrator used as a loop filter, the resulting system contained an integrator in the forward path, followed by the 1-bit quantizer and the feedback loop contained only a 1-bit digital to analog converter (DAC). Since the system contained a delta modulator and the integrator, they named it as delta-sigma modulator, where sigma denoted summation performed by the integrator. It was often called Sigma-Delta modulator by later works. Today both names are in use. The output of the modulator contains the original input signal plus the first difference of the quantization error, as was the cause of the error feedback coder. Thus the both delta-sigma and error feedback coder are the noise shaping modulator. They suppress the error in the base band and thus achieve improved dynamic range across baseband independent of the signal frequency [1].
The sigma delta converters did not gain importance until the development of digital VLSI technology, which provides the practical means to implement the large digital signal processing circuitry. VLSI has helped to implement both analog and digital circuitry in one die .Since the sigma delta ADC are based on digital filtering techniques, almost 90% of the die is implemented in digital circuitry. The additional advantage of such approach is higher reliability, increased functionality and reduced chip cost [2]. Conventional high-resolution analog to digital converters, such as successive approximation and flash type converters, operating at the Nyquist rate (sampling frequency approximately equal to twice the maximum frequency in the input signal), often do not make use of high speeds achieved with a scaled VLSI technology. These Nyquist samplers require a complicated analog low pass filter (often called an anti-aliasing filter) to limit the maximum frequency input to the analog to digital converters, where as sigma-delta converters use a low resolution analog to digital converter (1-bit quantizer), noise shaping, and a very high over-sampling rate. The high resolution of the Sigma-Delta converters is achieved with help of decimation (sample rate reduction) filters, which are also called digital filters [2]. An Inertial Navigation System use motion sensors like accelerometer to continuously calculate the position and velocity of a moving object without the need for external references. Inertial-navigation systems are used in many different moving objects, including vehicles, aircraft, submarines, spacecraft, and guided missiles. In all these applications accelerometer measures changes in a differential capacitance which result from acceleration input (change in capacitance is directly proportional to acceleration and MEMS sensor output is directly proportional to change in capacitance). This capacitance is converted in to voltage using readout electronics (continuous-time capacitance to voltage converter) which is made up of op-amp based design and feed to 3rd order sigma-delta modulator whose one bit output is used for forced feedback. The digital section of the data converter is used to convert 1-bit data stream into 20 bit data which is given to Inertial Navigation System as sensor output. This project reports about design of Digital section of sigma-delta data converter. The block
diagram of a Simple sigma-delta converter with accelerometer application is shown in Figure 1.1. The MEMS sensor has got bandwidth of 100Hz. Because this the system which is following the sensor electronics (modulator) should have the band width of 100Hz and has the feature of shaping the frequencies above the bandwidth(noise shaping).
Figure 1.1: Block diagram of sigma-delta data converters with accelerometer application All the specification required for the decimation filter is derived from the specification of sigma-delta modulator as well as from the requirement of inertial navigation system. The 20bit digital output from the decimation filter is given to computer of inertial navigational system which computes its on updated position by integrating information received from the motion sensors.
1.2 Overview of over-sampling and decimation An over-sampling converter uses an over-sampling rate of Fs =N*Fs followed by a digital-domain decimation process to compute a more precise estimate for the analog input at the lower output sampling rate (Fs), which is the same as used by the Nyquist samplers. Regardless of the quantization process, over-sampling has immediate benefits for the antialiasing filter. Over-sampling in converters prevents the filters from having steep transition band, which will help for implementation of the filter. The decimation process can be used to provide increased resolution [2]. 1.3 Application of sigma-delta data converter The largest application of the delta-sigma conversion is in the field of digital telephony. Digital cellular telephones utilize delta-sigma both for voice band speech coding and for IF-to-base band radio interface data conversion, such as quadrature and in-phase modulation /demodulation. One thing that takes the full advantages of the inherent qualities of delta-sigma conversion is digital audio. Few of the other application of sigma-delta modulator are listed below.
Accelerometer Pressure measurement Weigh scales Thermocouple Measurements Thermistor Measurements Portable instrumentation Sensor measurement Temperature measurement
This particular project is aimed at Accelerometer application, which requires measuring 1g in 1g range (1g resolution in 1g range). To meet this particular specification it is necessary to design a sigma-delta data converter which has 120dB SNR. So decimation filter is also designed with the specification of 120dB attenuation in the stop-band and with a minimum ripple of 0.0005 dB in pass-band
1.4 Advantages and disadvantages High resolution above 12 bit can be achieved Without CMOS the digital Decimation and filtering is too expensive 1.5 Summary Sigma-Delta data converters are used to get high resolution (greater than 12 bits) digital outputs which can be ultimately integrated on digital signal processor ICs. The purpose of the modulator is to remove the noise from base band and to perform noise shaping. Filtering noise which is aliased back in to base-band is the main function of the digital filtering stage and its secondary function is to convert 1-bit data stream that has a high sample rate and transform it in to a 20 bit data stream at lower sample rate. This documentation is organized with a total of 6 chapters. Chapter 1 gives a brief introduction to Sigma-delta data converter and its application. Chapter 2 gives the literature review carried on decimation filter and distributed arithmetic algorithm for project. Chapter 3 contains the problem definition, objectives and methodologies adapted to accomplish the set of objectives. Chapter 4 describes about the software (MATLAB) modelling of the decimation filter and simulation results. Chapter 5 describes about the hardware modelling and testing of decimation filter in FPGA. Chapter 6 gives the conclusion and future work that can to be carried out in decimation filter.
2 Literature Review
2.1 Introduction The decimation and filtering can be performed using different architectures, but it is very much important to know which is the best one and can be implemented in hardware using less resources. It is believed that the best architecture is the one which can be accommodated in hardware consuming less area and power. 2.2 Literature review on decimation filter and distributed arithmetic Practically it is not possible to implement a single filter that would meet the characteristic of decimation filter, because order of such filters would be close to 5000. It is impossible to implement such filters in hardware. So it is necessary to split the architecture of decimation filter in to two parts [3], CIC filter followed by FIR stage as shown Figure 2.1.
Figure 2.1: Architecture of decimation filter The output of sigma-delta modulator is at the rate of 62.5 KHz and because of this reason first stage of decimation filter (CIC) reads the input at the rate 62.5 KHz and performs a decimation of 25 and obtains an intermediate frequency of 2500Hz. The decimation factor of CIC filter is selected 25 based on following factors Droop in the frequency response of the filter at the edge of the signal band
Order of the FIR stages that follows the CIC filter Based on the intermediate frequency and order of the filter, decimation factor for the stage FIR filter is fixed to 4 and the output sampling frequency of the filter is 625Hz. Since the resonance frequency of the inertial navigation system is 1 KHz. Because of this reason output of the decimation FIR filter will be at the rate of 625Hz. The FIR filters are operation with a clock frequency of 2 MHz so that the filter finishes its computation before the arrival of next input. 2.2.1 Understanding of cascaded integrator-comb filter Cascaded integrator-comb (CIC) digital filters were introduced to the signalprocessing community, by Eugene Hogenauer [4], and mainly used in efficient implementation of decimation and interpolation. Bonus in using CIC filters, and a characteristic that makes them popular in hardware devices, is that they require no multiplication. The arithmetic needed to implement these digital filters is strictly additions and subtractions only. CIC filters are known in the field of electronics with different names, like moving average filter or recursive filter. The two basic building blocks of a CIC filter are integrator and comb. An integrator is just a single pole IIR filter with a unity feedback coefficient. This system is also called as accumulator. For decimators, the gain G at the output of the final comb section is G = (RM) N --------------------------- (1) Where R is rate of change and M is differential delay Assuming twos component arithmetic, the gain G is used to calculate the number of bits required for the last comb due to bit growth. If Bin is the number of input bits, then the number of output bits, Bout is
Bit growth (Max) = [Nlog2RM+Bin] ------------------------ (2) N COMB Delay R = Number of stages =M = Decimation factor
Input word length = Bin
It is also the bit width required for each stage of integrator and comb. Hardware implementation of Cascaded integrator-comb filter Decimation to a lower sampling rate is achieved by taking one sample out of every D sample.
X (n)
+ +
Integrator +
D
Decimator
Comb section
+ +
y(n)
Z-1
Z-1
Figure 2.2: Architecture of single CIC filter [4] The Figure 2.2 shows the hardware implementation architecture for single stage CIC filter, as the number of stages increases number of integrator and the comb section also increases. The order of the decimation filter is fixed to six because it has to be one more that the order of the 3rd order sigma-delta modulator plus the 2nd order MEMS sensor. When CIC filter is build six integrator section is cascaded together with six comb section and for the
hardware reduction it is better to have decimation section between integrator and comb section. 2.2.2 Understanding of finite impulse response filter Finite Impulse Response (FIR) Filter is one of the primary types of filters used in digital signal processing application. Since the filter does not use feedback the impulse response of the filter is finite. Some of the advantages of FIR filter are listed below [5] They are suited to multi-rate application Easily designed to be linear phase Simple to implement in hardware The three most important popular method used for designing FIR filter are ParksMcClellan method, widowing and direct calculation [5]. Parks-McClellan: - it is most widely used FIR filter design method. It is an iteration algorithm that accepts filter specifications in terms of passband and stopband frequencies. We can specify all the important filter parameters in this method, this made this technique popular. Two stage FIR filters for the decimation filters are also designed using this algorithm. Windowing: - In the windowing method, an initial impulse response is derived by taking the Inverse Discrete Fourier Transform of the desired frequency response. Then, the impulse response is refined by applying a data window to it. Direct Calculation: - The impulse response for the certain type of filters can be calculated directly from formulas.
Figure 2.3: Conventional tapped delay line FIR filter [5] Figure 2.3 shows the conventional tapped delay line realization of this inner product calculation. Even though diagram gives us an idea about the mechanism of FIR filter actual FPGA implementation cannot be realized from this. Two most common way of implementing FIR filter in hardware are Multiply and accumulate unit Distributed Arithmetic Method Multiply and accumulate unit uses multiplier for implementing the FIR filter in hardware, for a filter with large number of coefficient it is not a recommended way of implementing in hardware. 2.2.3 Understanding of distributed arithmetic Distributed Arithmetic (DA) is a different approach for implementing digital filters. The basic idea is to replace all multiplications and additions by a Table and a shifteraccumulator. DA relies on the fact that the filter coefficients are known, so multiplying c[n]x[n] becomes a multiplication with a constant. This is an important difference and a prerequisite for a DA design. It is a powerful technique for reducing the size of a parallel multiply-accumulate hardware that is well suited to FPGA designs. The block diagram for the DA implementation of FIR filter is shown in Figure 2.4, While implementing DA it is necessary to store the inputs as same as the coefficient length
10
in buffer stage. Once it is done, the LSB of all coefficients is taken as address to LUT .That is, a 2n word LUT is preprogrammed to accept an N-bit address, where N is number of coefficients. Individual mappings are weighted by the appropriate power of two factors and accumulated. The accumulation is efficiently implemented using a shift-adder as shown in Figure 2.4. For hardware implementation, instead of shifting each intermediate value by power factor which requires an expensive barrel shifter, shift the accumulator content itself in each direction one bit to the right [5].
Bit shift register

X B-1 [0] ... ... X 1 [0] X 1 [1] X 0 [0] X 0 [1]
Arit Tabl . h e
LUT
Buffering\ decimating
Scaling Accumulator
X B-1 [1]
Serial input
. . .
X B-1[N-1]
...
X 1[N -1]
. . .
. . .
X 0[N-1]
+/
2-1
Add/Sub Bit shifter used as address generator Shift and adder
Figure 2.4: Block diagram of distributed arithmetic [5]
2.2.4 FIR Filter with large number of coefficients The number of words in DA LUT is 2ntaps which exponentially increases with n-taps. A LUT splitting method effectively reduces the memory usage. While implementing the design for decimation filter using distributed arithmetic it is not required to use poly phase structure, because it doesnt bring any benefit.
r Register
11
Buffer
Figure 2.5: Block diagram of distributed arithmetic with split LUT [5] Figure 2.5 shows the block diagram of distributed arithmetic algorithm which can be used to implement a filter with higher order or coefficients is large to implement with higher order. Here it is better to use parallel tables and add the results. By using pipeline registers this modification will not reduce the speed of design, where dramatically reduces the area, because size of the LUT grows exponentially with the address space. 2.2.5 Signed distributed arithmetic example Assume that the data is given in (N = 5bit) twos compliment encoding and the coefficients are c[0] = 2,c[1]=3, c[2]=1, c[3]=4 , the corresponding LUT is shown on Table 2.1.
Table 2.1: Distributed arithmetic table
12
Sl:No 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
xc [3]
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
xc [2]
0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
xc [1]
0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
xc [0]
0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
f(c[k],x[n]) =
4x0+1x0+3x0+2x0 4x0+1x0+3x0+2x1 4x0+1x0+3x1+2x0 4x0+1x0+3x1+2x1 4x0+1x1+3x0+2x0 4x0+1x1+3x0+2x1 4x0+1x1+3x1+2x0 4x0+1x1+3x1+2x1 4x1+1x0+3x0+2x0 4x1+1x0+3x0+2x1 4x1+1x0+3x1+2x0 4x1+1x0+3x1+2x1 4x1+1x1+3x0+2x0 4x1+1x1+3x0+2x1 4x1+1x1+3x1+2x0 4x1+1x1+3x1+2x1
Sum 0 2 3 5 1 3 4 6 4 6 7 9 5 7 8 10
Values of x[n] as, x[0]=1 x[1]=3, x[2]=7, x[3]=15 .The output at sample index k, namely y, is defined as shown in Table 2.2 Table 2.2: Distributed arithmetic calculation
step(t) 0 1 2 3 4
xt [3]
1 1 1 1
xt [2]
1 1 1 0
xt [1]
1 1 0 0
xt [0]
1 0 0 0
f[t] x 2t +Y[t-1]] =
10x2 + 0 8x21 + 10 5x22 +26 4x23 +46 - f[t]
0
Y[t] 10 26 46 78 -50
xt [3]
1
xt [2]
1
xt [1]
1
xt [0]
0
x 2t +Y[t-1]]
-8x24 +78
The numerical check results c[0] x[0] + c[1] x[1]+ c[2] x[2] + c[3] x[3] + c[4] x[4] = x 2 +(-13) x 3 +(-9) x 1 + (-1) x (4) = -50
13
2.2.6 Journals/transactions referred Table 2.3: Literature review a table summary

Year 2006 Title & Journal Area-Effcient FIR Filter Design on FPGAs using Distributed Arithmatic [5] Modeling and design of novel architecture of multibit switchedcapacitor sigma-delta converter with two-step quantization process [3] Multistage Decimation Filter Design Technique for High-Resolution Sigma-Delta A/D Converters [6] Authors Patick Longa and Ali Miri Specifications 4-LUT based FPGA implementation of DAFIR Merits /Demerits Less area, modified accumulator stage to save area
2005
L. Fujcikl, A. S. Kuncheva, T. Mougel, and R. Vrbal
Signal bandwidth 100Khz Sampling frequency 16Mhz,OSR 64 Bit resolution 16 3rd order modulator, SNR 120 dB 18-20 Bit resolution
1992
Sangil Park,
FPGA is used for implementation. All output values are not computed due to decimation. Uses symmetry of the coefficients This architecture uses 6stage half band filter, Increasing resolution of conventional filters with FIR filters
Table 2.3 shows gives a quick summary of papers that is been referred for this project. Even though Patick and Ali [5] discuss about area efficient FPGA implementation, they talk only about lower order filter, which will not help in implementing higher order filters. Whereas, Fujicikl [3] discuss about decimation filter implementation using multiplier for higher frequency system. Sangil [6] discuss about implementing decimation filter using halfband filters which will increase the area in FPGA implementation.
14
2.2.7 Important points from journals/transactions/books referred Lukas Fujcik [3] presents the practically possible architecture for decimation filter, with multistage approach. The proposed architecture consists of three stages one Cascaded Integration Comb filter followed by two (FIR) filters. Decimation filter specification is derived from the modulator specification. It is understood from the paper that quantization of filter coefficients reduces the filter performance and when using 24 bit for the quantization the response of the quantized filter matches that of the reference filter from MATLAB. This paper takes of FPGA implementation with multipliers. Eugene B. Hogenauer [4] presents a class of digital linear phase finite impulse response (FIR) filters for decimation (sampling rate decrease) and interpolation. It is very efficient to use a cascaded Comb filter because it can be easily implemented in hardware requiring no multipliers. Patrick Longa [7] presents an area-efficient FIR Filter design on FPGAs using Distributed Arithmetic. He compares with different way of implementing the FIR filters in FPGA. While implementing with multipliers it takes more area and is critical for higher order filters. Many multiplier less techniques were compared, like canonic sign digit (CSD), Dempster-Mclood method, use of memories and Look-Up tables to store pre-computed values of coefficient operations (memory based method). He tells Distributed arithmetic as the very efficient solution for FIR implementation because this algorithm on FPGA permits everything from bit-serial implementations to pipelined or full parallel versions of this scheme, which can greatly improve the design performance. Richard and Gabor [8] presents technical literature on sigma-delta modulator, they also talk about noise shaping which is central to sigma-delta modulator. The main points that are observed from his book are
15
The modulator output can be decimated by the factor of OSR without the loss of information since the minimum sample rate at the output of the decimation filter is 2fB, where fB frequency band width. The procedure of analyzing data using the fast Fourier Transform F. This step is necessary to estimate the power spectral density of the data. The filter has to cut off at faster rate around fB than the NTF of the the modulator rise, this is to ensure very little out-of-band noise is left unsuppressed around f=f B after decimation. The gain response should be flat around fs/OSR, this guaranties that the folding noise of the noise from frequency bands around fs/OSR, 2fs/OSR etc, after decimation adds little to the in-band noise. Steven [1] presents basic concept of understanding of modulator and decimation in his book. Some of the very important concepts that are absorbed from his book are listed below To get a SNR of 120 dB ,it is necessary to have an Over Sampling Ration (OSR) above 256 for sigma-delta modulator Better performance can be obtained by using the filter that is represented by a product of sinc function. Order of the filter should be one more than the order of the modulator and the penalty for using this class of decimation is typically less than a 0.5-dB increase in noise. Droop in the frequency response of the filter at the edge of the signal band. As the frequency, ratio decreases droop increases The aliasing distortion is the most aliasing distortion cannot be repaired Jacab Baker [9] presents the art of designing decimating filters and implementation in hardware, some of the important points that he has discussed a) It is highly desirable to design a CIC or sinc filter without droop dominant design parameter because amplitude distortion and noise penalty can be repaired by following FIR stages, where as
16
b) Sinc filter is ideal for removing the modulation noise c) Sinc filter response is not monotonic d) For single stage CIC difference between the main lobe and first side lobe is around 13.5dB and need to cascade averaging filters to get large amount of attenuation at frequencies above fs/K, where K is the decimation factor. Uwe Meyer-Baese [5] presents the technique of implementing CIC filter and FIR filter in FPGA some of the important points discussed by him are a) A three stage CIC has three integrator, one decimation part and three-comb section. b) Max-Bit growth in the CIC can be calculated and while implanting this has to be used to avoid over flow. c) Since the comb section is twos compliment system, it will automatically compensate for the integrator over flow. d) Amplitude distortion in CIC can be corrected in the cascaded FIR compensation filter, but the aliasing distortion cannot be repaired.
2.3 Summary of literature review It is impossible to implement FIR filter as a single filter in hardware The proposed architecture consists of two stages one Cascaded Integration Comb filter followed by FIR filter. To get a SNR of 120 dB ,it is necessary to have an OSR above 256 Better performance can be obtained by using the filter that is represented by a product of sinc function. Order of the filter should be one more than the order of the modulator and the penalty for using this class of decimation is typically less than a 0.5 dB increase in noise.
17
Droop in the frequency response of the filter at the edge of the signal band. As the frequency, ratio decreases droop increases The aliasing distortion is the most dominant design parameter because amplitude distortion and noise penalty can be repaired by following FIR stages, where as aliasing distortion cannot be repaired Explains the implementation of Distributed arithmetic in FPGA, requirement of memory increases with increase in order (b) of the filter (2b), if memory based implementation is used.
2.4 Design specification and block diagram The specifications of the decimation filter are dependent upon the overall specification from the sigma-delta converter. Table 2.4: Specification of sigma-delta data converter Parameter Signal Band width Sampling Frequency Over sampling Ratio for 200Hz Number of bits in modulator bit stream Number of bit in output of filter Symbol BW Fs OSR Bmod Bout Value 100Hz 62.5KHz 312 1 20
Table 2.4 shows the overall specification of Delta-Sigma converter. To get an SNR of 120dB, it needs OSR of 256 and above. Based on available crystal frequency (4 MHz) sampling frequency of modulator is fixed as 62.5 KHz. The overall characteristics of the decimation filter is as given in Table 2.5 Table 2.5: Over all characteristics of the decimation filter Parameter Symbol Value
18
Pass band Stop band Sampling Frequency Pass band ripple Stop band ripple Decimation factor Down-sampling Frequency
Fp Fstop Fs Apass Astop Df Fds
100Hz 200Hz 62.5KHz 0.00005dB 120dB 100 625Hz
The Block diagram of the Decimation Filter based on literature survey and Design specification of the Sigma-delta converter is shown in Figure 2.5
Figure 2.6: Block diagram of decimation FIR filter Selecting 62.5 KHz as the sampling frequency of modulator is because of the reason that we can directly derive this frequency from 4 MHz crystal. The sigma-delta modulator produces a single bit stream at the rate of 62.5 KHz and because of this reason first stage of decimation filter (CIC) reads the input at the rate 62.5 KHz and performs a decimation of 25 and obtains an intermediate frequency of 2500Hz. Based on the intermediate frequency and order of the filter, decimation factor for first stage FIR filter is fixed to 4 and the output sampling frequency of the filter is 625Hz. Since the resonance frequency of the inertial navigation system is 1 KHz. Because of this reason output of the decimation FIR filter will be at the rate of 625Hz.
19
2.5 First stage cascaded comb filter The first stage CIC is easy to implement in hardware, requiring no multiplications and it can be used to decimate the data by a large factor. One drawback of this filter is droop in the pass-band due to sin(x)/x response of the filter. The specification of the CIC filter is shown in Table 2.6. Decimation factor for the CIC filter is fixed 25 based on the requirement of the intermediate frequency and order is one more that the sum of orders of sigma-delta modulator and MEMS accelerometer sensor. Table 2.6: Cascaded integrator comb filter Parameter Decimation factor Number of sections Differential Delay Fs 2.6 FIR filtering Table 2.7 shows the FIR requirements of the filter. Table 2.7: Requirements of FIR filtering Filter parameter Sampling frequency (Fs) Pass band frequency (Fpass) Stop band frequency(Fstop) Pass band Max ripple (Apass) Stop band (Astop) Order of the filter 2.7 Summary Decimation filter takes the 1-bit data stream that has a high sample rate and transforms it in to a 20-bit data stream at a lower sampling rate. The specification of the decimation filter is delivered from specification of modulator section and application. It is FIR filter 2500 Hz 100Hz 440Hz 0.00025 dB Value 25 6 1 62.5Khz
attenuation 120 dB 47
20
not possible to implement the decimation filter as a single filter in hardware due to order of the filter, so two stage architecture is selected (CIC followed by FIR stage) to implement in hardware. The advantage of CIC filter is that it is easy to implement in hardware because it does not require any multiplication. The FIR filter is implemented with the concept of distributed arithmetic which performs well in FPGA implementation. Chapter 3 discusses about problem definition, project objectives and methodology followed for execution of this project.
3 Problem Definition
3.1 Problem definition Because of the use of over sampling in sigma-delta modulators, the need arises for changing the sampling rate of signal, decreasing it to Nyquist rate. Thus, the high resolution can be achieved by the decimation (sample reduction). Such sample reduction can be achieved employing high precision FIR filters. Thus design and implementation of
21
decimation filter for sigma delta modulator as per the specification is very important for the performance of data converters. 3.3 Problem statement To design and implement FIR filter using distributed arithmetic algorithm for SigmaDelta Modulator on FPGA 3.3 Project objectives To review literature on modulator architectures for decimation FIR filters for sigma-delta
To arrive at design specifications of the DA-FIR filter based on sigma-delta modulator To develop a MATLAB model of the decimation FIR filter based upon derived specifications To model and simulate the hardware for DA-FIR filter To implement and verify the Decimation FIR filter on FPGA 3.4 Methodology adopted to meet the objectives Literature review on application of FIR Filter in Sigma-Delta modulators is carried out by referring journals, books, manuals and related documents Literature review on FIR filter designs using distributed arithmetic algorithm (DAFIR) is carried out by referring journals, books, manuals and related documents Design specifications for DA-FIR filter is arrived at based on application and reviewed literature Suitable architecture for DA-FIR filter is identified as per the specifications and reviewed literature
22
Reference model for decimation-FIR filter is developed based upon identified specifications using MATLAB Hardware for DA-FIR filter is modeled in HDL and simulated using ModelSim HDL test bench is developed ModelSim The hardware model is optimized to meet the design specifications The HDL model is synthesized using Xilinx-ISE A test environment is developed for verifying decimation FIR filter on FPGA The decimation-FIR filter is implemented on FPGA and the outputs is verified with ModelSim results for verifying DA-FIR filter and simulated using
4 Hardware Modelling and FPGA Implementation of FIR Filter
The Hardware model of the FIR filter is designed using Xilinx-ISE and simulations where performed using ModelSim. The hardware model requires algorithm which can be implemented in hardware form. The following section discuss about designing, simulation and testing of hardware model of decimation FIR filter for FPGA.
23
4.1 Design of hardware model The design of the Hardware model for decimation filter is subjected to FPGA and sigma-delta modulator specification. Following are the some of the important points that are considered while designing the hardware model or block. Algorithm selected should consume less area and should fit in targeted FPGA Selected algorithm should use all power reducing technique , so that design consumes less power in FPGA Algorithm should meet all the specifications of Decimation FIR filter (Digital section of Sigma-Delta converter) Hardware model architecture for decimation filter The hardware model for decimation FIR filter consists of following blocks Clock divider Cascaded COMB filter FIR filter
Figure 4.1: Hardware model architecture of FIR filter The Figure 4.1 shows the hardware model architecture of decimation FIR filter; following are the port requirements of hardware model. INPUTS: Clk 4MHz form crystal oscillator Rst asynchronous Rest signal
24
OUTPUTS:
Xin- Output of the sigma-delta modulator at the rate of OSR clk FIR1_yout 20-bit digital output FIR1_out -Output indicator OSR clk - OSR clk output for modulator section
The system clock of the design is 4MHz; the clock divider module performs the action of dividing the clock in to OSR clk of 62.5 KHz and another clock of 2MHz for the distributed arithmetic algorithm (FIR modules). The clock divider module uses simple principle of counter to get the clock divided. This approach is simple and effective way of dividing the clock. The CIC module works at lower clock 62.5 KHz, or in other words every 16ns a new data is read in to CIC module. For the decimation purpose of the CIC, an internal counter generates a clock or pulse signal equal to the decimation factor of the filter which is acting as clock to comb section and at the negative edge of this signal data is transferred from integrator to comb section. Same decimated clock is used to transfer data from CIC module to FIR1module. The data transfer from CIC to FIR1 module occurs with the help of handshake signal, because of this technique there will not be any loose of data while transferring one module to next. An active high asynchronous reset is used to clear the content of the flip flop. 8 bit address generator
25
Figure 4.1: Distributed arithmetic architecture used in hardware modeling Figure 4.1 shows the distributed arithmetic architecture which is used in hardware modeling. Buffer stores the input as that of number of coefficients at the rate of 2500 Hz. For the action of decimation the buffer skips the input as same as that of the decimation Avoids Barrel factor. Since the number of coefficients is large, FIR filter is divided in to 8 equal shifter filters for easy hardware implementation. In each stage bit shift register provides the address to LUT. The output of the LUT is then scaled by weight of MSB and the sum from the accumulator gets divided by 2. When the address reaches MSB of all the coefficient, the output of the LUT gets subtracted in accumulator. Final value from all the sub-DA gets added up and given as output. The logic of distributed arithmetic algorithm operates at 2MHz; it is because of the reason that before the arrival of next computational command previous operation has to be finished. While implementing distributed arithmetic in hardware it was observed that more the DA sub module larger the area consumed irrespective of number of coefficients. But FPGA has the limitation of forming the logic for more that 16 coefficients. 4.2 Simulation of the hardware model The functionality of the hardware model is verified by simulating the model using ModelSim .The hardware model of decimation FIR filter consists of 3 major sub-modules.
26
Cascaded integrator-COMB filter FIR filter
Figure 4.2: Test environment for testing hardware model Figure 4.2 shows the test environment created for the hardware modeling of the decimation filter. The test cases are generated through test bench it is given to hardware model. Output of the hardware model is observed through simulation window. The tool used for simulation of decimation filter is ModelSim. Each sub-model is tested using separate test benches for verifying the functionality.
The method by which a FIR filter can be tested is using Sine Test: Input a sine wave bit stream and check whether expected output amplitude and frequency of the wave is attained. For decimation filter sine test is performed by giving bit stream from modulator, Figure 4.5 shows the sine test that is performed on design
27
Figure 4.5: Sine test simulation wave form of decimation FIR filter Figure 4.5 shows the simulation results of decimation FIR filter for a 12.5Hz signal at the modulator input. It is observed that output of CIC filter is 12.5Hz sine wave with higher sampling rate (2.5 KHz) and FIR filter has got the sampling frequency of 625Hz.
28
5FPGA Implementation of FIR Filter 5.1 Introduction to FPGA FPGA contains a two dimensional arrays of logic blocks and interconnections between logic blocks. Both the logic blocks and interconnects are programmable. Logic blocks are programmed to implement a desired function and the interconnections are programmed using the switch boxes to connect the logic blocks. To be more clear, if we want to implement a complex design (CPU for instance), then the design is divided into small sub functions and each sub function is implemented using one logic block. Now, to get our desired design (CPU), all the sub functions implemented in logic blocks must be connected and this is done by programming the internal structure of an FPGA which is depicted in the following figure 5.1.
29
Figure 5.1: FPGA interconnections FPGAs, alternative to the custom ICs, can be used to implement an entire System On one Chip (SOC). The main advantage of FPGA is ability to reprogram. User can reprogram an FPGA to implement a design and this is done after the FPGA is manufactured. This brings the name Field Programmable. Custom ICs are expensive and takes long time to design so they are useful when produced in bulk amounts. But FPGAs are easy to implement within a short time with the help of Computer Aided Designing (CAD) tools (because there is no physical layout process, no mask making, and no IC manufacturing). Some disadvantages of FPGAs are, they are slow compared to custom ICs as they cant handle vary complex designs and also they draw more power. Xilinx logic block consists of one Look Up Table (LUT) and one Flip-Flop. An LUT is used to implement number of different functionality. The input lines to the logic block go
30
into the LUT and enable it. The output of the LUT gives the result of the logic function that it implements and the output of logic block is registered or unregistered output from the LUT. SRAM is used to implement a LUT.A k-input logic function is implemented using 2^k * 1 size SRAM. Number of different possible functions for k input LUT is 2^2^k. Advantage of such an architecture is that it supports implementation of so many logic functions, however the disadvantage is unusually large number of memory cells required to implement such a logic block in case number of inputs is large.
Figure 5.2 shows a 4-input LUT based implementation of logic block LUT based design provides for better logic block utilization. A k-input LUT based logic block can be implemented in number of different ways with tradeoff between performance and logic density. An n-LUT can be shown as a direct implementation of a function truth-table. Each of the latch holds the value of the function corresponding to one input combination. For Example: 2-LUT can be used to implement 16 types of functions like AND, OR, A +not B.... Etc. Interconnects A wire segment can be described as two end points of an interconnection with no programmable switch between them. A sequence of one or more wire segments in an FPGA can be termed as a track. Typically an FPGA has logic blocks, interconnects and switch blocks (Input /Output blocks). Switch blocks lie in the periphery of logic blocks and interconnect. Wire segments
31
are connected to logic blocks through switch blocks. Depending on the required design, one logic block is connected to another and so on. 5.2 FPGA DESIGN FLOW In this part of tutorial we are going to have a short intro on FPGA design flow. A simplified version of design flow is given in the flowing diagram.
Figure 5.3 FPGA Design Flow 5.2.1 Design Entry There are different techniques for design entry. Schematic based, Hardware Description Language and combination of both etc. . Selection of a method depends on the design and designer. If the designer wants to deal more with Hardware, then Schematic entry is the better choice. When the design is complex or the designer thinks the design in an algorithmic way then HDL is the better choice. Language based entry is faster but lag in performance and density. HDLs represent a level of abstraction that can isolate the designers from the details of the hardware implementation. Schematic based entry gives designers much more visibility into the hardware. It is the better choice for those who are hardware oriented. Another method but rarely used is state-machines. It is the better choice for the designers who think
32
the design as a series of states. But the tools for state machine entry are limited. In this documentation we are going to deal with the HDL based design entry. 5.2.2 Synthesis The process that translates VHDL/ Verilog code into a device netlist format i.e. a complete circuit with logical elements (gates flip flop, etc) for the design. If the design contains more than one sub designs, ex. to implement a processor, we need a CPU as one design element and RAM as another and so on, then the synthesis process generates netlist for each design element Synthesis process will check code syntax and analyze the hierarchy of the design which ensures that the design is optimized for the design architecture, the designer has selected. The resulting netlist(s) is saved to an NGC (Native Generic Circuit) file (for Xilinx Synthesis Technology (XST)).
Figure 5.4 FPGA Synthesis 5.2.3 Implementation This process consists of a sequence of three steps Translate Map Place and Route Translate: Process combines all the input netlists and constraints to a logic design file. This information is saved as a NGD (Native Generic Database) file. This can be done using NGD
33
Build program. Here, defining constraints is nothing but, assigning the ports in the design to the physical elements (ex. pins, switches, buttons etc) of the targeted device and specifying time requirements of the design. This information is stored in a file named UCF (User Constraints File). Tools used to create or modify the UCF are PACE, Constraint Editor Etc.
Figure 5.5 FPGA Translate Map: Process divides the whole circuit with logical elements into sub blocks such that they can be fit into the FPGA logic blocks. That means map process fits the logic defined by the NGD file into the targeted FPGA elements (Combinational Logic Blocks (CLB), Input Output Blocks (IOB)) and generates an NCD (Native Circuit Description) file which physically represents the design mapped to the components of FPGA. MAP program is used for this purpose.
Figure 5.6 FPGA map Place and Route: PAR program is used for this process. The place and route process places the sub blocks from the map process into logic blocks according to the constraints and connects the
34
logic blocks. Ex. if a sub block is placed in a logic block which is very near to IO pin, then it may save the time but it may affect some other constraint. So tradeoff between all the constraints is taken account by the place and route process. The PAR tool takes the mapped NCD file as input and produces a completely routed NCD file as output. The output NCD file consists of the routing information.
Figure 5.7 FPGA Place and route 5.3 Synthesis Result The developed FIR is simulated and verified their functionality. Once the functional verification is done, the RTL model is taken to the synthesis process using the Xilinx ISE tool. In synthesis process, the RTL model will be converted to the gate level netlist mapped to a specific technology library. Here in this Spartan 3E family, many different devices were available in the Xilinx ISE tool. In order to synthesis this DA-FIR design the device named as XC3S500E has been chosen and the package as FG320 with the device speed such as -4. RTL Schematic The RTL (Register Transfer Logic) can be viewed as black box after synthesize of design is made. It shows the inputs and outputs of the system. By double-clicking on the diagram we can see gates, flip-flops and MUX.
35
. Figure 5.8: FIR Schematic with Basic Inputs and Output Here in the above schematic, that is, in the top level schematic shows all the inputs and final output of FIR design.
Figure 5.9: Blocks inside the Developed Top Level FIR Design The internal blocks available inside the design includes CIC, FIR, LUT which were clearly shown in the above schematic level diagram. Inside each block the gate level circuit will be generated with respect to the modeled HDL code. 5.3.1 FIR Synthesis Result
36
This device utilization includes the following. Logic Utilization Logic Distribution Total Gate count for the Design
Device utilization summary:
37
The device utilization summery is shown above in which its gives the details of number of devices used from the available devices and also represented in %. Hence as the result of the synthesis process, the device utilization in the used device and package is shown above. Timing Summary: Speed Grade: -4 Minimum period: 24.592ns (Maximum Frequency: 40.664MHz) Minimum input arrival time before clock: 11.729ns Maximum output required time after clock: 4.394ns Maximum combinational path delay: No path found
6 Conclusions and Recommendations for future work The Decimation FIR filter is successfully implemented and tested in software as well as in FPGA board. In this particular section conclusion based on software simulation, hardware implementation and recommendation on future work is discussed. Conclusion In this project decimation FIR filter is implemented in hardware using multistage approach, where by decimation is performed in several stages. The most efficient architecture for implementing multistage decimation filter is Cascaded Integrator Comb filter followed by required number of FIR stages. The software reference model of the decimation filter is implemented in MATLAB/Simulink and integrated with modulator model to test the performance of the filter. Simulink model of the modulator gives 1-bit at high sampling rate and first stage of the decimation filter (CIC) lowers the sampling rate and transforms to 20bit. The FIR filter stages are designed using filter design tool, and the word length of the
38
coefficients are kept such that designed filter will meet the specifications as that of reference filter. The software model developed is used in verifying and debugging the Hardware model. The hardware model for decimation filter is developed in Verilog HDL .The FIR filter is designed using distributed arithmetic algorithm which is ideal for an FPGA implementation. The test bench for the hardware model is developed using the bit-stream that is generated from the software model of the modulator and simulated in ModelSim. The FPGA test environment for the design is developed through C program and add-on card. This helps in giving an external input as that of the modulator. The decimation filter has got an inbuilt test program which generates test input for the decimation filter; the limitation with this test program is that bit stream for one cycle of sine wave called repeatedly, which may not happen in modulator. Analysis performed on decimation filter shows it is a low pass filter and it is implemented and tested in FPGA. An attenuation of 101dB is measured in stopband of the filter and design consumes power of 40.39mW. Few of the important points that can be summarized for entire decimation filter are Two Stage Architecture (CIC followed by FIR filter) is selected for Hardware implementation due to high order of the filter MATLAB model of the decimation filter is designed and bit length of the FIR filter is fixed to 24 to obtain required specification of decimation filter Sum of the scaled coefficients can be used to find out the final value of step response of FIR filter More than 16 coefficients cannot be grouped together in distributed arithmetic since it will get synthesized Shift and accumulate Unit consumes more area (60%) compared to Lookup table in FPGA implementation
39
An attenuation of 101dB is measure in stopband of the filter Design consumes a power of 40.1 mW and 26992 cores in FPGA
Recommendations for further work The performance of this project work can be improved in many ways in future work. Few of them are listed below. The software reference model is used only for generating bit stream for test bench and to check the attenuation and spectrum of the signal at various stages of the architecture. This model can be improved to study the behaviour of entire sigma-delta data converter. At present CIC model supports only uni-polar data. It can be enhanced to support bi-polar data. The hardware model for the decimation filter is tested only with the software model of the sigma-delta modulator. The decimation filter needs to be tuned with hardware of sigma-delta modulator.
40
References
1. Steven R.Norsworthy,Richard Schrerier and Garbor C.Temes, Delts-Sigma Data Converters. Theory, Design, and Simulation, IEEE Press, A John Wiley & Sons, INC. Publication, ISBN: 0-7803-1045-4,1997 2. Sangil Park ,Principle of Sigma-Delta Modulation for Analog-to-Digital Converters, APR8/D,Rev.1, Motorola Digital Signal Processors,2000 3. Lukas Fujcik , Thibault Mougel , Jiri Haze and Radimir Vrba , Modeling and design of novel architecture of multibit switched-capacitor sigma-delta converter with twostep quantization process, Proceedings of the International conference on Networking, Systems, Mobile communication and Learning Technologies (ICNICONSMCL06),2006 4. Eugene B.Hogenauer, An Economical Class of Digital Filters for Decimation and Interpolation, IEEE Transactions on Acoustics and Signal processing, VOL, ASSP29, No.2, pp 155-162, 1981 5. Uwe Meyer-Baese, Digital Signal Processing with Field Programmable Gate Arrays, Springer International, ISBN: 81-8128-455-0, 2004 6. Sangil Park , Multistage Decimation Filter Design Technique for High-Resolution Sigma-Delta A/D converters, IEEE Transactions on Instrumentation and Measurement,VOL41,No.6,1991 7. Patrick Longa and Ali Miri, Area-Efficient FIR Filter Design on FPGAs using Distributed Arithmetic, IEEE International Symposium on Signal Processing and Information Technology, pp.248-252, June 2006 8. Richard Schreier and Gabor C.Temes, Understanding Delta-Sigma Data Converters , IEEE Press,A John Wiley & Sons,INC. Publication,1996 9. R.Jacob Backer , CMOS Mixed-Signal Circuit Design , second edition, IEEE Press,A John Wiley & Sons,INC. Publication, ISBN: 0-471-22754-4,2002
41
42

Report

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Report

Uploaded by

Copyright:

Available Formats

Abstract

Input word length = Bin

Bit shift register

Figure 2.4: Block diagram of distributed arithmetic [5]

Table 2.1: Distributed arithmetic table

2.2.6 Journals/transactions referred Table 2.3: Literature review a table summary

L. Fujcikl, A. S. Kuncheva, T. Mougel, and R. Vrbal

Fp Fstop Fs Apass Astop Df Fds

100Hz 200Hz 62.5KHz 0.00005dB 120dB 100 625Hz

4 Hardware Modelling and FPGA Implementation of FIR Filter

Cascaded integrator-COMB filter FIR filter

Device utilization summary:

You might also like