Nomenclature

λ      Eigenvalues of the input data correlation matrix
∇      Gradient
M      Number of filter taps
S      Power spectral density (PSD)
U      LMS algorithm input vector
u(n)   Input signal value
Y      LMS algorithm output vector
y(n)   Output signal value
W      LMS algorithm weight vector
E      Estimation error
e(n)   Error signal value
D      Desired reference vector
d(n)   Desired signal value
MSE    Mean square error

I. Introduction

Recently, requests for portable and embedded digital signal processing (DSP) systems have increased dramatically. Applications such as audio devices, hearing aids, cell phones and active noise control systems, with constraints on speed, area and power consumption, need an implementation by which these constraints are met with the shortest time to market [1]. Some possible solutions are ASIC chips, general purpose processors (GPP) and digital signal processors (DSP). Field programmable gate arrays (FPGAs) can reduce the gap between flexibility and high performance. New FPGAs include many primitives that support DSP applications, such as embedded multipliers, multiply-and-accumulate (MAC) units, digital clock management (DCM), DSP blocks, and soft/hard processor cores (such as the PPC). These facilities are embedded in the FPGA fabric and optimized for high-performance applications and low power consumption.

The availability of soft/hard processor cores in new FPGAs allows implementation of DSP algorithms without difficulty. An alternative choice is to move some parts of the algorithm into hardware (HW) to improve performance; this is called HW/SW co-design. This solution results in a more efficient implementation, as part of the algorithm is accelerated in HW while flexibility is maintained. Another, more efficient and more complex, choice is to convert the whole algorithm into hardware as a pure HW implementation. Although this is an attractive option in terms of area, speed, performance and power consumption, the design is much more complex [2].

Studies on the LMS algorithm mainly concentrate on two aspects. One is the convergence time, from the theoretical perspective; several modified LMS algorithms were proposed in references [3]-[4]. The other is hardware implementation; in order to improve data throughput, many modified
Manuscript received and revised June 2010, accepted July 2010 Copyright © 2010 Praise Worthy Prize S.r.l. - All rights reserved
Omid Sharifi Tehrani, Mohsen Ashourian, Payman Moallem
architectures for the LMS algorithm, such as the pipeline technique, were proposed in references [5]-[6]. This paper belongs to the latter category.

In this paper we first describe the theory of adaptive signal processing and the LMS algorithm. Then, in Sections III and IV, the data entry problem of the LMS algorithm and the design of the fixed-point Standard-LMS algorithm are described, respectively. Section V presents simulation and implementation results, and in Section VI a comparison is made with other works. Finally, Section VII draws conclusions from the obtained results.

II. LMS Algorithm

Adaptive filters learn the characteristics of their environment and continually adjust their parameters accordingly. Because of their ability to perform well in unknown environments and to track statistical time variations, adaptive filters are employed in a wide range of fields. The adjustable parameters that depend on the application are the number of filter taps, the selection of an FIR or IIR structure, the choice of training algorithm, and the convergence speed (learning rate). Beyond these, the underlying architecture needed for realization is application independent.

The main goal of any filter is to extract useful information from noisy data. Whereas a normal fixed filter is designed in advance with knowledge of the statistics of both the clean signal and the unwanted noise, the adaptive filter continually adjusts to a changing environment by the use of recursive algorithms [2]-[7]. This is useful when the characteristics of the signals are not known beforehand or change with time.

Fig. 1. Block diagram of the adaptive filtering problem: the input u(n) drives an FIR filter with weights w0, w1, ...; the filter output is subtracted from the desired signal d(n) to form the estimation error e(n)

The discrete adaptive filter in Fig. 1 receives u(n) and produces y(n) by a convolution with the filter weights w(k). Then d(n) is compared to y(n) to get e(n). This signal is used to incrementally adjust the filter coefficients for the next time instant. Several algorithms exist for the weight update, such as the least mean square (LMS) and the recursive least squares (RLS) algorithms. The selection of the algorithm depends on the required convergence speed and the computational complexity available, as well as on the statistics of the operating environment.

There are several methods for performing the weight update of an adaptive filter. There is the Wiener filter, which is the optimum linear filter in terms of mean squared error, and several algorithms that try to approximate it, such as the method of steepest descent. There is also the least mean square algorithm, used in artificial neural networks (ANN). Finally, there are other techniques such as the recursive least squares algorithm and the Kalman filter. The choice of algorithm is highly dependent on the signals of interest and the working environment, as well as on the required convergence speed and the computational complexity available.

The least mean square (LMS) algorithm is similar to the method of steepest descent in that it updates the weights by iteratively approaching the MSE minimum. Widrow and Hoff invented this method in 1960 for use in neural network training. The key is that instead of calculating the gradient at every time interval, the LMS algorithm uses a rough approximation of the gradient. The error at the filter output can be expressed as (1):

    e(n) = d(n) - W_n^T u_n    (1)

This is simply the desired output minus the filter output. By using this definition for e(n), an approximation of the gradient ∇ is found by (2):

    ∇ = -2 e(n) u_n    (2)

Substituting (2) for ∇ into the steepest-descent weight update W_{n+1} = W_n - μ∇ gives (3):

    W_{n+1} = W_n + 2μ e(n) u_n    (3)

This is the Widrow-Hoff LMS algorithm. As with the method of steepest descent, convergence depends on the step size; the stability criterion (4) can be used:

    0 < μ < 2 / (M S_max)    (4)

where M is the number of filter taps and S_max is the maximum value of the power spectral density of the tap inputs u. The good performance of the LMS algorithm and its simplicity have made it the most widely used algorithm in practice. For an N-tap filter, the number of operations is reduced to 2N multiplications and N additions per coefficient update. This is suitable for real-time applications and is the reason for the popularity of the LMS algorithm.

In the normalized LMS (NLMS) [8]-[9], the gradient step factor μ is normalized by the energy of the data vector. NLMS usually converges much faster than LMS at little extra cost. In this paper, the Standard-LMS algorithm is used.
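The recursion (1)-(3) translates almost line for line into software. The following Python sketch is illustrative only: the 4-tap system, the step size mu = 0.05 and the uniform random test input are assumptions for the demo, not settings from this paper.

```python
import random

def lms_filter(u, d, num_taps, mu):
    """Standard LMS: y(n) = W_n^T u_n (filter output),
    e(n) = d(n) - y(n) (eq. 1), W_{n+1} = W_n + 2*mu*e(n)*u_n (eq. 3)."""
    w = [0.0] * num_taps
    y_hist, e_hist = [], []
    for n in range(len(u)):
        # Tap-input vector u_n = [u(n), u(n-1), ..., u(n-M+1)]
        u_n = [u[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u_n))
        e = d[n] - y
        w = [wk + 2.0 * mu * e * uk for wk, uk in zip(w, u_n)]
        y_hist.append(y)
        e_hist.append(e)
    return y_hist, e_hist, w

# Toy run: identify an "unknown" 4-tap FIR system from input/output data.
random.seed(0)
h = [0.5, -0.3, 0.2, 0.1]                        # unknown system weights
u = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(h[k] * (u[n - k] if n - k >= 0 else 0.0) for k in range(4))
     for n in range(len(u))]
_, e, w = lms_filter(u, d, num_taps=4, mu=0.05)
# After convergence w approaches h and the residual error is near zero.
```

Because the example is noiseless and the filter order matches the unknown system, the weights converge essentially exactly; with observation noise the weights would instead fluctuate around h with a misadjustment proportional to mu.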
Copyright © 2010 Praise Worthy Prize S.r.l. - All rights reserved International Review on Computers and Software, Vol. 5, N. 4
III. Data Entry Problem of the LMS Algorithm

The data entry problem is an important issue in the LMS algorithm. Data should be converted into binary form in order to be processed by digital systems. In fixed-point digital systems, the data problems mainly involve binary code representation, limited word-length selection, rounding and overflow.

III.1. Binary Code Representation

There are several binary code representations, and the performance of a system differs depending on the representation chosen. Subtractions are part of the LMS algorithm, so results might be negative; for this reason, data should be represented as signed binary codes. The most popular representation of signed binary codes is two's complement. Calculations in the LMS algorithm are mainly done on fractional values, so these values should first be converted into integers. When initializing the weight vector, the initial values should be the values after this conversion [10].

III.2. Limited Word-Length Selection

In digital systems, every number is represented by a binary code of limited word-length, so dynamic range and precision are finite. For the LMS algorithm, the effect of limited word-length is that it produces three kinds of error: quantization errors of the input vectors, quantization errors of the weight vectors, and quantization errors of calculation.
- Natural signals are analog signals that cannot be processed by digital systems directly. Analog signals should be converted into digital signals using an analog-to-digital converter (ADC). The samples of the ADC are represented in limited word-length, so there are differences between the actual values and their representations. These differences are the quantization errors of the input vectors. They can be reduced by improving the sampling precision of the ADC.
- The initial values of the weight vectors also have to be represented by binary codes, so the weight vectors are quantized according to the limited word-length. Quantization errors of the weight vectors are produced in this process. They can cause many problems: the actual results of the filter deviate from the theoretical results, degrading the performance.
- Multiplication is one of the arithmetic operations in the LMS algorithm, and rounding is needed when multiplying two binary numbers of limited word-length. If the length of each operand is N, the length of the product will be 2N. The result should be rounded back to N bits and the remaining N bits discarded. This is called calculation noise, and it is also a quantization error. This noise can slow down the convergence speed, cause divergence of the weight vectors and even lead the entire system to collapse.
In order to make the results more accurate, some measures can be taken. An appropriate algorithm structure can reduce the word-length effect, and an appropriate word-length can reduce the calculation noise. For implementation in hardware, long word-length numbers utilize more resources than short word-length numbers. Performance and resource utilization should be balanced according to the requirements of the designed system [11].

III.3. Overflow

Errors are also produced when overflow happens, and these errors can slow down the convergence speed. Overflow can be avoided by two methods: extending the word-length of the accumulator, and scaling the data before calculation. The latter can be realized by shifting.
- Extending the word-length of the accumulator: when the word-length of the input vectors and weight vectors is N, the representable number range is [-2^(N-1), 2^(N-1) - 1] if two's complement is adopted. The bigger N is, the wider the range, but a bigger N means a higher cost.
- Scaling: scaling can be realized by shift operations. Because the binary codes are in two's complement, the sign bit must be handled appropriately. When left shifting, the sign bit is not changed; the other bits are shifted left, the most significant bits are discarded and zeros are shifted in. When right shifting, the least significant bits are discarded and the sign bit is shifted in from the left. The final output results should be descaled, that is, shifted in the reverse direction [12].

IV. Fixed-Point Standard LMS Model

In our developed system, the ADC unit provides 12-bit binary output data. Table I shows the LMS input data model used in our system. The one-bit fraction length is a dummy, not used for input/output data, but it is necessary for the weight updates.

TABLE I
INPUT DATA BIT-ALLOCATION
Sign Bit | Guard Bits | Word Length | Fraction Length
1        | 3          | 12          | 1

Table II shows the LMS weight bit-allocation.

TABLE II
WEIGHTS BIT-ALLOCATION
Sign Bit | Word Length | Fraction Length
1        | 1           | 15
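The word-length effects above (two's-complement coding, rounding the 2N-bit product back to N bits, and shift-based scaling) can be illustrated with a small Python sketch. The 16-bit Q1.15 format, saturation on overflow and round-half-up rule used here are illustrative assumptions, not necessarily the exact choices made in the designed core.

```python
def to_fixed(x, word_len, frac_len):
    """Quantize a real value to a word_len-bit two's-complement integer
    with frac_len fractional bits, saturating instead of overflowing."""
    lo, hi = -(1 << (word_len - 1)), (1 << (word_len - 1)) - 1
    q = int(round(x * (1 << frac_len)))
    return max(lo, min(hi, q))

def to_real(q, frac_len):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_len)

def fixed_mul(a, b, word_len, frac_len):
    """N-bit x N-bit multiply: the full 2N-bit product is rounded back
    to N bits; the discarded low bits are the 'calculation noise'."""
    full = a * b                                     # 2N-bit intermediate
    q = (full + (1 << (frac_len - 1))) >> frac_len   # round, drop frac bits
    lo, hi = -(1 << (word_len - 1)), (1 << (word_len - 1)) - 1
    return max(lo, min(hi, q))                       # saturate on overflow

# Q1.15 example: 0.75 * -0.5 = -0.375, exactly representable.
a = to_fixed(0.75, 16, 15)
b = to_fixed(-0.5, 16, 15)
p = fixed_mul(a, b, 16, 15)
print(to_real(p, 15))                                # -0.375

# Scaling by a power of two is an arithmetic shift; Python's >> on a
# negative int already replicates the sign bit, as described above.
x = to_fixed(-0.5, 16, 15)
assert to_real(x >> 2, 15) == -0.125                 # right shift = divide by 4
```

Values that are not exactly representable (e.g. 0.1 in Q1.15) come back with a small quantization error, which is exactly the input-vector quantization error described in Section III.2.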
    Y = W U    (5)

    ΔW = μ E U    (6)
Fig. 2. Entity of the Fixed-Point Standard LMS Core

Fig. 3. Flowchart of the LMS-Based FIR Core
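In hardware, the update equations (5) and (6) reduce to pure integer operations when the step factor is a power of two, because the multiplication by μ becomes an arithmetic right shift. The sketch below shows one such iteration in plain Python integers; the Q1.15 weight format follows the sign/fraction split of Table II, while the 4-tap length and μ = 2^-4 are illustrative assumptions.

```python
FRAC = 15          # Q1.15 weights: 1 sign bit, 15 fraction bits
MU_SHIFT = 4       # mu = 2**-4; multiplying by mu == right shift by 4

def fixed_lms_step(w, u_vec, d):
    """One iteration of Y = W*U (eq. 5) and dW = mu*E*U (eq. 6),
    entirely in two's-complement integer arithmetic."""
    # eq. (5): MAC into a wide accumulator so the sum cannot overflow
    acc = sum(wi * ui for wi, ui in zip(w, u_vec))
    y = acc >> FRAC                      # rescale the Q2.30 sum to Q1.15
    e = d - y                            # estimation error E, Q1.15
    # eq. (6): e*ui is a Q2.30 product; >> (FRAC + MU_SHIFT) rescales it
    # back to Q1.15 and applies mu in the same shift
    w_new = [wi + ((e * ui) >> (FRAC + MU_SHIFT)) for wi, ui in zip(w, u_vec)]
    return w_new, y, e

# One step from zero weights: the output is 0, so the error equals d.
w = [0, 0, 0, 0]
u_vec = [16384, -8192, 4096, 0]   # 0.5, -0.25, 0.125, 0.0 in Q1.15
d = 8192                          # 0.25 in Q1.15
w, y, e = fixed_lms_step(w, u_vec, d)
print(w)                          # [256, -128, 64, 0]
```

Implementing μ as a shift is exactly the scaling technique of Section III.3, and it removes one multiplier per tap from the datapath.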
Fig. 4(c). Test Data Creation with Uniform-Random Number Noise

V.2. Software Simulation Results

In the first simulation a 200 hertz sine signal which is

Fig. 5(b). Noisy Input and Filter Output Signals

Fig. 5(c). Residual Error (Learning Curve)

TABLE III

Fig. 6(b). Noisy Input and Filter Output Signals

TABLE IV
2ND SIMULATION PERFORMANCE
Input SNR  | Output SNR | SNR Enhancement
1.1399 dB  | 5.9211 dB  | 4.7812 dB

Fig. 7(b). Noisy Input and Filter Output Signals

Fig. 7(c). Residual Error (Learning Curve)

TABLE V
3RD SIMULATION PERFORMANCE
Input SNR  | Output SNR | SNR Enhancement
1.0831 dB  | 8.7515 dB  | 7.6684 dB

Fig. 8(a). Desired and Filter Output Signals

Fig. 8(c). Residual Error (Learning Curve)
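The SNR enhancement reported in the simulation performance tables is simply the difference between output and input SNR in decibels, which can be checked directly; the values below are taken from Table IV (second simulation):

```python
input_snr_db = 1.1399     # Table IV, noisy input
output_snr_db = 5.9211    # Table IV, filter output
enhancement_db = output_snr_db - input_snr_db
print(round(enhancement_db, 4))   # 4.7812
```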
Fig. 9(a). Desired and Filter Output Signals

TABLE VII
5TH SIMULATION PERFORMANCE
Input SNR  | Output SNR | SNR Enhancement
1.1573 dB  | 8.2417 dB  | 7.0844 dB

TABLE IX
IMPLEMENTATION-RESOURCE UTILIZATION ON XC4VSX25-12FF668
                 | Used  | Available | Utilization
Slice Flip Flops | 2,702 | 20,480    | 13%
4-input LUTs     | 3,445 | 20,480    | 16%
Occupied Slices  | 3,164 | 10,240    | 30%
Bonded IOBs      | 54    | 320       | 16%
BUFG/BUFGCTRLs   | 1     | 32        | 3%
DSP48Es          | 2     | 128       | 1%
Maximum Frequency: 80.017 MHz

TABLE X
IMPLEMENTATION-RESOURCE UTILIZATION ON XC5VLX50T-3FF665
                 | Used  | Available | Utilization
Slice Registers  | 2,618 | 28,800    | 9%
Slice LUTs       | 2,564 | 28,800    | 8%
Occupied Slices  | 1,264 | 7,200     | 17%
Bonded IOBs      | 54    | 360       | 15%
BUFG/BUFGCTRLs   | 1     | 32        | 3%
DSP48Es          | 2     | 48        | 4%
Maximum Frequency: 103.581 MHz
Omid Sharifi Tehrani was born in Isfahan, Iran, in 1984 and received the B.Sc. degree in telecommunication engineering from the Islamic Azad University of Majlesi in 2007 and the M.Sc. degree in telecommunication engineering (systems) from the Islamic Azad University of Najafabad in 2010. He has published three books about microcontrollers and FPGAs. His research interests are adaptive signal processing, active noise control, artificial neural networks and FPGA implementations. Mr. Sharifi is an active member of the Young Researchers Club (YRC) of the Islamic Azad University (IAU) and a reviewer for the MJEE journal.

Payman Moallem was born in Isfahan, Iran, in 1970 and received the B.Sc. degree in electrical engineering from Isfahan University of Technology in 1992, the M.Sc. degree in electrical engineering from Amir-Kabir University of Technology in 1996 and the PhD in electrical engineering from Amir-Kabir University of Technology in 2003. His research interests are neural networks, image processing and machine vision. Mr. Moallem is an active member of the Image Processing and Machine Vision Society of Iran.