
A project report on A Time-Varying Convergence Parameter for the LMS Algorithm in the Presence of White Gaussian Noise

ABSTRACT: A novel approach to the least-mean-square (LMS) estimation algorithm is proposed. Rather than using a fixed convergence parameter μ, this approach uses a time-varying parameter μ(n). The technique leads to faster convergence and a lower mean-squared error than the conventional fixed-parameter LMS algorithm. The algorithm has been tested for noise reduction and estimation in narrow-band FM signals corrupted by additive white Gaussian noise.

INTRODUCTION: The LMS algorithm is a well-known adaptive estimation and prediction technique. It has been extensively studied in the literature and widely used in a variety of applications. The performance of the LMS algorithm depends strongly on the selected convergence parameter μ and on the signal conditions. A larger convergence parameter leads to faster convergence of the LMS algorithm, i.e., faster convergence of the filter coefficients to their optimal values. After the coefficients have converged, the convergence parameter should be small for better estimation accuracy. In this project, we propose a time-varying convergence parameter for the LMS algorithm in a white Gaussian noise environment. A general power decaying law has been studied; however, other time-varying laws could also be applicable. The main idea is to set the convergence parameter to a large value in the initial state in order to speed up the algorithm's convergence. As time passes, the parameter is adjusted to a smaller value so that the adaptive filter has a smaller mean-squared error. The modified algorithm has been tested for noise reduction and estimation in linear frequency-modulated (LFM) narrowband signals corrupted by additive white Gaussian noise. Simulation results show that the modified LMS algorithm converges faster than the conventional LMS algorithm with a constant convergence parameter, and that its mean-squared error is close to the optimal value.

SPEECH SIGNALS: A speech signal consists of three classes of sounds: voiced, fricative and plosive sounds. Voiced sounds are caused by excitation of the vocal tract with quasi-periodic pulses of airflow. Fricative sounds are formed by constricting the vocal tract and passing air through it, causing turbulence that results in a noise-like sound. Plosive sounds are created by closing up the vocal tract, building up air behind it, then suddenly releasing it, as in the sound made by a letter such as 'p'. The figure below shows a discrete-time representation of a speech signal.

Looking at the signal as a whole, we can tell that it is non-stationary: its mean varies with time and it cannot be described by the standard models for stationary random processes. However, a speech signal can be considered a linear composite of the three classes of sound above; each of these sounds is stationary and remains fairly constant over intervals on the order of 30 to 40 ms. The theory behind the derivation of many adaptive filtering algorithms usually requires the input signal to be stationary. Although speech is non-stationary overall, we assume that the short-term stationary behaviour outlined above is adequate for the adaptive filters to function as desired.

Fig: Representation of a Speech Signal

SPEECH GENERATION:

Speech generation and recognition are used to communicate between humans and machines. Rather than using your hands and eyes, you use your mouth and ears. This is very convenient when your hands and eyes should be doing something else, such as: driving a car, performing surgery, or (unfortunately) firing your weapons at the enemy. Two approaches are used for computer generated speech: digital recording and vocal tract simulation. In digital recording, the voice of a human speaker is digitized and stored, usually in a compressed form.

During playback, the stored data are uncompressed and converted back into an analog signal. An entire hour of recorded speech requires only about three megabytes of storage, well within the capabilities of even small computer systems. This is the most common method of digital speech generation used today. Vocal tract simulators are more complicated, trying to mimic the physical mechanisms by which humans create speech. The human vocal tract is an acoustic cavity with resonant frequencies determined by the size and shape of the chambers. Sound originates in the vocal tract in one of two basic ways, called voiced and fricative sounds. With voiced sounds, vocal cord vibration produces near-periodic pulses of air into the vocal cavities. In comparison, fricative sounds originate from the noisy air turbulence at narrow constrictions, such as the teeth and lips. Vocal tract simulators operate by generating digital signals that resemble these two types of excitation. The characteristics of the resonant chamber are simulated by passing the excitation signal through a digital filter with similar resonances. This approach was used in one of the very early DSP success stories, the Speak & Spell, a widely sold electronic learning aid for children.

SPEECH PRODUCTION: Speech is produced when air is forced from the lungs through the vocal cords and along the vocal tract. The vocal tract extends from the opening in the vocal cords (called the glottis) to the mouth, and in an average man is about 17 cm long. It introduces short-term correlations (of the order of 1 ms) into the speech signal, and can be thought of as a filter with broad resonances called formants. The frequencies of these formants are controlled by varying

the shape of the tract, for example by moving the position of the tongue. An important part of many speech codecs is the modeling of the vocal tract as a short-term filter. As the shape of the vocal tract varies relatively slowly, the transfer function of its modeling filter needs to be updated only relatively infrequently (typically every 20 ms or so). The vocal tract filter is excited by air forced into it through the vocal cords. Speech sounds can be broken into three classes depending on their mode of excitation.

Voiced sounds are produced when the vocal cords vibrate open and closed, thus interrupting the flow of air from the lungs to the vocal tract and producing quasi-periodic pulses of air as the excitation. The rate of the opening and closing gives the pitch of the sound. Varying the shape of, and the tension in, the vocal cords, and the pressure of the air behind them can adjust this. Voiced sounds show a high degree of periodicity at the pitch period, which is typically between 2 and 20 ms. This long-term periodicity can be seen in Figure 1 which shows a segment of voiced speech sampled at 8 kHz. Here the pitch period is about 8 ms or 64 samples.

Unvoiced sounds result when the excitation is a noise-like turbulence produced by forcing air at high velocities through a constriction in the vocal tract while the glottis is held open. Such sounds show little long-term periodicity as can be seen from Figures 3 and 4 although short-term correlations due to the vocal tract are still present.

Plosive sounds result when a complete closure is made in the vocal tract, air pressure is built up behind this closure and then released suddenly. Some sounds cannot be considered to fall into any one of the three classes above, but are a mixture. For example, voiced fricatives result when both vocal cord vibration and a constriction in the vocal tract are present. Although there are many possible speech sounds which can be produced, the shape of the vocal tract and its mode of excitation change relatively slowly, and so speech can be considered to be quasi-stationary over short periods of time (of the order of 20 ms). Speech signals show a high degree of predictability, due partly to the quasi-periodic vibrations of the vocal cords and partly to the resonances of the vocal tract. Speech coders attempt to exploit this predictability in order to reduce the data rate necessary for good-quality voice transmission. From the technical, signal-oriented point of view, the production of speech is widely described as a two-stage process: in the first stage the sound is initiated, and in the second stage it is filtered. This distinction between the two stages has its origin in the source-filter model of speech production.

Fig: Source Filter Model of Speech Production

The basic assumption of the model is that the source signal produced at the glottal level is linearly filtered through the vocal tract. The resulting sound is emitted to the surrounding air through radiation loading (lips). The model assumes that source and filter are independent of each other. Although recent findings show some interaction between the vocal tract and a glottal source (Rothenberg 1981; Fant 1986), Fant's theory of speech production is still used as a framework for the description of the human voice, especially as far as the articulation of vowels is concerned.

SPEECH PROCESSING: The term speech processing basically refers to the scientific discipline concerned with the analysis and processing of speech signals in order to achieve the best benefit in various practical scenarios. The field of speech processing is, at present, undergoing rapid growth in terms of both performance and applications, stimulated by advances in microelectronics, computation and algorithm design. Nevertheless, speech processing still covers an extremely broad area, which relates to the following three engineering applications: speech coding and transmission, which is mainly concerned with man-to-man voice communication; speech synthesis, which deals with machine-to-man communication; and speech recognition, relating to man-to-machine communication.

Speech Coding: Speech coding or compression is the field concerned with compact digital representations of speech signals for the purpose of efficient transmission or storage. The central objective is to represent a signal with a minimum number of bits while maintaining perceptual quality. Current applications for speech and audio coding algorithms include cellular and personal communications networks (PCNs), teleconferencing, desktop multimedia systems, and secure communications.

SPEECH SYNTHESIS: The process of converting a command sequence or input text (words or sentences) into a speech waveform using algorithms and previously coded speech data is known as speech synthesis. The input text can be provided by keyboard, by optical character recognition, or from a previously stored database. A speech synthesizer can be characterized by the size of the speech units it concatenates to yield the output speech, as well as by the method used to code, store and synthesize the speech. If large speech units are involved, such as phrases and sentences, high-quality output speech can be achieved, at the cost of large memory requirements. Conversely, efficient coding methods can be used to reduce memory needs, but these usually degrade speech quality.

NOISE SOURCES: Sources of noise exist throughout the environment. One type of noise is due to turbulence and is therefore totally random and impossible to predict. Engineers like to look at signals, noise included, in the frequency domain, asking "How is the noise energy distributed as a function of frequency?" These turbulent noises tend to distribute their energy evenly across the frequency bands and are therefore referred to as "Broadband Noise". A very commonly encountered term is white noise, which falls under the category of broadband noise. White noise is noise having a frequency spectrum that is continuous and uniform over a specified frequency band; it has equal power per hertz over that band (synonym: additive white Gaussian noise). Examples of broadband noise are the low-frequency noise from jet planes and the impulse noise of an explosion.

A large number of environmental noises are different. These "Narrow Band Noises" concentrate most of their noise energy at specific frequencies. When the source of the noise is a rotating or repetitive machine, the noise frequencies are all multiples of a basic "Noise Cycle" and the noise is approximately periodic. This "Tonal Noise" is common in the environment, as man-made machinery tends to generate it (along with a smaller amount of broadband noise) at increasingly high levels.

Filter design
Filter design is the process of designing a filter (in the sense in which the term is used in signal processing, statistics, and applied mathematics), often a linear shift-invariant filter, which satisfies a set of requirements, some of which are contradictory. The problem is to find a realization of the filter which meets each of the requirements to a sufficient degree to make it useful. The filter design process can be described as an optimization problem where each requirement contributes with a term to an error function which should be minimized. Certain parts of the design process can be automated, but normally an experienced electrical engineer is needed to get a good result.

Typical design requirements:


Typical requirements which are considered in the design process are:

The filter should have a specific frequency response
The filter should have a specific impulse response
The filter should be causal
The filter should be stable
The filter should be localized
The computational complexity of the filter should be low
The filter should be implemented in particular hardware or software

The frequency function:


Typical examples of frequency functions are:

A low-pass filter is used to cut unwanted high-frequency signals. A high-pass filter passes high frequencies fairly well; it is helpful as a filter to cut any unwanted low frequency components.

A band-pass filter passes a limited range of frequencies. A band-stop filter passes frequencies above and below a certain range. A very narrow band-stop filter is known as a notch filter.

A low-shelf filter passes all frequencies, but increases or reduces frequencies below the cutoff frequency by a specified amount.

A high-shelf filter passes all frequencies, but increases or reduces frequencies above the cutoff frequency by a specified amount.

A peak EQ filter makes a peak or a dip in the frequency response, commonly used in graphic equalizers.

An all-pass filter passes through all frequencies unchanged, but changes the phase of the signal. This is a filter commonly used in phaser effects.

An important parameter is the required frequency response. In particular, the steepness and complexity of the response curve is a deciding factor for the filter order and feasibility. A first-order recursive filter will only have a single frequency-dependent component. This means that the slope of the frequency response is limited to 6 dB per octave. For many purposes, this is not sufficient. To achieve steeper slopes, higher-order filters are required. In relation to the desired frequency function, there may also be an accompanying weighting function which describes, for each frequency, how important it is that the resulting frequency function approximates the desired one. The larger the weight, the more important a close approximation is.

The impulse response


There is a direct correspondence between the filter's frequency function and its impulse response: the former is the Fourier transform of the latter. This means that any requirement on the frequency function is a requirement on the impulse response, and vice versa. However, in certain applications it may be the filter's impulse response which is explicit, and the design process then aims at producing as close an approximation as possible to the requested impulse response given all other requirements. In some cases it may even be relevant to consider a frequency function and an impulse response of the filter which are chosen independently of each other. For example, we may want both a specific frequency function of the filter and that the resulting filter have as small an effective width in the signal domain as possible. The latter condition can be realized by considering a very narrow function as the wanted impulse response of the filter, even though this function has no relation to the desired frequency function. The goal of the design process is then to realize a filter which tries to meet both of these contradicting design goals as far as possible.

Causality
In order to be implementable, any time-dependent filter must be causal: the filter response only depends on the current and past inputs. A standard approach is to leave this requirement until the final step. If the resulting filter is not causal, it can be made causal by introducing an appropriate time-shift (or delay). If the filter is a part of a larger system (which it normally is) these types of delays have to be introduced with care since they affect the operation of the entire system.

Stability
A stable filter assures that every limited input signal produces a limited filter response. A filter which does not meet this requirement may in some situations prove useless or even harmful. Certain design approaches can guarantee stability, for example by using only feedforward circuits such as an FIR filter. On the other hand, filters based on feedback circuits have other advantages and may therefore be preferred, even though this class of filters includes unstable filters. In this case, the filters must be carefully designed in order to avoid instability.

Locality
In certain applications we have to deal with signals which contain components which can be described as local phenomena, for example pulses or steps, which have certain time duration. A consequence of applying a filter to a signal is, in intuitive terms, that the duration of the local phenomena is extended by the width of the filter. This implies that it is sometimes important to keep the width of the filter's impulse response function as short as possible. According to the uncertainty relation of the Fourier transform, the product of the width of the filter's impulse response function and the width of its frequency function must exceed a certain constant. This means that any requirement on the filter's locality also implies a bound on its frequency function's width. Consequently, it may not be possible to simultaneously meet requirements on the locality of the filter's impulse response function as well as on its frequency function. This is a typical example of contradicting requirements.

Computational complexity
A general desire in any design is that the number of operations (additions and multiplications) needed to compute the filter response is as low as possible. In certain applications, this desire is a strict requirement, for example due to limited computational resources, limited power resources, or limited time. The last limitation is typical in real-time applications. There are several ways in which a filter can have different computational complexity. For example, the order of a filter is more or less proportional to the number of operations. This means that by choosing a low order filter, the computation time can be reduced. For discrete filters the computational complexity is more or less proportional to the number of filter coefficients. If the filter has many coefficients, for example in the case of

multidimensional signals such as tomography data, it may be relevant to reduce the number of coefficients by removing those which are sufficiently close to zero. Another issue related to computational complexity is separability, that is, whether and how a filter can be written as a convolution of two or more simpler filters. In particular, this issue is of importance for multidimensional filters, e.g., 2D filters which are used in image processing. In this case, a significant reduction in computational complexity can be obtained if the filter can be separated as the convolution of one 1D filter in the horizontal direction and one 1D filter in the vertical direction. A result of the filter design process may, e.g., be to approximate some desired filter as a separable filter or as a sum of separable filters.

Other considerations
It must also be decided how the filter is going to be implemented:

Analog filter
Analog sampled filter
Digital filter
Mechanical filter

Analog filters
The design of linear analog filters is for the most part covered in the linear filter section.

Digital filters
Digital filters are classified into one of two basic forms, according to how they respond to a unit impulse:

Finite impulse response, or FIR, filters express each output sample as a weighted sum of the last N inputs, where N is the order of the filter. Since they do not use feedback, they are inherently stable. If the coefficients are symmetrical (the usual case), then such a filter is linear phase, so it delays signals of all frequencies equally. This is important in many applications. It is also straightforward to avoid overflow in an FIR filter. The main disadvantage is that they may require significantly more processing and memory resources than cleverly designed IIR variants. FIR filters are generally easier to design than IIR filters; the Remez exchange algorithm is one suitable method for designing quite good filters semi-automatically. (See Methodology.)

Infinite impulse response, or IIR, filters are the digital counterpart to analog filters. Such a filter contains internal state, and the output and the next internal state are determined by a linear combination of the previous inputs and outputs (in other words, they use feedback, which FIR filters normally do not). In theory, the impulse response of such a filter never dies out completely, hence the name IIR, though in practice, this is not true given the finite resolution of computer arithmetic. IIR filters normally require less computing resources than an FIR filter of similar performance. However, due to the feedback, high order IIR filters may have problems with instability, arithmetic overflow, and limit cycles, and require careful design to avoid such pitfalls. Additionally, since the phase shift is inherently a non-linear function of frequency, the time delay through such a filter is frequency-dependent, which can be a problem in many situations. 2nd order IIR filters are often called 'biquads' and a common implementation of higher order filters is to cascade biquads. A useful reference for computing biquad coefficients is the RBJ Audio EQ Cookbook.

Sample rate
Unless the sample rate is fixed by some outside constraint, selecting a suitable sample rate is an important design decision. A high rate will require more in terms of computational resources, but less in terms of anti-aliasing filters. Interference and beating with other signals in the system may also be an issue.

Anti-aliasing
For any digital filter design, it is crucial to analyze and avoid aliasing effects. Often, this is done by adding analog anti-aliasing filters at the input and output, thus avoiding any frequency component above the Nyquist frequency. The complexity (i.e., steepness) of such filters depends on the required signal to noise ratio and the ratio between the sampling rate and the highest frequency of the signal.

Theoretical basis

Parts of the design problem relate to the fact that certain requirements are described in the frequency domain while others are expressed in the signal domain and that these may contradict. For example, it is not possible to obtain a filter which has both an arbitrary impulse response and arbitrary frequency function. Other effects which refer to relations between the signal and frequency domain are

The uncertainty principle between the signal and frequency domains
The variance extension theorem
The asymptotic behaviour of one domain versus discontinuities in the other

The uncertainty principle


As stated in the uncertainty principle, the product of the width of the frequency function and the width of the impulse response cannot be smaller than a specific constant. This implies that if a specific frequency function is requested, corresponding to a specific frequency width, the minimum width of the filter in the signal domain is set. Vice versa, if the maximum width of the response is given, this determines the smallest possible width in the frequency domain. This is a typical example of contradicting requirements, where the filter design process may try to find a useful compromise.

The variance extension theorem

Let σs² be the variance of the input signal and let σf² be the variance of the filter. The variance σr² of the filter response is then given by

σr² = σs² + σf²

This means that σr ≥ σf and implies that the localization of various features, such as pulses or steps, in the filter response is limited by the filter width in the signal domain. If a precise localization is requested, we need a filter of small width in the signal domain and, via the uncertainty principle, its width in the frequency domain cannot be arbitrarily small.
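As a quick numerical illustration of the theorem (not taken from the report), the following Python sketch convolves a Gaussian-shaped pulse of width σs = 10 samples with a Gaussian-shaped filter of width σf = 5 samples and checks that the width of the response is approximately √(σs² + σf²); the shapes and widths are arbitrary choices.

```python
import numpy as np

def width(x, t):
    """Standard deviation of a nonnegative sequence x regarded as a distribution over t."""
    p = x / x.sum()
    m = (t * p).sum()
    return np.sqrt(((t - m) ** 2 * p).sum())

t = np.arange(-200, 201)
pulse = np.exp(-0.5 * (t / 10.0) ** 2)    # "input" with width sigma_s = 10
filt  = np.exp(-0.5 * (t / 5.0) ** 2)     # filter with width sigma_f = 5

response = np.convolve(pulse, filt)
t_r = np.arange(len(response)) - 400      # support of the convolution

s_s, s_f, s_r = width(pulse, t), width(filt, t), width(response, t_r)
print(s_s, s_f, s_r, np.sqrt(s_s ** 2 + s_f ** 2))   # s_r is close to sqrt(s_s^2 + s_f^2)
```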

Discontinuities versus asymptotic behaviour


Let f(t) be a function and let F(ω) be its Fourier transform. There is a theorem which states that if the first derivative of F which is discontinuous has order n, then f has an asymptotic decay like t^(−n−1). A consequence of this theorem is that the frequency function of a filter should be as smooth as possible to allow its impulse response to have a fast decay, and thereby a short width.

Methodology
One common method for designing FIR filters is the Remez exchange algorithm. Here the user specifies a desired frequency response, a weighting function for errors from this response, and a filter order N. The algorithm then finds the set of N coefficients that minimize the maximum deviation from the ideal. Intuitively, this finds the filter that is as close as you can get to the desired response given that you can use only N coefficients. This method is particularly easy in practice, and at least one text[1] includes a program that takes the desired filter and N and returns the optimum coefficients. One possible drawback of filters designed this way is that they contain many small ripples in the passband(s), since such a filter minimizes the peak error.

Another method for finding a discrete FIR filter is the filter optimization described in Knutsson et al., which minimizes the integral of the square of the error instead of its maximum value. In its basic form, this approach requires that an ideal frequency function of the filter FI(ω) is specified, together with a frequency weighting function W(ω) and a set of coordinates xk in the signal domain where the filter coefficients are located. An error function is defined as

ε = || W(ω) ( FI(ω) − F{f}(ω) ) ||

where f(x) is the discrete filter and F{f}(ω) is its discrete-time Fourier transform defined on the specified set of coordinates. The norm used here is, formally, the usual norm on L² spaces. This means that ε measures the deviation between the requested frequency function of the filter, FI, and the actual frequency function of the realized filter, F{f}. However, the deviation is also subject to the weighting function W before the error function is computed. Once the error function is established, the optimal filter is given by the coefficients f(x) which minimize ε². This can be done by solving the corresponding least squares problem. In practice, the L² norm has to be approximated by means of a suitable sum over discrete points in the frequency domain. In general, however, these points should be significantly more numerous than the number of coefficients in the signal domain to obtain a useful approximation.
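As a concrete, hedged illustration of the least-squares approach, SciPy's signal.firls designs an FIR filter by minimizing a weighted integral of the squared error over specified frequency bands; the sample rate, band edges, weights and filter length below are illustrative choices, not values from the text.

```python
import numpy as np
from scipy import signal

fs = 8000.0                          # sample rate in Hz (illustrative)
numtaps = 61                         # firls requires an odd number of taps

# Desired response: pass below 1 kHz, stop above 1.5 kHz; weight the stopband 10x.
bands   = [0, 1000, 1500, fs / 2]    # band edges in Hz
desired = [1, 1, 0, 0]               # desired gain at each band edge
weight  = [1, 10]                    # relative importance of each band

h = signal.firls(numtaps, bands, desired, weight=weight, fs=fs)

# Inspect the achieved frequency response.
w, H = signal.freqz(h, worN=1024, fs=fs)
print(h.shape, np.abs(H[:4]))
```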

Simultaneous optimization in both domains


The previous method can be extended to include an additional error term related to a desired filter impulse response in the signal domain, with a corresponding weighting function. The ideal impulse response can be chosen independently of the ideal frequency function and is in practice used to limit the effective width and to remove ringing effects of the resulting filter in the signal domain. This is done by choosing a narrow ideal filter impulse response function, e.g., an impulse, and a weighting function which grows fast with the distance from the origin, e.g., the distance squared. The optimal filter can still be calculated by solving a simple least squares problem and the resulting filter is then a "compromise" which has a total optimal fit to the ideal functions in both domains. An important parameter is the relative strength of the two weighting functions which determines in which domain it is more important to have a good fit relative to the ideal function.

FINITE IMPULSE RESPONSE: A finite impulse response (FIR) filter is a type of a digital filter. The impulse response, the filter's response to a Kronecker delta input, is finite because it settles to zero in a finite number of sample intervals. This is in contrast to infinite impulse response (IIR) filters, which have internal feedback and may continue to respond indefinitely. The impulse response of an Nth-order FIR filter lasts for N+1 samples, and then dies to zero.

The difference equation that defines the output of an FIR filter in terms of its input is:

y[n] = b0 x[n] + b1 x[n−1] + ... + bN x[n−N] = Σ (i = 0 to N) bi x[n−i]

where:

x[n] is the input signal,
y[n] is the output signal,
bi are the filter coefficients, and
N is the filter order; an Nth-order filter has (N + 1) terms on the right-hand side. These are commonly referred to as taps.

This equation can also be expressed as a convolution of the coefficient sequence bi with the input signal:

y[n] = (b ∗ x)[n]

That is, the filter output is a weighted sum of the current and a finite number of previous values of the input.
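A minimal sketch of this weighted-sum computation in Python (the coefficients and input are illustrative): the direct double loop implements the difference equation above, and np.convolve gives the same result.

```python
import numpy as np

def fir_filter(b, x):
    """Direct-form FIR: y[n] = sum_i b[i] * x[n-i], with x assumed zero for n < 0."""
    N = len(b) - 1
    y = np.zeros(len(x))
    for n in range(len(x)):
        for i in range(N + 1):
            if n - i >= 0:
                y[n] += b[i] * x[n - i]
    return y

b = np.array([0.2, 0.3, 0.3, 0.2])        # illustrative 3rd-order (4-tap) filter
x = np.random.randn(50)                    # arbitrary input

y_direct = fir_filter(b, x)
y_conv   = np.convolve(b, x)[:len(x)]      # same result via convolution
print(np.allclose(y_direct, y_conv))       # True
```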

An FIR filter has a number of useful properties which sometimes make it preferable to an infinite impulse response (IIR) filter. FIR filters:
1. Are inherently stable. This is due to the fact that all the poles are located at the origin and thus lie within the unit circle.
2. Require no feedback. This means that any rounding errors are not compounded by summed iterations; the same relative error occurs in each calculation. This also makes implementation simpler.
3. Can easily be designed to be linear phase by making the coefficient sequence symmetric. Linear phase, or phase change proportional to frequency, corresponds to equal delay at all frequencies. This property is sometimes desired for phase-sensitive applications, for example crossover filters and mastering.
The main disadvantage of FIR filters is that considerably more computation power is required compared to an IIR filter with similar sharpness or selectivity, especially when low-frequency (relative to the sample rate) cutoffs are needed.

IMPULSE RESPONSE: The impulse response h[n] can be calculated if we set x[n] = δ[n] in the above relation, where δ[n] is the Kronecker delta impulse. The impulse response for an FIR filter then becomes the set of coefficients bn, as follows:

h[n] = bn for n = 0 to N.

The Z-transform of the impulse response yields the transfer function of the FIR filter:

H(z) = Σ (n = 0 to N) bn z^(−n)

FIR filters are clearly bounded-input bounded-output (BIBO) stable, since the output is a sum of a finite number of finite multiples of the input values, and so can be no greater than Σ|bi| times the largest value appearing in the input.

FILTER DESIGN: To design a filter means to select the coefficients such that the system has specific characteristics. The required characteristics are stated in the filter specifications. Most of the time, filter specifications refer to the frequency response of the filter. There are different methods to find the coefficients from frequency specifications: 1. Window design method 2. Frequency sampling method 3. Weighted least squares design 4. Parks-McClellan method (also known as the equiripple, optimal, or minimax method). The Remez exchange algorithm is commonly used to find an optimal equiripple set of coefficients. Here the user specifies a desired frequency response, a weighting function for errors from this response, and a filter order N. The algorithm then finds the set of (N + 1) coefficients that minimize the maximum deviation from the ideal. Intuitively, this finds the filter that is as close as you can get to the desired response given that you can use only (N + 1) coefficients. This method is particularly easy in practice, since at least one text[1] includes a program that takes the desired filter and N, and returns the optimum coefficients.

Software packages like MATLAB, GNU Octave, Scilab, and SciPy provide convenient ways to apply these different methods. Some of the time, the filter specifications refer to the time-domain shape of the input signal the filter is expected to "recognize". The optimum matched filter is obtained by sampling that shape and using those samples directly as the coefficients of the filter, giving the filter an impulse response that is the time-reverse of the expected input signal.

WINDOW DESIGN METHOD: In the window design method, one designs an ideal IIR filter and then applies a window function to it in the time domain, multiplying the infinite impulse response by the window function. This results in the frequency response of the ideal filter being convolved with the frequency response of the window function[2]; thus the imperfections of the FIR filter (compared to the ideal IIR filter) can be understood in terms of the frequency response of the window function. The ideal frequency response of a window is a Dirac delta function, as that would leave the frequency response of the FIR filter identical to that of the ideal IIR filter, but this is not attainable for finite windows, and deviations from it yield differences between the FIR response and the ideal response.
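As a hedged example of the window design method in practice, SciPy's signal.firwin windows an ideal sinc impulse response internally; the sample rate, cutoff, window and filter length below are illustrative.

```python
from scipy import signal

fs = 8000.0                    # sample rate in Hz (illustrative)
numtaps = 101                  # filter length
cutoff = 1000.0                # cutoff frequency in Hz

# firwin builds the ideal (sinc) low-pass impulse response and applies the window.
h = signal.firwin(numtaps, cutoff, window="hamming", fs=fs)

w, H = signal.freqz(h, worN=2048, fs=fs)
print(h.sum())                 # DC gain, close to 1 for a low-pass design
```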

MOVING AVERAGE EXAMPLE

Fig: Block diagram of a simple FIR filter (2nd-order/3-tap filter in this case, implementing a moving average)

A moving average filter is a very simple FIR filter. The filter coefficients are found via the following equation:

bi = 1/(N + 1) for i = 0, 1, ..., N

To provide a more specific example, we select the filter order N = 2. The impulse response of the resulting filter is:

h[n] = (1/3) δ[n] + (1/3) δ[n−1] + (1/3) δ[n−2]

The following figure shows the block diagram of such a 2nd-order moving-average filter. To discuss stability and spectral topics we take the z-transform of the impulse response:

H(z) = (1/3) (1 + z^(−1) + z^(−2)) = (z² + z + 1) / (3z²)

The following figure shows the pole-zero diagram of the filter. Two poles are located at the origin, and two zeros are located at z = −1/2 ± j(√3/2), i.e. on the unit circle at angles ±2π/3.

The frequency response, for frequency ω in radians per sample, is:

H(e^(jω)) = (1/3) (1 + e^(−jω) + e^(−j2ω))

The following figure shows the absolute value of the frequency response. Clearly, the moving-average filter passes low frequencies with a gain near 1, and attenuates high frequencies. This is a typical low-pass filter characteristic. Frequencies above ω = π are aliases of the frequencies below π, and are generally ignored or filtered out when reconstructing a continuous-time signal.

The following figure shows the phase response.
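The moving-average example above can be checked numerically; the following sketch (illustrative, using SciPy) computes the zero locations and the magnitude response of the 3-tap filter.

```python
import numpy as np
from scipy import signal

b = np.array([1 / 3, 1 / 3, 1 / 3])     # 2nd-order (3-tap) moving average
a = np.array([1.0])

# Zeros of H(z): roots of z^2 + z + 1, on the unit circle at angles +/- 2*pi/3.
print(np.roots([1.0, 1.0, 1.0]))

w, H = signal.freqz(b, a, worN=512)     # w in radians/sample, from 0 to pi
print(np.abs(H[0]), np.abs(H).min())    # gain ~1 at DC, null near w = 2*pi/3
```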

INFINITE IMPULSE RESPONSE

Infinite impulse response (IIR) is a property of signal processing systems. Systems with this property are known as IIR systems or, when dealing with filter systems, as IIR filters. IIR systems have an impulse response function that is non-zero over an infinite length of time. This is in contrast to finite impulse response filters (FIR), which have fixed-duration impulse responses. The simplest analog IIR filter is an RC filter made up of a single resistor (R) feeding into a node shared with a single capacitor (C). This filter has an exponential impulse response characterized by an RC time constant.

IIR filters may be implemented as either analog or digital filters. In digital IIR filters, the output feedback is immediately apparent in the equations defining the output. Note that, unlike with FIR filters, in designing IIR filters it is necessary to carefully consider the "time zero" case, in which the outputs of the filter have not yet been clearly defined. Design of digital IIR filters is heavily dependent on that of their analog counterparts, because there are plenty of resources, works and straightforward design methods concerning analog feedback filter design while there are hardly any for digital IIR filters. As a result, usually, when a digital IIR filter is going to be implemented, an analog filter (e.g. a Chebyshev filter, Butterworth filter or elliptic filter) is first designed and then converted to a digital filter by applying discretization techniques such as the bilinear transform or impulse invariance. Example IIR filters include the Chebyshev filter, Butterworth filter, and the Bessel filter.
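A hedged sketch of the design route described above: start from an analog prototype (here the single-pole RC low-pass mentioned earlier) and convert it to a digital IIR filter with the bilinear transform via scipy.signal.bilinear; the component values and sample rate are illustrative.

```python
import numpy as np
from scipy import signal

# Analog RC low-pass: H(s) = 1 / (1 + s*R*C)
R, C = 1e3, 1e-6                 # illustrative values: 1 kOhm, 1 uF -> cutoff ~159 Hz
b_analog = [1.0]
a_analog = [R * C, 1.0]

fs = 8000.0                      # sampling rate of the digital filter
b_dig, a_dig = signal.bilinear(b_analog, a_analog, fs=fs)

# The resulting digital filter has feedback (IIR): y[n] depends on y[n-1].
print(b_dig, a_dig)

w, H = signal.freqz(b_dig, a_dig, worN=1024, fs=fs)
print(np.abs(H[0]))              # gain ~1 at DC, as for the analog prototype
```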

TRANSFER FUNCTION DERIVATION: Digital filters are often described and implemented in terms of the difference equation that defines how the output signal is related to the input signal:

y[n] = (1/a0) ( b0 x[n] + b1 x[n−1] + ... + bP x[n−P] − a1 y[n−1] − a2 y[n−2] − ... − aQ y[n−Q] )

where:

P is the feedforward filter order,
bi are the feedforward filter coefficients,
Q is the feedback filter order,
aj are the feedback filter coefficients,
x[n] is the input signal, and
y[n] is the output signal.

A more condensed form of the difference equation is:

y[n] = (1/a0) ( Σ (i = 0 to P) bi x[n−i] − Σ (j = 1 to Q) aj y[n−j] )

which, when rearranged, becomes:

Σ (j = 0 to Q) aj y[n−j] = Σ (i = 0 to P) bi x[n−i]

To find the transfer function of the filter, we first take the Z-transform of each side of the above equation, where we use the time-shift property to obtain:

Σ (j = 0 to Q) aj z^(−j) Y(z) = Σ (i = 0 to P) bi z^(−i) X(z)

We define the transfer function to be:

H(z) = Y(z)/X(z) = ( Σ (i = 0 to P) bi z^(−i) ) / ( Σ (j = 0 to Q) aj z^(−j) )

Considering that in most IIR filter designs the coefficient a0 is 1, the IIR filter transfer function takes the more traditional form:

H(z) = ( Σ (i = 0 to P) bi z^(−i) ) / ( 1 + Σ (j = 1 to Q) aj z^(−j) )
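A minimal sketch implementing the difference equation in the traditional form above (a0 = 1) and checking it against scipy.signal.lfilter; the coefficients and input are illustrative.

```python
import numpy as np
from scipy import signal

def iir_filter(b, a, x):
    """y[n] = sum_i b[i] x[n-i] - sum_{j>=1} a[j] y[n-j], assuming a[0] = 1."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = 0.0
        for i in range(len(b)):
            if n - i >= 0:
                acc += b[i] * x[n - i]
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y[n] = acc
    return y

b = np.array([0.2, 0.2])                 # illustrative feedforward coefficients
a = np.array([1.0, -0.6])                # illustrative feedback coefficients (a0 = 1)
x = np.random.randn(100)

print(np.allclose(iir_filter(b, a, x), signal.lfilter(b, a, x)))   # True
```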

DESCRIPTION OF BLOCK DIAGRAM

Fig: Simple IIR filter block diagram

A typical block diagram of an IIR filter looks like the one shown in the figure. The z^(−1) block is a unit delay. The coefficients and number of feedback/feedforward paths are implementation-dependent.

Stability: The transfer function allows us to judge whether or not a system is bounded-input, bounded-output (BIBO) stable. To be specific, the BIBO stability criterion requires that the ROC of the system include the unit circle. For example, for a causal system, all poles of the transfer function must have an absolute value smaller than one. In other words, all poles must be located within the unit circle in the z-plane. The poles are defined as the values of z which make the denominator of H(z) equal to 0:

1 + Σ (j = 1 to Q) aj z^(−j) = 0

Clearly, if aj ≠ 0 then the poles are not located at the origin of the z-plane. This is in contrast to the FIR filter, where all poles are located at the origin and which is therefore always stable.

IIR filters are sometimes preferred over FIR filters because an IIR filter can achieve a much sharper transition-region roll-off than an FIR filter of the same order.

Example: Let the transfer function of a filter H be

H(z) = 1 / (1 − a z^(−1))

with ROC a < |z| and 0 < a < 1. This filter has a pole at a and is stable and causal. The time-domain impulse response is h(n) = a^n u(n), which is non-zero for n ≥ 0.
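A small, hedged sketch of this stability check in Python: the poles are the roots of the denominator polynomial, and a causal filter is stable when they all lie inside the unit circle. The coefficient values are illustrative.

```python
import numpy as np

def is_stable(a):
    """Causal IIR filter is BIBO stable if all poles lie inside the unit circle.
    a holds the denominator coefficients [a0, a1, ..., aQ] of 1 + a1 z^-1 + ... """
    poles = np.roots(a)          # roots of a0 z^Q + a1 z^(Q-1) + ... + aQ
    return bool(np.all(np.abs(poles) < 1.0))

print(is_stable([1.0, -0.9]))    # pole at z = 0.9  -> stable (like h(n) = 0.9^n u(n))
print(is_stable([1.0, -1.1]))    # pole at z = 1.1  -> unstable
```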

ADAPTIVE FILTER An adaptive filter is a filter that self-adjusts its transfer function according to an optimizing algorithm. Because of the complexity of the optimizing algorithms, most adaptive filters are digital filters that perform digital signal processing and adapt their performance based on the input signal. By way of contrast, a non-adaptive filter has static filter coefficients (which collectively form the transfer function). For some applications, adaptive coefficients are required since some parameters of the desired processing operation (for instance, the properties of some noise signal) are not known in advance. In these situations it is common to employ an adaptive filter, which uses feedback to refine the values of the filter coefficients and hence its frequency response. Generally speaking, the adapting process involves the use of a cost function, which is a criterion for optimum performance of the filter (for example, minimizing the noise component of

the input), to feed an algorithm, which determines how to modify the filter coefficients to minimize the cost on the next iteration. As the power of digital signal processors has increased, adaptive filters have become much more common and are now routinely used in devices such as mobile phones and other communication devices, camcorders and digital cameras, and medical monitoring equipment.

EXAMPLE Suppose a hospital is recording a heart beat (an ECG), which is being corrupted by a 50 Hz noise (the frequency coming from the power supply in many countries). One way to remove the noise is to filter the signal with a notch filter at 50 Hz. However, due to slight variations in the power supply to the hospital, the exact frequency of the power supply might (hypothetically) wander between 47 Hz and 53 Hz. A static filter would need to remove all the frequencies between 47 and 53 Hz, which could excessively degrade the quality of the ECG since the heart beat would also likely have frequency components in the rejected range. To circumvent this potential loss of information, an adaptive filter could be used. The adaptive filter would take input both from the patient and from the power supply directly and would thus be able to track the actual frequency of the noise as it fluctuates. Such an adaptive technique generally allows for a filter with a smaller rejection range, which means, in our case, that the quality of the output signal is more accurate for medical diagnoses. BLOCK DIAGRAM The block diagram, shown in the following figure, serves as a foundation for particular adaptive filter realisations, such as Least Mean Squares (LMS) and Recursive Least Squares (RLS). The idea behind the block diagram is that a variable filter extracts an estimate of the desired signal.

To start the discussion of the block diagram, we make the following assumptions:

The input signal is the sum of a desired signal d(n) and interfering noise v(n):

x(n) = d(n) + v(n)

The variable filter has a Finite Impulse Response (FIR) structure. For such structures the impulse response is equal to the filter coefficients. The coefficients for a filter of order p are defined as

wn = [wn(0), wn(1), ..., wn(p)]^T

The error signal or cost function is the difference between the desired and the estimated signal:

e(n) = d(n) − d̂(n)

The variable filter estimates the desired signal by convolving the input signal with the impulse response. In vector notation this is expressed as

d̂(n) = wn^T x(n)

where

x(n) = [x(n), x(n−1), ..., x(n−p)]^T

is an input signal vector. Moreover, the variable filter updates the filter coefficients at every time instant:

wn+1 = wn + Δwn

where Δwn is a correction factor for the filter coefficients. The adaptive algorithm generates this correction factor based on the input and error signals. LMS and RLS define two different coefficient update algorithms.

APPLICATIONS OF ADAPTIVE FILTERS

Noise cancellation
Signal prediction
Adaptive feedback cancellation
Echo cancellation

LEAST MEAN SQUARES (LMS): The least mean squares (LMS) algorithm is a type of adaptive filter used to mimic a desired filter by finding the filter coefficients that produce the least mean square of the error signal (the difference between the desired and the actual signal). It is a stochastic gradient descent method in that the filter is only adapted based on the error at the current time. It was invented in 1960 by Stanford University professor Bernard Widrow and his first Ph.D. student, Ted Hoff.

PROBLEM FORMULATION:

Most linear adaptive filtering problems can be formulated using the block diagram above. That is, an unknown system h(n) is to be identified, and the adaptive filter attempts to adapt the filter ĥ(n) to make it as close as possible to h(n), while using only the observable signals x(n), d(n) and e(n); y(n), v(n) and h(n) are not directly observable. Its solution is closely related to the Wiener filter. The idea behind LMS filters is to use steepest descent to find filter weights ĥ(n) which minimize a cost function. We start by defining the cost function as

C(n) = E{ |e(n)|² }

where e(n) is the error at the current sample n and E{·} denotes the expected value. Applying steepest descent means taking the partial derivatives with respect to the individual entries of the filter coefficient (weight) vector:

∇ C(n) = 2 E{ ∇(e(n)) e*(n) }

where ∇ is the gradient operator. With

e(n) = d(n) − ĥ^H(n) x(n)

and ∇(e(n)) = −x(n), it follows that

∇ C(n) = −2 E{ x(n) e*(n) }

Now, ∇C(n) is a vector which points towards the steepest ascent of the cost function. To find the minimum of the cost function we need to take a step in the opposite direction of ∇C(n). To express that in mathematical terms:

ĥ(n+1) = ĥ(n) − (μ/2) ∇C(n) = ĥ(n) + μ E{ x(n) e*(n) }

where μ/2 is the step size. That means we have found a sequential update algorithm which minimizes the cost function. Unfortunately, this algorithm is not realizable until we know E{ x(n) e*(n) }.

SIMPLIFICATIONS: For most systems the expectation E{ x(n) e*(n) } must be approximated. This can be done with the following unbiased estimator:

Ê{ x(n) e*(n) } = (1/N) Σ (i = 0 to N−1) x(n−i) e*(n−i)

where N indicates the number of samples we use for that estimate. The simplest case is N = 1:

Ê{ x(n) e*(n) } = x(n) e*(n)

For that simple case the update algorithm follows as

ĥ(n+1) = ĥ(n) + μ x(n) e*(n)

Indeed, this constitutes the update algorithm for the LMS filter.

LMS ALGORITHM: The LMS algorithm for a pth-order filter can be summarized as

Parameters: p = filter order, μ = step size
Initialisation: ĥ(0) = 0
Computation: For n = 0, 1, 2, ...

x(n) = [x(n), x(n−1), ..., x(n−p+1)]^T
e(n) = d(n) − ĥ^H(n) x(n)
ĥ(n+1) = ĥ(n) + μ e*(n) x(n)

where ĥ^H(n) denotes the Hermitian transpose of ĥ(n).
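A minimal Python sketch of the LMS recursion summarized above, restricted to real-valued signals (so the Hermitian transpose reduces to an ordinary transpose); the system-identification setup, signal lengths and step size are illustrative choices.

```python
import numpy as np

def lms(x, d, p, mu):
    """LMS adaptive filter for real signals.
    x: input signal, d: reference/desired signal, p: number of taps, mu: step size.
    Returns the final coefficient estimate and the error signal."""
    w = np.zeros(p)                        # initialisation: h_hat(0) = 0
    e = np.zeros(len(x))
    for n in range(p, len(x)):
        xn = x[n - p + 1:n + 1][::-1]      # [x(n), x(n-1), ..., x(n-p+1)]
        y = w @ xn                         # filter output
        e[n] = d[n] - y                    # error
        w = w + mu * e[n] * xn             # coefficient update
    return w, e

# Illustrative system identification: unknown FIR system plus white Gaussian noise.
rng = np.random.default_rng(0)
h_true = np.array([0.5, -0.3, 0.1])
x = rng.standard_normal(5000)
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))

w_hat, e = lms(x, d, p=3, mu=0.01)
print(w_hat)                               # should be close to h_true
```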

LEAST MEAN SQUARE ALGORITHM:


Least mean squares (LMS) algorithms are used in adaptive filters to find the filter coefficients that produce the least mean square of the error signal (the difference between the desired and the actual signal). It is a stochastic gradient descent method in which the filter is adapted based on the error at the current time. It was invented in 1960 by Stanford University professor Bernard Widrow and his first Ph.D. student, Ted Hoff. The adaptive linear combiner output is a linear combination of the input samples, and the error in measurement is given by

e(k) = d(k) − X(k)^T W(k)

where X(k)^T is the transposed vector of input samples. To develop an adaptive algorithm, it is required to estimate the gradient of the mean-squared error ξ = E[e(k)²], for example by taking differences between short-term averages of e(k)². Instead, to develop the LMS algorithm, e(k)² itself is taken as the estimate of ξ. Thus, at each iteration in the adaptive process, the gradient estimate is of the form

∇̂(k) = ∇( e(k)² ) = 2 e(k) ∇( e(k) ) = −2 e(k) X(k)

With this simple estimate, the steepest-descent type of adaptive algorithm is specified as

W(k+1) = W(k) − μ ∇̂(k) = W(k) + 2 μ e(k) X(k)

This is the LMS algorithm, where μ is the gain constant that regulates the speed and stability of adaptation. Since the weight changes at each iteration are based on imperfect gradient estimates, the adaptive process is expected to be noisy. The LMS algorithm can be implemented without squaring, averaging or differentiation and is a simple and efficient process.

CONVERGENCE OF WEIGHT VECTOR: As with all adaptive algorithms, the primary concern with the LMS algorithm is its convergence to the weight vector solution where the error E[e(k)²] is minimized.

NORMALISED LEAST MEAN SQUARES FILTER (NLMS): The main drawback of the "pure" LMS algorithm is that it is sensitive to the scaling of its input x(n). This makes it very hard (if not impossible) to choose a learning rate μ that guarantees stability of the algorithm. The normalised least mean squares filter (NLMS) is a variant of the LMS algorithm that solves this problem by normalising with the power of the input. The NLMS algorithm can be summarised as:

Parameters: p = filter order, μ = step size
Initialization: ĥ(0) = 0
Computation: For n = 0, 1, 2, ...

x(n) = [x(n), x(n−1), ..., x(n−p+1)]^T
e(n) = d(n) − ĥ^H(n) x(n)
ĥ(n+1) = ĥ(n) + μ e*(n) x(n) / ( x^H(n) x(n) )
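A minimal sketch of the NLMS update for real-valued signals, differing from the LMS sketch earlier only in the normalised step; the small eps regularizer is a common practical safeguard and is not part of the summary above.

```python
import numpy as np

def nlms(x, d, p, mu, eps=1e-8):
    """NLMS adaptive filter: step size normalised by the instantaneous input power.
    eps guards against division by zero (a practical addition)."""
    w = np.zeros(p)
    e = np.zeros(len(x))
    for n in range(p, len(x)):
        xn = x[n - p + 1:n + 1][::-1]
        e[n] = d[n] - w @ xn
        w = w + (mu / (eps + xn @ xn)) * e[n] * xn
    return w, e
```

Because of the normalisation, the step size μ is commonly chosen in the range 0 < μ < 2, independently of the scaling of the input.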

OPTIMAL LEARNING RATE: It can be shown that if there is no interference (v(n) = 0), then the optimal learning rate for the NLMS algorithm is opt = 1 and is independent of the input x(n) and the real (unknown) impulse response general case with interference ( ), the optimal learning rate is . In the

The results above assume that the signals v(n) and x(n) are uncorrelated with each other, which is generally the case in practice.

PROOF: Let the filter misalignment be defined as Λ(n) = |h − ĥ(n)|². The expected misalignment for the next sample, E[Λ(n+1)], can be derived as a function of the learning rate. Assuming that v(n) and x(n) are independent, the optimal learning rate is found by setting dE[Λ(n+1)]/dμ = 0, which leads to the expression for μopt given above.

MEAN SQUARED ERROR: In statistics, the mean square error or MSE of an estimator is one of many ways to quantify the difference between an estimator and the true value of the quantity being estimated. MSE is a risk function, corresponding to the expected value of the squared error loss or quadratic loss. MSE measures the average of the square of the "error." The error is the amount by which the estimator differs from the quantity to be estimated. The difference occurs because of randomness or because the estimator doesn't account for information that could produce a more accurate estimate. The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator and its bias. For an unbiased estimator, the MSE is the variance. Like the variance, MSE has the same unit of measurement as the square of the quantity being estimated. In an analogy to standard deviation, taking the square root of MSE yields the root mean squared error or RMSE, which has the same units as the quantity being estimated; for an unbiased estimator, the RMSE is the square root of the variance, known as the standard error.

DEFINITION AND BASIC PROPERTIES: The MSE of an estimator θ̂ with respect to the estimated parameter θ is defined as

MSE(θ̂) = E[ (θ̂ − θ)² ]

The MSE is equal to the sum of the variance and the squared bias of the estimator:

MSE(θ̂) = Var(θ̂) + ( Bias(θ̂, θ) )²

The MSE thus assesses the quality of an estimator in terms of its variation and unbiasedness. Note that the MSE is not equivalent to the expected value of the absolute error.
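A quick, hedged numerical check of this decomposition: the sketch below estimates the MSE, variance and bias of the biased sample-variance estimator (division by n) by Monte Carlo; the true variance, sample size and number of trials are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 4.0                  # true population variance
n, trials = 10, 200000

samples = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))
est = samples.var(axis=1)     # biased variance estimator (divides by n)

mse  = np.mean((est - sigma2) ** 2)
var  = np.var(est)
bias = np.mean(est) - sigma2

print(mse, var + bias ** 2)   # the two numbers agree, up to Monte Carlo error
```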

Since the MSE is an expectation, it is a scalar and not a random variable. It may be a function of the unknown parameter θ, but it does not depend on any random quantities. However, when the MSE is computed for a particular estimator of θ whose true value is not known, it will be subject to estimation error. In a Bayesian sense, this means that there are cases in which it may be treated as a random variable.

ALTERNATIVE USAGE: The term mean squared error is sometimes used to refer to the residual sum of squares divided by the number of observations. This is an observed quantity, whereas the definition above is a function of an unknown parameter. For more details, see errors and residuals in statistics.

EXAMPLES: Suppose we have a random sample of size n from an identically distributed population. Some commonly used estimators of the true parameters of the population, μ and σ², are shown in the following table (see the notes below for the distribution requirements of the MSEs given for the variance estimators).

True value | Estimator | Mean squared error
θ = μ | X̄ = the unbiased estimator of the population mean | MSE(X̄) = σ²/n
θ = σ² | S²(n−1) = the unbiased estimator of the population variance (divisor n − 1) | MSE(S²(n−1)) = 2σ⁴/(n − 1)
θ = σ² | S²(n) = the biased estimator of the population variance (divisor n) | MSE(S²(n)) = (2n − 1)σ⁴/n²
θ = σ² | S²(n+1) = the biased estimator of the population variance (divisor n + 1) | MSE(S²(n+1)) = 2σ⁴/(n + 1)

Notes:

1. The MSEs shown for the variance estimators assume the samples are i.i.d. Gaussian, so that (n − 1)S²(n−1)/σ² follows a chi-squared distribution with n − 1 degrees of freedom. The result for S²(n−1) follows easily from that chi-squared variance, which is 2n − 2.

2. The general MSE expression for the unbiased variance estimator, without distribution assumptions, is MSE(S²(n−1)) = (1/n) ( μ4 − σ⁴ (n − 3)/(n − 1) ), where μ4 is the fourth central moment.[3]

3. Unbiased estimators may not produce estimates with the smallest total variation (as measured by MSE): S²(n−1)'s MSE is larger than S²(n+1)'s MSE.

4. Estimators with the smallest total variation may produce biased estimates: S²(n+1) typically underestimates σ² by 2σ²/(n + 1).

INTERPRETATION:

An MSE of zero, meaning that the estimator predicts observations of the parameter with perfect accuracy, is the ideal and forms the basis for the least squares method of regression analysis. While particular values of MSE other than zero are meaningless in and of themselves, they may be used for comparative purposes. Two or more statistical models may be compared using their MSEs as a measure of how well they explain a given set of observations: the unbiased model with the smallest MSE is generally interpreted as best explaining the variability in the observations. Both analysis of variance and linear regression techniques estimate the MSE as part of the analysis and use the estimated MSE to determine the statistical significance of the factors or predictors under study. The goal of experimental design is to construct experiments in such a way that when the observations are analyzed, the MSE is close to zero relative to the magnitude of at least one of the estimated treatment effects. MSE is also used in several stepwise regression techniques as part of the determination as to how many predictors from a candidate set to include in a model for a given set of observations.

APPLICATIONS:

Minimizing MSE is a key criterion in selecting estimators. Among unbiased estimators, minimizing the MSE is equivalent to minimizing the variance, and the estimator that achieves this is the MVUE. However, a biased estimator may have a lower MSE; see estimator bias.

In statistical modelling, the MSE is defined in terms of the differences between the actual observations and the responses predicted by the model, and it is used to determine whether the model does not fit the data or whether the model can be simplified by removing terms.

NOISE: In common use, the word noise means any unwanted sound. In both analog and digital electronics, noise is an unwanted perturbation to a wanted signal; it is called noise as a generalisation of the audible noise heard when listening to a weak radio transmission. Signal noise is heard as acoustic noise if played through a loudspeaker; it manifests as 'snow' on a television or video image. In signal processing or computing it can be considered unwanted data without meaning; that is, data that is not being used to transmit a signal, but is simply produced as an unwanted by-product of other activities. In information theory, however, noise is still considered to be information. In a broader sense, film grain or even advertisements encountered while looking for something else can be considered noise. In biology, noise can describe the variability of a measurement around the mean; for example, transcriptional noise describes the variability in gene activity between cells in a population. Noise can block, distort, change or interfere with the meaning of a message in both human and electronic communication. In many of these areas, the special case of thermal noise arises, which sets a fundamental lower limit to what can be measured or signaled and is related to basic physical processes at the molecular level described by well-established thermodynamic considerations, some of which are expressible by simple formulae.

GAUSSIAN NOISE:

Gaussian noise is statistical noise that has a probability density function (abbreviated pdf) of the normal distribution (also known as Gaussian distribution). In other words, the values that the noise can take on are Gaussian-distributed. It is most commonly used as additive white noise to yield additive white Gaussian noise (AWGN).

ADDITIVE WHITE GAUSSIAN NOISE (AWGN): Additive white Gaussian noise (AWGN) is a channel model in which the only impairment to communication is a linear addition of wideband or white noise with a constant spectral density (expressed as watts per hertz of bandwidth) and a Gaussian distribution of amplitude. The model does not account for fading, frequency selectivity, interference, nonlinearity or dispersion. However, it produces simple and tractable mathematical models, which are useful for gaining insight into the underlying behaviour of a system before these other phenomena are considered. Wideband Gaussian noise comes from many natural sources, such as the thermal vibrations of atoms in conductors (referred to as thermal noise or Johnson-Nyquist noise), shot noise, black-body radiation from the earth and other warm objects, and from celestial sources such as the Sun. The AWGN channel is a good model for many satellite and deep-space communication links. It is not a good model for most terrestrial links because of multipath, terrain blocking, interference, etc. However, for terrestrial path modeling, AWGN is commonly used to simulate the background noise of the channel under study, in addition to the multipath, terrain blocking, interference, ground clutter and self-interference that modern radio systems encounter in terrestrial operation.

LINEAR FREQUENCY MODULATION (FM):

Till now we've seen signals that do not change in frequency over time. How do we modify the signal to obtain a time-varying frequency?

A chirp signal is one that sweeps linearly from a low to a high frequency. Can we create such a signal by concatenating small sequences, each with a frequency that is higher than the last?

This approach will likely lead to problems lining up the phase of each segment so that discontinuities aren't introduced in the resulting waveform (as seen below).
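A minimal sketch of the usual remedy: integrate the instantaneous frequency to obtain a continuous phase, rather than concatenating fixed-frequency segments. The sample rate, duration and sweep range below are illustrative; scipy.signal.chirp produces an equivalent signal.

```python
import numpy as np

fs = 8000.0                         # sample rate (illustrative)
T = 1.0                             # duration in seconds
t = np.arange(0, T, 1 / fs)

f0, f1 = 100.0, 1000.0              # sweep from 100 Hz to 1000 Hz
f_inst = f0 + (f1 - f0) * t / T     # linearly increasing instantaneous frequency

# Integrate the instantaneous frequency to get a continuous phase -> no discontinuities.
phase = 2 * np.pi * np.cumsum(f_inst) / fs
x = np.cos(phase)

# scipy.signal.chirp(t, f0, T, f1, method='linear') gives an equivalent linear chirp.
print(x[:5])
```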

Figure: A signal made by concatenating sinusoids of different frequencies will contain discontinuities if care is not taken to match the initial phase of each segment.

THE PROPOSED ALGORITHM: In the conventional LMS algorithm, the weight vector coefficients w(n) for the FIR filter are updated according to the formula

w(n) = w(n−1) + μ e(n) y(n)   (1)

where w(n) = [w0(n) w1(n) ... wM(n)] (M+1 being the filter length), μ is the convergence parameter (sometimes referred to as the step size), e(n) = d(n) - z(n) is the output error (z(n) being the filter output), and d(n) is the reference signal. Note that z(n) = w(n-1) yᵀ(n), which serves as an estimate of the original signal, and y(n) = [y(n) y(n-1) ... y(n-M)] is the input signal to the filter.

For the algorithm to be useful for a range of FM signals with different bandwidths (including single-tone sinusoids), we first specify the centre frequency fm in the spectrum of interest. The conventional LMS algorithm is then used (with a single tone of frequency fm) to find an optimal value of μ at that frequency. This optimal value, μo, is then used to update the time-varying convergence parameter μ(n) according to the following formula:

μ(n) = μo λ(n)   (2)

where λ(n) is a decaying factor.

We will consider a power decaying law for λ(n) (equation (3)), in which C, a and b are positive constants that determine the magnitude and the rate of decrease of λ(n). According to this law, C has to be a positive number larger than 1 for λ(n) to decay; when C = 1, λ(n) is equal to 1 for all n and the new algorithm is the same as the conventional LMS algorithm. A summary of the time-varying LMS algorithm is shown below:

z(n) = w(n-1) yᵀ(n)   (4)
e(n) = d(n) - z(n)   (5)
λ(n) updated according to the decaying law of equation (3)   (6)
μ(n) = μo λ(n)   (7)
w(n) = w(n-1) + μ(n) e(n) y(n)   (8)
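A minimal MATLAB sketch of this recursion is given below. The specific decaying law used here, λ(n) = C^(-a·n^b), is an assumption chosen only because it satisfies the stated properties (λ(n) = 1 when C = 1, faster decay for larger C, a, b); the report's exact law, and the signal set-up, may differ.

mu_o = 0.0002;  C = 5;  a = 0.01;  b = 0.7;   % mu_o, a, b follow Figs. 6-7; C is an assumed value
M  = 10;                                      % filter order (Fig. 5 uses order 10)
w  = zeros(1, M+1);                           % tap-weight vector w(n)
Ts = 0.001;  N = 2000;  t = (0:N-1)*Ts;
d  = cos(2*pi*100*t);                         % reference signal d(n): a clean 100 Hz tone (illustration only)
y  = d + 0.3*randn(1, N);                     % noisy input y(n) to the filter
e  = zeros(1, N);
for n = M+1:N
    yv   = y(n:-1:n-M);                       % input vector [y(n) y(n-1) ... y(n-M)]
    z    = w * yv.';                          % filter output, eq. (4)
    e(n) = d(n) - z;                          % output error, eq. (5)
    lam  = C^(-a*n^b);                        % decaying factor, eq. (6), assumed form
    mu_n = mu_o * lam;                        % time-varying convergence parameter, eq. (7)
    w    = w + mu_n * e(n) * yv;              % weight update, eq. (8)
end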

Fig. 1. Spectrum for LFM narrowband signal with fo = 100 Hz, Bandwidth = 100 Hz and Ts = 0.001.

ADAPTIVE FILTERS:

Echoes in a communication system can be very annoying to callers. Adaptive filtering techniques are used to reduce this unwanted echo, thus increasing communication quality; a widely used technique to suppress echoes is to employ adaptive echo cancellers.

ADAPTIVE ECHO CANCELLERS:

A technique to remove or cancel echoes is shown in the figure. The echo canceller mimics the transfer function of the echo path (or room acoustics) to synthesize a replica of the echo, and then subtracts that replica from the combined echo and near-end speech (or disturbance) signal to obtain the near-end signal alone. However, the transfer function is unknown in practice, and so it must be identified. The solution to this problem is to use an adaptive filter; the method used to cancel the echo signal is known as adaptive filtering.

Adaptive filters are dynamic filters which iteratively alter their characteristics in order to achieve an optimal desired output. An adaptive filter algorithmically alters its parameters in order to minimize a function of the difference between the desired output d(n) and its actual output y(n); this function is known as the cost function of the adaptive algorithm. The figure shows a block diagram of the adaptive echo cancellation system implemented throughout this project. Here the filter H(n) represents the impulse response of the acoustic environment, and W(n) represents the adaptive filter used to cancel the echo signal. The adaptive filter aims to equate its output y(n) to the desired output d(n) (the signal reverberated within the acoustic environment). At each iteration the error signal, e(n) = d(n) - y(n), is fed back into the filter, and the filter characteristics are altered accordingly.

Block diagram of Adaptive Echo Canceller
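A compact MATLAB sketch of the loop described above is shown below. The far-end signal, the echo-path impulse response h, the canceller length and the step size are all made-up values for illustration, and the LMS update is used as the adaptive algorithm.

N  = 4000;  L = 32;                            % sample count and canceller length (assumed)
x  = randn(1, N);                              % far-end signal that excites the echo path
h  = [0.6 zeros(1,9) 0.3 zeros(1,9) 0.1];      % made-up echo-path impulse response H(n)
d  = filter(h, 1, x);                          % echo picked up at the near end, d(n)
w  = zeros(1, L);  mu = 0.01;                  % adaptive filter W(n) and step size (assumed)
e  = zeros(1, N);
for n = L:N
    xv   = x(n:-1:n-L+1);                      % most recent L far-end samples
    y    = w * xv.';                           % echo replica y(n) synthesised by the canceller
    e(n) = d(n) - y;                           % residual e(n) = d(n) - y(n), fed back to adapt W
    w    = w + mu * e(n) * xv;                 % LMS update of the canceller taps
end
% After convergence e(n) is close to zero and w approximates the echo path h.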

CHOICE OF ALGORITHM: A wide variety of recursive algorithms have been developed in the literature for the operation of linear adaptive filters. In the final analysis, the choice of one algorithm over another is determined by one or more of the following factors:

RATE OF CONVERGENCE:

This is defined as the number of iterations required for the algorithm, in response to stationary inputs, to converge close enough to the optimum Wiener solution in the mean-square error sense. A fast rate of convergence allows the algorithm to adapt rapidly to a stationary environment of unknown statistics.

MISADJUSTMENT:

For an algorithm of interest, this parameter provides a quantitative measure of the amount by which the final value of the mean-square error, averaged over an ensemble of adaptive filters, deviates from the minimum mean-square error produced by the Wiener filter.
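Stated as a formula (a standard textbook definition, included here for concreteness; the symbols are not taken from the report):

\[
  \mathcal{M} \;=\; \frac{J(\infty) - J_{\min}}{J_{\min}}
\]

where J(∞) is the steady-state, ensemble-averaged mean-square error of the adaptive filter and J_min is the minimum mean-square error produced by the Wiener filter.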

TRACKING: When an adaptive filtering algorithm operates in a non-stationary environment, it is required to track statistical variations in the environment. Two contradictory features, however, influence the tracking performance of the algorithm:

(1) Rate of convergence, and (2) steady-state fluctuation due to algorithm noise.

ROBUSTNESS:

For an adaptive filter to be robust, small disturbances (i.e., disturbances with small energy) should result in only small estimation errors. The disturbances may arise from a variety of factors, internal or external to the filter.

COMPUTATIONAL REQUIREMENTS:

Here the issues of concern include: (a) the number of operations (i.e., multiplications, divisions, and additions/subtractions) required to make one complete iteration of the algorithm; (b) the size of the memory locations required to store the data and the program; and (c) the investment required to program the algorithm on a computer.

APPROACH TO DEVELOP A LINEAR ADAPTIVE FILTER: STOCHASTIC GRADIENT APPROACH:

The stochastic gradient approach uses a tapped-delay line, or transversal filter, as the structural basis for implementing the linear adaptive filter. For the case of stationary inputs, the cost function, also referred to as the index of performance, is defined as the mean square error (i.e., the mean square value of the difference between the desired response and the transversal filter output). This cost function is precisely a second order function of the tap weights in the transversal filter.

To develop a recursive algorithm for updating the tap weights of the adaptive transversal filter, we proceed in two stages. First, we use an iterative procedure to solve the Wiener-Hopf equations (i.e., the matrix equation defining the optimum Wiener solution); the iterative procedure is based on the method of steepest descent, a well-known technique in optimization theory. This method requires the use of a gradient vector, the value of which depends on two parameters: the correlation matrix of the tap inputs of the transversal filter and the cross-correlation vector between the desired response and the same tap inputs. Next, we use instantaneous values for these correlations to derive an estimate of the gradient vector, which makes the estimate stochastic in character.
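In symbols (standard notation, not taken verbatim from the report): with R = E[x(n)xᵀ(n)] the tap-input correlation matrix and p = E[d(n)x(n)] the cross-correlation vector, the mean-square error J is quadratic in the tap-weight vector w and the steepest-descent recursion is

\[
  \nabla J(n) \;=\; 2R\,w(n) - 2p, \qquad
  w(n+1) \;=\; w(n) - \mu\,\nabla J(n) \;=\; w(n) + 2\mu\,\bigl[p - R\,w(n)\bigr].
\]

Replacing R and p by their instantaneous estimates x(n)xᵀ(n) and d(n)x(n) gives the stochastic-gradient (LMS) form quoted below.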

The resulting algorithm is widely known as the least-mean-square (LMS) algorithm, the essence of which, for the case of a transversal filter operating on real-valued data, may be described as

w(n+1) = w(n) + 2μ e(n) x(n)

where the error signal e(n) is defined as the difference between some desired response and the actual response of the transversal filter produced by the tap-input vector x(n).

LEAST MEAN SQUARE (LMS) ALGORITHM:

The Least Mean Square (LMS) algorithm was first developed by Widrow and Hoff in 1959 through their studies of pattern recognition. From there it has become one of the most widely used algorithms in adaptive filtering. The LMS algorithm is an important member of the family of stochastic gradient-based algorithms, as it utilizes the gradient vector of the filter tap weights to converge on the optimal Wiener solution. It is well known and widely used due to its computational simplicity, and it is this simplicity that has made it the benchmark against which all other adaptive filtering algorithms are judged.

The LMS algorithm is a linear adaptive filter algorithm, which in general consists of two basic processes.

1. A filtering process, which involves (a) computing the output of a linear filter in response to an input signal, and (b) generating an estimation error by comparing this output with a desired response.

2. An adaptive process, which involves the automatic adjustment of the parameters of the filter in accordance with the estimation error.

The combination of these two processes working together constitutes a feedback loop. First, we have a transversal filter, around which the LMS algorithm is built; this component is responsible for performing the filtering process. Second, we have a mechanism for performing the adaptive control process on the tap weights of the transversal filter. With each iteration of the LMS algorithm, the filter tap weights of the adaptive filter are updated according to the following formula (Farhang-Boroujeny 1999):

w(n+1) = w(n) + 2μ e(n) x(n)

Here x(n) is the input vector of time-delayed input values, x(n) = [x(n) x(n-1) x(n-2) ... x(n-N+1)].

The vector w(n) represents the coefficients of the adaptive FIR filter tap-weight vector at time n. The parameter μ is known as the step-size parameter and is a small positive constant that controls the influence of the updating term. Selection of a suitable value for μ is imperative to the performance of the LMS algorithm: if the value is too small, the time the adaptive filter takes to converge on the optimal solution will be too long; if μ is too large, the adaptive filter becomes unstable and its output diverges. The LMS algorithm is the simplest to implement and is stable when the step-size parameter is selected appropriately; however, this selection requires prior knowledge of the input signal statistics, which is not always feasible in an echo cancellation system.
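For reference, a textbook sufficient condition for convergence of the mean of the tap weights, for the update written with the factor 2μ as above, is

\[
  0 \;<\; \mu \;<\; \frac{1}{\lambda_{\max}},
\]

where λ_max is the largest eigenvalue of the input autocorrelation matrix R = E[x(n)xᵀ(n)]. Since λ_max ≤ trace(R) = N·E[x²(n)], the more conservative and more easily estimated bound 0 < μ < 1/(N·E[x²(n)]) is often used in practice, which is why some knowledge of the input power is needed.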

DERIVATION OF THE LMS ALGORITHM:

The derivation of the LMS algorithm builds upon the theory of the Wiener solution for the optimal filter tap weights, Wo. It also depends on the steepest-descent algorithm, which is a formula that updates the filter coefficients using the current tap-weight vector and the current gradient of the cost function ξ(n) = E[e²(n)] with respect to the filter tap-weight coefficient vector:

w(n+1) = w(n) - μ ∇ξ(n)

As the negative gradient vector points in the direction of steepest descent for the N-dimensional quadratic cost function, each recursion shifts the value of the filter coefficients closer toward their optimum value, which corresponds to the minimum achievable value of the cost function ξ(n). The LMS algorithm is a stochastic (random-process) implementation of the steepest-descent algorithm: here the expectation in the cost function is not known, so the instantaneous squared error e²(n) is used as an estimate of it. The steepest-descent recursion then becomes

w(n+1) = w(n) - μ ∇[e²(n)]

The gradient of this instantaneous cost function can alternatively be expressed in the following form:

∇[e²(n)] = 2 e(n) ∇e(n) = -2 e(n) x(n),   since   e(n) = d(n) - wᵀ(n) x(n).

Substituting this into the steepest-descent recursion above, we arrive at the recursion for the LMS adaptive algorithm:

w(n+1) = w(n) + 2μ e(n) x(n)

IMPLEMENTATION OF THE LMS ALGORITHM:

Each iteration of the LMS algorithm requires three distinct steps, in this order:

1. The output of the FIR filter, y(n), is calculated: y(n) = wᵀ(n) x(n).

2. The value of the estimation error is calculated: e(n) = d(n) - y(n).

3. The tap weights of the FIR filter are updated in preparation for the next iteration: w(n+1) = w(n) + 2μ e(n) x(n).

The main reason for the LMS algorithm's popularity in adaptive filtering is its computational simplicity, which makes it easier to implement than all other commonly used adaptive algorithms. For each iteration, the LMS algorithm requires 2N additions and 2N+1 multiplications (N for calculating the output y(n), one for forming 2μe(n), and an additional N for the scalar-by-vector multiplication).
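For example, an adaptive filter with N = 100 taps therefore needs 200 additions and 201 multiplications per iteration.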

SIMULATION RESULTS: In this section, we present simulation results to evaluate the performance of the proposed algorithm using Matlab. In this simulation, the input signal for both algorithms has the form y(t) = x(t) + n(t), where n(t) is white Gaussian noise (SNR = 1 dB) and x(t) is the original signal, assumed to be a finite-duration LFM signal of the form

x(t) = cos(ωo t + (β/2) t²),   0 ≤ t ≤ T   (9)

where ωo = 2π fo is a constant (initial angular frequency), fo is taken here as 100 Hz, T is the signal duration, and β is the modulation index, which determines the bandwidth of the LFM signal. Fig. (1) shows the spectrum of the LFM narrowband signal that will be used in the simulation, where fm is its mean frequency. The bandwidth BW of this LFM signal can be adjusted by varying the parameter β; increasing β will increase the signal bandwidth, as can be

numerically shown using the relationships

fm = ∫ f |X(f)|² df / ∫ |X(f)|² df   (10)

BW = [ ∫ (f - fm)² |X(f)|² df / ∫ |X(f)|² df ]^(1/2)   (11)

where X(f) is the Fourier transform of x(t). The mean-squared error (MSE) for each convergence parameter is calculated as follows:

MSE = E[ (x(n) - x̂(n))² ]   (12)

where x(n) is the original signal and x̂(n) is the filter output, which represents an estimate of the input signal.
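In MATLAB, eq. (12) can be estimated by a time average of the squared estimation error; a minimal sketch follows (the signals here are stand-ins, not the report's simulation data):

x    = cos(2*pi*100*(0:999)*0.001);   % stand-in for the original signal x(n)
xhat = x + 0.05*randn(size(x));       % stand-in for the filter output, i.e. the estimate of x(n)
mse  = mean((x - xhat).^2);           % eq. (12): mean-squared estimation error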

Fig. (2) shows the MSE for different LFM signals using the conventional and the time-varying LMS algorithms. The performance of the conventional LMS algorithm varies depending on the LFM signal bandwidth. However, the optimal value of μ in the LFM case is still located in the lower region, within the range 0.0001 to 0.0005. As a result, if the algorithm is to be used with LFM signals having the same centre frequency (but possibly different bandwidths), we should choose μo from this range for the time-varying LMS algorithm.

Fig. 2. MSE for conventional LMS algorithm (filter order = 100, Ts = 0.001, SNR = 1 dB, fo = 100 Hz).

Fig. 3. The effect of parameter C on MSE for time-varying LMS algorithm (filter order = 100, Ts = 0.001, SNR = 1 dB, fo = 100 Hz, LFM bandwidth = 50 Hz).

Fig. (3) shows the relationship between the parameter C and the MSE, using a 100th-order filter, for the time-varying LMS algorithm applied to noise reduction in an LFM narrowband signal of 50 Hz bandwidth. Note that Fig. (3) can be divided into two regions: a small-μo region (the optimal region, ranging from 0.0002 to 0.0006) and a large-μo region (larger than 0.006). In the optimal region, the time-varying LMS algorithm always provides an MSE close to the optimal value achieved by the conventional LMS algorithm. The parameter C affects not only the MSE but also the convergence time, as shown in Fig. (5), Fig. (9) and Fig. (10). Fig. (3) also shows that the time-varying LMS algorithm works better when the convergence parameter μo is smaller than the optimal μ for the conventional LMS algorithm.

Fig. (4) shows the performance of the conventional LMS algorithm and the time-varying LMS algorithm for an LFM narrowband signal of 50 Hz bandwidth and for a single-tone signal at 125 Hz (close to the mean frequency of the LFM signal). Fig. (4) shows that the optimal μ for the two signals is different: the optimal μ for the single-tone sinusoid is in the range 0.15×10⁻³ to 0.2×10⁻³, while for the LFM narrowband signal of 50 Hz bandwidth it is around 0.4×10⁻³.

Fig. (5) shows the estimation curve when the time-varying LMS algorithm is used for

noise reduction in narrowband FM signals. The curve in Fig. (5) is the estimated output error e(n) from equation (5). In general, the time-varying LMS algorithm provides faster convergence than the conventional LMS algorithm (C = 1). Fig. (5) also shows the effect of the parameter C as a convergence-controlling factor: a larger C provides faster convergence. Fig. (6), Fig. (7), and Fig. (8) show the mean-squared error versus the number of samples N for LFM signals with different bandwidths and different values of C. These figures show that the time-varying LMS algorithm provides a better MSE, and that the improvement is larger for signals of larger bandwidth. Fig. (9) and Fig. (10) show the convergence of the time-varying LMS algorithm to a limit of MSE = 0.05 for an LFM signal with a bandwidth of 50 Hz and to MSE = 0.08 for another LFM signal with a bandwidth of 100 Hz. Fig. (9) and Fig. (10) also show that the time-varying LMS algorithm provides faster convergence for larger C.

Fig. 4. MSE performance comparison for a narrowband LFM signal of 50 Hz bandwidth and a single-tone signal at 125 Hz, for both the conventional LMS and the time-varying LMS algorithms (filter order = 100, Ts = 0.001, SNR = 1 dB).

Fig. 5. The effect of parameter C on the estimation-error curve using the time-varying LMS for noise reduction in narrowband signals (filter order = 10, μo = 0.001, SNR = 2 dB, fo = 100 Hz, and LFM bandwidth = 50 Hz).

Fig. 6. MSE vs number of samples for different C values (filter order = 100, μo = 0.0002, SNR = 1 dB, single-tone at 125 Hz, a = 0.01, and b = 0.7).

Fig. 7. MSE vs number of samples for different C values (filter order = 100, μo = 0.0002, SNR = 1 dB, fo = 100 Hz, LFM bandwidth = 50 Hz, a = 0.01, and b = 0.7).

CONCLUSIONS: A new structure for the LMS algorithm with a decaying, time-varying law for the convergence parameter has been proposed. In a stationary white Gaussian noise environment, simulations show that the time-varying LMS algorithm provides faster convergence than the conventional LMS algorithm and a smaller mean-squared error (MSE) which is close to the optimal value. The algorithm is based on selecting the optimal value of the convergence parameter using a single tone sinusoid with a frequency that equals the centre frequency of expected LFM signals, assuming they are narrowband. The best decay controlling factor is bandwidth-dependent. Further study of different decay laws is needed to extend the algorithm to deal with non-linear FM signals.
