

Seminar report

Contraction Mapping: An Important Property In Adaptive Filters


Srinivas Doddi (132070001)


1) Abstract
2) Introduction
3) Adaptive Filters
   A. Transversal Filter
   B. Configurations
4) Contraction Mapping
5) Theorem
   A. Proof
   B. Example
   C. LMS Algorithm
   D. Convergence in LMS
6) Conclusion
7) References

1. ABSTRACT

In this paper we show that many adaptive filters used for system identification are contraction mappings. Applying deterministic methods, we give conditions under which algorithms such as Least Mean Square, Normalized Least Mean Square, Modified Least Mean Square with Delayed Update, Modified Filtered-X Least Mean Square, Affine Projection, and Recursive Least Squares are contraction mappings.

2. INTRODUCTION

Digital signal processing (DSP) has been a major player in current technical advancements such as noise filtering, system identification, and voice prediction. Standard DSP techniques, however, are not always enough to solve these problems quickly and obtain acceptable results. Adaptive filtering techniques must be implemented to promote accurate solutions and a timely convergence to those solutions. An adaptive filter is a system with a linear filter whose transfer function is controlled by variable parameters, together with a means to adjust those parameters according to an optimization algorithm. The closed-loop adaptive filter uses feedback in the form of an error signal to refine its transfer function.

3. Adaptive Filter

Discrete-time (or digital) filters are ubiquitous in today's signal processing applications. Filters are used to achieve desired spectral characteristics of a signal, to reject unwanted signals such as noise or interferers, to reduce the bit rate in signal transmission, etc. The notion of making filters adaptive, i.e., of altering the parameters (coefficients) of a filter according to some algorithm, tackles the problem that we might not know in advance, e.g., the characteristics of the signal, of the unwanted signal, or of a system's influence on the signal that we would like to compensate. Adaptive filters can adjust to an unknown environment, and even track signal or system characteristics.

A. Adaptive Transversal Filters

In a transversal filter of length N, as depicted in fig. 1, at each time n the output sample y[n] is computed as a weighted sum of the current and delayed input samples x[n], x[n-1], ..., x[n-N+1]:

$$y[n] = \sum_{k=0}^{N-1} c_k^*[n]\, x[n-k]$$

Here, the $c_k[n]$ are time-dependent filter coefficients (we use the complex-conjugated coefficients $c_k^*[n]$ so that the derivation of the adaptation algorithm is valid for complex signals, too). This equation can be rewritten in vector form using

$\mathbf{x}[n] = [x[n], x[n-1], \ldots, x[n-N+1]]^T$, the tap-input vector at time n, and $\mathbf{c}[n] = [c_0[n], c_1[n], \ldots, c_{N-1}[n]]^T$, the coefficient vector at time n, as $y[n] = \mathbf{c}^H[n]\,\mathbf{x}[n]$. Both $\mathbf{x}[n]$ and $\mathbf{c}[n]$ are column vectors of length N, and $\mathbf{c}^H[n] = (\mathbf{c}^*)^T[n]$ is the Hermitian of the vector $\mathbf{c}[n]$ (each element is conjugated, and the column vector is transposed into a row vector).

In the special case of the coefficients c[n] not depending on time n: c[n] = c the transversal filter structure is an FIR filter of length N. Here, we will, however, focus on the case that the filter coefficients are variable, and are adapted by an adaptation algorithm.
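The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the function name `transversal_output`, the sample values, and the choice to treat samples before time 0 as zero are assumptions made for the example, not part of the report.

```python
def transversal_output(c, x, n):
    """Return y[n] = sum_k conj(c[k]) * x[n-k], i.e. c^H x[n].

    c : list of N filter coefficients (real or complex)
    x : list of input samples; samples before index 0 are taken as zero
    n : time index
    """
    N = len(c)
    # Tap-input vector x[n] = [x[n], x[n-1], ..., x[n-N+1]]^T
    taps = [x[n - k] if n - k >= 0 else 0.0 for k in range(N)]
    # Conjugate each coefficient so the same code is valid for complex signals
    return sum(complex(ck).conjugate() * xk for ck, xk in zip(c, taps))


# Illustrative values (assumed): time-invariant coefficients, i.e. an ordinary FIR filter
x = [1.0, 2.0, 3.0, 4.0]
c = [0.5, 0.25, 0.25]
y = [transversal_output(c, x, n) for n in range(len(x))]
```

With coefficients that do not depend on n, as here, the result coincides with the first samples of the ordinary convolution of x with c. An adaptive filter would instead pass a different coefficient vector at each time step.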

Adaptive Filtering System Configurations

There are four major types of adaptive filtering configurations: adaptive system identification, adaptive noise cancellation, adaptive linear prediction, and adaptive inverse system. All of these systems are similar in the implementation of the algorithm, but different in system configuration. All four systems have the same general parts: an input x(n), a desired result d(n), an output y(n), an adaptive transfer function w(n), and an error signal e(n), which is the difference between the desired output d(n) and the actual output y(n). In addition to these parts, the system identification and the inverse system configurations have an unknown linear system u(n) that receives an input and gives a linear output in response.

Adaptive System Identification Configuration

The adaptive system identification configuration is primarily responsible for determining a discrete estimate of the transfer function of an unknown digital or analog system. The same input x(n) is applied to both the adaptive filter and the unknown system, and their outputs are compared. The output of the unknown system serves as the desired signal d(n), and the output of the adaptive filter y(n) is subtracted from it. The resulting difference is the error signal e(n), which is used to adjust the filter coefficients of the adaptive system so that the error trends towards zero.

Adaptive Noise Cancellation Configuration

The second configuration is the adaptive noise cancellation configuration, as shown in figure 2. In this configuration the input x(n), a noise source N1(n), is compared with a desired signal d(n), which consists of a signal s(n) corrupted by another noise N0(n). The adaptive filter coefficients adapt to cause the error signal to be a noiseless version of the signal s(n).

Adaptive Linear Prediction Configuration

Adaptive linear prediction is the third type of adaptive configuration (see figure 3). This configuration essentially performs two operations. The first operation, if the output is taken from the error signal e(n), is linear prediction. The adaptive filter coefficients are being trained to predict, from the statistics of the input signal x(n), what the next input signal will be. The second operation, if the output is taken from y(n), is a noise filter similar to the adaptive noise cancellation outlined in the previous section.

As in the previous section, neither the linear prediction output nor the noise cancellation output will converge to an error of zero. This is true for the linear prediction output because if the error signal did converge to zero, this would mean that the input signal x(n) is entirely deterministic, in which case we would not need to transmit any information at all.

Adaptive Inverse System Configuration

The goal of the adaptive filter here is to model the inverse of the unknown system u(n). The filter works as follows. The input x(n) is sent through the unknown system u(n) and then through the adaptive filter, resulting in an output y(n). The input is also sent through a delay to obtain d(n). As the error signal converges to zero, the adaptive filter coefficients w(n) converge to the inverse of the unknown system u(n).

Contraction Mapping Theorem

Here we prove a very important fixed point theorem.

Definition 1. Let $f : X \to X$ be a map of a metric space to itself. A point $a \in X$ is called a fixed point of $f$ if $f(a) = a$.

Recall that a metric space $(X, d)$ is said to be complete if every Cauchy sequence in $X$ converges to a point of $X$.

Definition 2. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. A map $f : X \to Y$ is called a contraction if there exists a positive number $c < 1$ such that $d_Y(f(x), f(y)) \le c\, d_X(x, y)$ for all $x, y \in X$.

Theorem 1 (Contraction mapping theorem). Let $(X, d)$ be a complete metric space. If $f : X \to X$ is a contraction, then $f$ has a unique fixed point.

Proof. By definition of contraction, there exists a number $c \in (0, 1)$ such that

$$d(f(x), f(y)) \le c\, d(x, y) \quad \text{for all } x, y \in X. \tag{1}$$

Let $a_0 \in X$ be an arbitrary point, and define a sequence $\{a_n\}$ inductively by setting $a_{n+1} = f(a_n)$. We claim that $\{a_n\}$ is Cauchy. To see this, first note that for any $n \ge 1$ we have by (1) that

$$d(a_{n+1}, a_n) = d(f(a_n), f(a_{n-1})) \le c\, d(a_n, a_{n-1}),$$

and so we can check easily by induction that

$$d(a_{n+1}, a_n) \le c^n\, d(a_1, a_0) \tag{2}$$

for all $n \ge 1$. This and the triangle inequality then give, for $m > n \ge 1$,

$$d(a_m, a_n) \le d(a_m, a_{m-1}) + d(a_{m-1}, a_{m-2}) + \ldots + d(a_{n+1}, a_n) \le (c^{m-1} + c^{m-2} + \ldots + c^n)\, d(a_1, a_0) \le \frac{c^n}{1 - c}\, d(a_1, a_0). \tag{3}$$

This shows that $d(a_m, a_n) \to 0$ as $n, m \to \infty$, i.e. that $\{a_n\}$ is Cauchy as claimed. Since $(X, d)$ is complete, there exists $a \in X$ such that $a_n \to a$. Being a contraction, $f$ is continuous, and hence

$$f(a) = \lim_{n \to \infty} f(a_n) = \lim_{n \to \infty} a_{n+1} = a.$$

Thus $a$ is a fixed point of $f$. If $b \in X$ is also a fixed point of $f$, then $d(a, b) = d(f(a), f(b)) \le c\, d(a, b)$, which implies, since $c < 1$, that $d(a, b) = 0$ and hence that $a = b$. Thus the fixed point is unique.
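The construction used in the proof can be tried numerically. The sketch below is only an illustration under assumed values: the map f(x) = x/2 + 1 on the real line (a contraction with c = 0.5 and fixed point 2) and the helper name `iterate` are chosen for the example and do not come from the report. It also checks the error bound obtained from (3) by letting m grow without bound.

```python
def iterate(f, a0, steps):
    """Return [a_0, a_1, ..., a_steps] with a_{n+1} = f(a_n)."""
    seq = [a0]
    for _ in range(steps):
        seq.append(f(seq[-1]))
    return seq


# Illustrative map (an assumption): |f(x) - f(y)| = |x - y| / 2, so c = 0.5,
# and the unique fixed point solves a = a/2 + 1, i.e. a = 2.
def f(x):
    return 0.5 * x + 1.0


c = 0.5
seq = iterate(f, a0=10.0, steps=30)
a = seq[-1]  # approximate fixed point after 30 iterations

# Error bound implied by (3): d(a_n, a) <= c^n / (1 - c) * d(a_1, a_0)
d10 = abs(seq[1] - seq[0])
bound_ok = all(
    abs(seq[n] - 2.0) <= c**n / (1 - c) * d10 + 1e-12 for n in range(len(seq))
)
```

Starting the iteration from a different point, for example `iterate(f, -50.0, 60)`, ends near the same value, which is the uniqueness statement of the theorem seen numerically.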

Least Mean Squares Gradient Approximation Method

Given an adaptive filter with an input x(n), an impulse response w(n), and an output y(n), the output of the system is given by

$$y(n) = \mathbf{w}^T(n)\,\mathbf{x}(n), \qquad \mathbf{x}(n) = [x(n), x(n-1), x(n-2), \ldots, x(n-(N-1))]^T,$$

where $\mathbf{w}(n) = [w_0(n), w_1(n), w_2(n), \ldots, w_{N-1}(n)]^T$ are the time-domain coefficients of an FIR filter with N taps. Note that in the above equation and throughout, a boldface letter represents a vector and the superscript T represents the transpose of a real-valued vector or matrix. Using an estimate of the ideal cost function, the following equation can be derived:

$$\mathbf{w}(n+1) = \mathbf{w}(n) - \mu \nabla E[e^2](n).$$

In the above equation, $\mathbf{w}(n+1)$ represents the new coefficient values for the next time interval, $\mu$ is a scaling factor, and $\nabla E[e^2](n)$ is the gradient of the ideal cost function with respect to the vector $\mathbf{w}(n)$. Replacing the expectation by its instantaneous estimate $e^2(n)$, whose gradient with respect to $\mathbf{w}(n)$ is $-2e(n)\mathbf{x}(n)$, gives the update

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, e(n)\,\mathbf{x}(n),$$

where $e(n) = d(n) - y(n)$ and $y(n) = \mathbf{x}^T(n)\,\mathbf{w}(n)$. In the above equation $\mu$ is sometimes multiplied by 2, but here we assume that this factor is absorbed into $\mu$.

In summary, in the Least Mean Squares Gradient Approximation Method, a stochastic approximation of the Method of Steepest Descent, a guess based on the current filter coefficients is made, and the gradient vector, the derivative of the MSE with respect to the filter coefficients, is calculated from the guess. Then a second guess is made at the tap-weight vector by making a change in the present guess in a direction opposite to the gradient vector. This process is repeated until the derivative of the MSE is zero.
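The update loop can be sketched for the system identification configuration as follows. This is a minimal illustration, not a reference implementation: the function name `lms_identify`, the three-tap "unknown" system `h`, the step size, and the noise-free white input are all assumptions made for the example. The update is written as w(n+1) = w(n) + mu e(n) x(n), the sign that follows from e(n) = d(n) - y(n).

```python
import random


def lms_identify(x, d, num_taps, mu):
    """Run LMS: w(n+1) = w(n) + mu * e(n) * x(n), with e(n) = d(n) - y(n)."""
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Tap-input vector x(n) = [x(n), x(n-1), ..., x(n-(N-1))]^T (zeros before start)
        xn = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, xn))  # y(n) = w^T(n) x(n)
        e = d[n] - y                               # e(n) = d(n) - y(n)
        w = [wk + mu * e * xk for wk, xk in zip(w, xn)]
        errors.append(e)
    return w, errors


# Hypothetical unknown system (illustrative values only) driven by white noise
random.seed(0)
h = [0.8, -0.4, 0.2]
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0) for n in range(len(x))]

w, errors = lms_identify(x, d, num_taps=3, mu=0.05)
```

In this noise-free setting the adapted coefficients `w` should approach `h` and the error should shrink towards zero. With measurement noise added to `d`, the coefficients would instead fluctuate around `h`.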

Convergence of the LMS Adaptive Filter

The convergence characteristics of the LMS adaptive filter are related to the autocorrelation of the input process, as defined by

$$R_x = E[\mathbf{x}(n)\,\mathbf{x}^T(n)].$$

There are two conditions that must be satisfied in order for the system to converge. First, the autocorrelation matrix $R_x$ must be positive definite. Second, the step size must satisfy $0 < \mu < 1/\lambda_{\max}$, where $\lambda_{\max}$ is the largest eigenvalue of $R_x$.

In addition, the rate of convergence is related to the eigenvalue spread. This is defined using the condition number of $R_x$, the ratio $\lambda_{\max}/\lambda_{\min}$, where $\lambda_{\min}$ is the smallest eigenvalue of $R_x$. The fastest convergence of this system occurs when this ratio equals 1, corresponding to white noise. This means that the fastest way to train an LMS adaptive system is to use white noise as the training input. As the noise becomes more and more colored, the speed of the training will decrease.
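The link between these conditions and the contraction property can be illustrated numerically. The sketch below relies on a standard result that this report does not derive, so treat it as an assumption: under the usual independence analysis, the mean weight error obeys v(n+1) = (I - mu R_x) v(n), and this linear map is a contraction in the Euclidean norm with constant c = max_i |1 - mu lambda_i|. The two-tap autocorrelation matrix with lag-1 correlation r and all helper names are hypothetical.

```python
import math


def eig_sym_2x2(a, b, d):
    """Eigenvalues (min, max) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def contraction_factor(mu, lam_min, lam_max):
    """Spectral norm of I - mu*R for symmetric R: max_i |1 - mu*lambda_i|."""
    return max(abs(1 - mu * lam_min), abs(1 - mu * lam_max))


# Two-tap example with unit input power and lag-1 correlation r (assumed values):
# r = 0.0 models white input, r = 0.9 models strongly coloured input.
results = {}
for r in (0.0, 0.9):
    lam_min, lam_max = eig_sym_2x2(1.0, r, 1.0)
    mu = 0.5 / lam_max  # inside the range 0 < mu < 1/lambda_max
    spread = lam_max / lam_min
    results[r] = (spread, contraction_factor(mu, lam_min, lam_max))
```

For the white input the eigenvalue spread is 1 and the contraction constant is 0.5. For r = 0.9 the spread is about 19 and the constant is about 0.974, so each step shrinks the mean error far less, consistent with the slower training on colored input described above.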

Conclusion

We have studied the concept of contraction mapping in the context of adaptive filters, and examined as a case study how the sequence of filter coefficients converges, using the LMS algorithm as the example.

References

[1] Simon Haykin: Adaptive Filter Theory, Third Edition, Prentice-Hall, Inc., Upper Saddle River, NJ, 1996.

[2] Bernard Widrow and Samuel D. Stearns: Adaptive Signal Processing, Prentice-Hall, Inc., Upper Saddle River, NJ, 1985.

[3] Edward A. Lee and David G. Messerschmitt: Digital Communication, Kluwer Academic Publishers, Boston, 1988.

[4] Steven M. Kay: Fundamentals of Statistical Signal Processing: Detection Theory, Volume 2, Prentice-Hall, Inc., 1998.