CHAPTER-1

INTRODUCTION:
1.1 BACKGROUND:
Out of the many applications of adaptive filtering, direct modeling and inverse modeling are very
important. Direct modeling, or system identification, finds applications in control system
engineering including robotics [1], intelligent sensor design [2], process control [3], power
system engineering [4], image and speech processing [4], geophysics [5], acoustic noise and
vibration control [6] and biomedical engineering [7]. Similarly, the inverse modeling technique is
used in digital data reconstruction [8], channel equalization in digital communication [9], digital
magnetic data recording [10], intelligent sensors [2] and deconvolution of seismic data [11].
Direct modeling mainly refers to adaptive identification of unknown plants. Simple static linear
plants are easily identified through parameter estimation using conventional derivative based
least mean square (LMS) type algorithms [12]. But most practical plants are dynamic,
nonlinear, or a combination of these two characteristics. In many applications Hammerstein and
MIMO plants need identification. In addition, the output of the plant is associated with
measurement or additive white Gaussian noise (AWGN). Identification of such complex plants
is a difficult task and poses many challenging problems. Similarly, inverse modeling of
telecommunication and magnetic-medium channels is important for reducing the effect of
inter-symbol interference (ISI) and achieving faithful reconstruction of the original data. Adaptive
inverse modeling of sensors is also required to extend their linearity for direct digital
readout and enhancement of dynamic range. These two important and complex issues are
addressed in the thesis, and attempts have been made to provide improved, efficient and
alternate promising solutions.
The conventional LMS and recursive least square (RLS) [13] techniques work well for
identification of static plants, but when the plants are of dynamic type, the existing forward-
backward LMS [14] and RLS algorithms very often lead to non-optimal solutions due to
premature convergence of weights to local minima [15]. This is a major drawback of the
existing derivative based techniques. To alleviate this issue, this thesis suggests the use
of derivative free optimization techniques in place of the conventional ones.
In the recent past, population based optimization techniques have been reported which fall under
the category of evolutionary computing [16] or computational intelligence [17]. These are also
called bio-inspired techniques and include the genetic algorithm (GA) and its variants [18] and
differential evolution (DE) [19]. These techniques are suitably employed to obtain efficient
iterative learning algorithms for developing adaptive direct and inverse models of complex
plants and channels.
Development of direct and inverse adaptive models essentially consists of two components. The
first component is an adaptive network which may be linear or nonlinear in nature. Use of a
nonlinear network is preferable when nonlinear plants or channels are to be identified or
equalized. The linear network used in the thesis is the adaptive linear combiner, an all-zero or
FIR structure [7]. The second component is the learning algorithm; under the derivative free
category, GA and DE are used to update the model parameters.
1.2 MOTIVATION
In summary, the main motivations of the research work carried out in the present thesis are the
following:
i. To formulate the direct and inverse modeling problems as squared-error optimization
problems.
ii. To introduce bio-inspired optimization tools such as GA and DE and their variants to
efficiently minimize the squared error cost function of the models; in other words, to
develop alternate identification schemes.
iii. To achieve improved identification (direct modeling) of complex nonlinear plants and
improved channel equalization (inverse modeling) of nonlinear noisy digital channels by
introducing new and improved updating algorithms.
1.3 MAJOR CONTRIBUTION OF THE THESIS
The major contributions of the thesis are outlined below.
i. A GA based approach for both linear and nonlinear system identification is
introduced. The GA based approach is found to be more efficient for nonlinear systems
than standard derivative based learning. In addition, DE based identification
has been proposed and shown to have better performance and involve less
computational complexity.
ii. GA based approaches for linear and nonlinear channel equalization are introduced.
The GA based approach is found to be more efficient than standard derivative
based learning. In addition, DE based equalizers have been proposed and shown to have
better performance and involve less computational complexity.
1.4 CHAPTER WISE CONTRIBUTION
The research work undertaken is embodied in 7 chapters.
• Chapter-1 gives an introduction to system identification and channel equalization and reviews
various learning algorithms, such as the least-mean-square (LMS) algorithm, the recursive-
least-square (RLS) algorithm, artificial neural networks (ANN), the genetic algorithm (GA)
and differential evolution (DE), used to identify the system and train the equalizer. It also
includes the motivation behind undertaking the thesis work.
• Chapter-2 discusses the general form of adaptive algorithms, the adaptive filtering
problem, derivative based algorithms such as LMS, and gives an overview of derivative free
algorithms such as the genetic algorithm and differential evolution.
• Chapter-3 discusses various system identification techniques, develops the GA algorithm
for simulation of system identification and presents a comparative study between LMS and
GA on both linear and nonlinear systems.
• Chapter-4 discusses various channel equalization techniques, develops the GA algorithm
for simulation of channel equalization and presents a comparison between LMS and GA on
both linear and nonlinear channels.
• Chapter-5 develops the DE algorithm for simulation of system identification and presents
a comparison between LMS, GA and DE on both linear and nonlinear systems.
• Chapter-6 develops the DE algorithm for simulation of channel equalization and presents a
comparison between LMS, GA and DE based equalizers on both linear and nonlinear channels.
• Chapter-7 deals with the conclusions of the investigations made in the thesis. This chapter
also suggests some future research related to the topic.

CHAPTER- 2
GENETIC ALGORITHM AND
DIFFERENTIAL EVOLUTION
2.1 INTRODUCTION:
There are many learning algorithms which are employed to train various adaptive models. The
performance of these models depends on the rate of convergence, training time, computational
complexity involved and the minimum mean square error achieved after training. The learning
algorithms may be broadly classified into two categories: (a) derivative based and (b) derivative free.
The derivative based algorithms include least mean square (LMS), IIR LMS (ILMS), back
propagation (BP) and FLANN-LMS. Under the derivative free category, the genetic algorithm
(GA), differential evolution (DE), particle swarm optimization (PSO), bacterial foraging
optimization (BFO) and artificial immune systems (AIS) have been employed. In this section the
details of the LMS, GA and DE algorithms are outlined in sequel.
2.2 GRADIENT BASED ADAPTIVE ALGORITHM:
An adaptive algorithm is a procedure for adjusting the parameters of an adaptive filter to
minimize a cost function chosen for the task at hand. In this section, we describe the general
form of many adaptive FIR filtering algorithms and present a simple derivation of the LMS
adaptive algorithm. In our discussion, we only consider an adaptive FIR filter structure. Such
systems are currently more popular than adaptive IIR filters because
1. The input-output stability of the FIR filter structure is guaranteed for any set of fixed
coefficients, and
2. The algorithms for adjusting the coefficients of FIR filters are simpler in general than
those for adjusting the coefficients of IIR filters.
2.2.1 GENERAL FORM OF ADAPTIVE FIR ALGORITHMS:
The general form of an adaptive FIR filtering algorithm is

W(n+1)=W(n) + µ(n)G(e(n),X(n),Φ(n)) (2.1)
where G(·) is a particular vector-valued nonlinear function, µ(n) is a step size parameter, e(n)
and X(n) are the error signal and input signal vector, respectively, and Φ(n) is a vector of states
that store pertinent information about the characteristics of the input and error signals and/or the
coefficients at previous time instants. In the simplest algorithms, Φ(n) is not used, and the only
information needed to adjust the coefficients at time n is the error signal, input signal vector,
and step size.
The step size is so called because it determines the magnitude of the change or "step" that is
taken by the algorithm in iteratively determining a useful coefficient vector. Much research
effort has been spent characterizing the role that µ(n) plays in the performance of adaptive
filters in terms of the statistical or frequency characteristics of the input and desired response
signals. Often, success or failure of an adaptive filtering application depends on how the value
of µ(n) is chosen or calculated to obtain the best performance from the adaptive filter.
2.2.2 THE MEAN-SQUARED ERROR COST FUNCTION:
The form of G(·) depends on the cost function chosen for the given adaptive filtering task. We
now consider one particular cost function that yields a popular adaptive algorithm. Define the
mean-squared error (MSE) cost function as

J_mse(n) = (1/2) ∫_{−∞}^{+∞} e²(n) p_n(e(n)) de(n)        (2.2)

         = (1/2) E[e²(n)]        (2.3)
where p_n(e) represents the probability density function of the error at time n and E[·] is
shorthand for the expectation integral on the right-hand side of (2.3).
The MSE cost function is useful for adaptive FIR filters because
• J_mse(n) has a well-defined minimum with respect to the parameters in W(n);
• the coefficient values obtained at this minimum are the ones that minimize the power in
the error signal e(n), indicating that y(n) has approached d(n); and
• J_mse(n) is a smooth function of each of the parameters in W(n), such that it is differentiable
with respect to each of the parameters in W(n).
The third point is important in that it enables us to determine both the optimum coefficient
values given knowledge of the statistics of d(n) and x(n), as well as a simple iterative procedure
for adjusting the parameters of an FIR filter.
2.2.3 THE WIENER SOLUTION:
For the FIR filter structure, the coefficient values in W(n) that minimize J_mse(n) are well-
defined if the statistics of the input and desired response signals are known. The formulation of
this problem for continuous-time signals and the resulting solution was first derived by Wiener.
Hence, this optimum coefficient vector W_MSE(n) is often called the Wiener solution to the
adaptive filtering problem. The extension of Wiener's analysis to the discrete-time case is
attributed to Levinson. To determine W_MSE(n) we note that the function J_mse(n) in (2.3)
is quadratic in the parameters {w_i(n)}, and the function is also differentiable. Thus, we can use
a result from optimization theory that states that the derivatives of a smooth cost function with
respect to each of the parameters are zero at a minimizing point on the cost function error
surface. Thus, W_MSE(n) can be found from the solution to the system of equations

∂J_mse(n)/∂w_i(n) = 0,    0 ≤ i ≤ L        (2.4)
Taking the derivative of J_mse(n) in (2.3) and noting that e(n) = d(n) − y(n) and

y(n) = W^T(n) X(n) = Σ_{i=0}^{L−1} w_i(n) x(n − i)

respectively, we obtain
∂J_mse(n)/∂w_i(n) = E[ e(n) ∂e(n)/∂w_i(n) ]        (2.5)

                  = −E[ e(n) ∂y(n)/∂w_i(n) ]        (2.6)

                  = −E[ e(n) x(n − i) ]        (2.7)

                  = −( E[d(n) x(n − i)] − Σ_{j=0}^{L−1} E[x(n − i) x(n − j)] w_j(n) )        (2.8)
where we have used the definitions of e(n) and of y(n) for the FIR filter structure to expand
the last result in (2.8).
By defining the matrix R_XX(n) and vector P_dx(n) as

R_XX(n) = E[ X(n) X^T(n) ]        (2.9)

P_dx(n) = E[ d(n) X(n) ]        (2.10)
respectively, we can combine the above equations to obtain the system of equations in vector
form as
R_XX(n) W_MSE(n) − P_dx(n) = 0        (2.11)

where 0 is the zero vector. Thus, so long as the matrix R_XX(n) is invertible, the optimum
Wiener solution vector for this problem is
W_MSE(n) = R_XX^{−1}(n) P_dx(n)        (2.12)
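To make (2.12) concrete, the following minimal Python sketch (an illustration added here, not part of the thesis experiments) estimates R_XX and P_dx by time averages over a finite data record and then solves for the Wiener coefficients; the filter length, record length and noise level are assumed values.

import numpy as np

def wiener_solution(x, d, L):
    """Estimate W_MSE = R_XX^{-1} P_dx (eq. 2.12) from a finite data record.

    x : input samples, d : desired response samples, L : FIR filter length.
    The expectations are replaced by time averages over the record.
    """
    N = len(x)
    # rows are input vectors X(n) = [x(n), x(n-1), ..., x(n-L+1)]
    X = np.array([x[n - L + 1:n + 1][::-1] for n in range(L - 1, N)])
    D = np.asarray(d)[L - 1:N]
    R = X.T @ X / len(D)          # sample estimate of R_XX = E[X(n) X^T(n)]
    P = X.T @ D / len(D)          # sample estimate of P_dx = E[d(n) X(n)]
    return np.linalg.solve(R, P)  # solves R W = P

# Illustration on a 3-tap plant (taps borrowed from Experiment-1 of Chapter 3)
rng = np.random.default_rng(0)
x = rng.uniform(-np.sqrt(3), np.sqrt(3), 5000)     # zero-mean, unit-variance input
h = np.array([0.2090, 0.9950, 0.2090])
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
print(wiener_solution(x, d, 3))                    # close to [0.209, 0.995, 0.209]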
2.2.4 THE METHOD OF STEEPEST DESCENT:
The method of steepest descent is a celebrated optimization procedure for minimizing the value
of a cost function J(n) with respect to a set of adjustable parameters W(n). This procedure
adjusts each parameter of the system according to

w_i(n + 1) = w_i(n) − µ(n) ∂J(n)/∂w_i(n)        (2.13)
In other words, the i-th parameter of the system is altered according to the derivative of the cost
function with respect to the i-th parameter. Collecting these equations in vector form, we have

W(n + 1) = W(n) − µ(n) ∂J(n)/∂W(n)        (2.14)
where ∂J(n)/∂W(n) is the vector form of ∂J(n)/∂w_i(n).
For an FIR adaptive filter that minimizes the MSE cost function, we can use the result in (2.8)
to give explicitly the form of the steepest descent procedure for this problem. Substituting these
results into (2.14) yields the update equation for W(n) as

W(n + 1) = W(n) + µ(n) ( P_dx(n) − R_XX(n) W(n) )        (2.15)

However, this steepest descent procedure depends on the statistical quantities E{d(n)x(n − i)}
and E{x(n − i)x(n − j)} contained in P_dx(n) and R_XX(n), respectively. In practice, we only have
measurements of both d(n) and x(n) to be used within the adaptation procedure. While suitable
estimates of the statistical quantities needed for (2.15) could be determined from the signals x(n)
and d(n), we instead develop an approximate version of the method of steepest descent that
depends on the signal values themselves. This procedure is known as the LMS algorithm.
2.2.5 THE LMS ALGORITHM:
The cost function J(n) chosen for the steepest descent algorithm determines the coefficient
solution obtained by the adaptive filter. If the MSE cost function in (2.2) is chosen, the resulting
algorithm depends on the statistics of x(n) and d(n) because of the expectation operation that
defines this cost function. Since we typically only have measurements of d(n) and of x(n)
available to us, we substitute an alternative cost function that depends only on these
measurements.
One such cost function is the least-squares cost function given by

J_LS(n) = Σ_{k=0}^{n} α(k) ( d(k) − W^T(n) X(k) )²        (2.16)
where α(k) is a suitable weighting sequence for the terms within the summation. This cost
function, however, is complicated by the fact that it requires numerous computations to
calculate its value as well as its derivatives with respect to each W(n), although efficient
recursive methods for its minimization can be developed. Alternatively, we can propose the
simplified cost function J_LMS(n) given by

J_LMS(n) = (1/2) e²(n)        (2.17)
This cost function can be thought of as an instantaneous estimate of the MSE cost function, as
J_MSE(n) = E[J_LMS(n)]. Although it might not appear to be useful, the resulting algorithm
obtained when J_LMS(n) is used for J(n) in (2.13) is extremely useful for practical applications.
Taking derivatives of J_LMS(n) with respect to the elements of W(n) and substituting the result
into (2.13), we obtain the LMS adaptive algorithm given by

W(n + 1) = W(n) + µ(n) e(n) X(n)        (2.18)
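As a hedged illustration of the update (2.18), this short Python sketch runs LMS on an input/desired pair; the step size µ = 0.01 is an assumed value chosen for stable convergence, not one prescribed by the text.

import numpy as np

def lms(x, d, L, mu=0.01):
    """LMS adaptive FIR filter: W(n+1) = W(n) + mu * e(n) * X(n), eq. (2.18)."""
    W = np.zeros(L)                      # initial coefficient vector
    e = np.zeros(len(x))
    for n in range(L - 1, len(x)):
        X = x[n - L + 1:n + 1][::-1]     # input vector [x(n), ..., x(n-L+1)]
        y = W @ X                        # filter output y(n)
        e[n] = d[n] - y                  # error signal e(n) = d(n) - y(n)
        W = W + mu * e[n] * X            # coefficient update
    return W, e

# e.g. W, e = lms(x, d, L=3) with the x, d generated in the Wiener sketch above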
Note that this algorithm is of the general form in (2.1). It also requires only multiplications and
additions to implement. In fact, the number and type of operations needed for the LMS
algorithm is nearly the same as that of the FIR filter structure with fixed coefficient values,
which is one of the reasons for the algorithm's popularity. The behavior of the LMS algorithm
has been widely studied, and numerous results concerning its adaptation characteristics under
different situations have been developed. For now, we indicate its useful behavior by noting that
the solution obtained by the LMS algorithm near its convergent point is related to the Wiener
solution. In fact, analyses of the LMS algorithm under certain statistical assumptions about the
input and desired response signals show that

lim_{n→∞} E[W(n)] = W_MSE        (2.19)
where the Wiener solution W_MSE(n) is a fixed vector. Moreover, the average behavior of the
LMS algorithm is quite similar to that of the steepest descent algorithm, in that it depends
explicitly on the statistics of the input and desired response signals. In effect, the iterative nature
of the LMS coefficient updates is a form of time-averaging that smoothes the errors in the
instantaneous gradient calculations to obtain a more reasonable estimate of the true gradient.
The problem is that gradient descent is a local optimization technique, which is limited because
it is unable to converge to the global optimum on a multimodal error surface if the algorithm is
not initialized in the basin of attraction of the global optimum. Several modifications exist for
gradient based algorithms in an attempt to enable them to overcome local optima. One approach is
to simply add noise or a momentum term to the gradient computation of the gradient descent
algorithm to enable it to be more likely to escape from a local minimum. This approach is only
likely to be successful when the error surface is relatively smooth with minor local minima, or
when some information can be inferred about the topology of the surface such that the additional
gradient parameters can be assigned accordingly. Other approaches attempt to transform the
error surface to eliminate or diminish the presence of local minima, which would ideally result
in a unimodal error surface. The problem with these approaches is that the resulting minimum
transformed error used to update the adaptive filter can be biased from the true minimum output
error, and the algorithm may not be able to converge to the desired minimum error condition.
These algorithms also tend to be complex, slow to converge, and may not be guaranteed to
emerge from a local minimum. Some work has been done with regard to removing the bias of
equation error LMS and Steiglitz-McBride adaptive IIR filters, which adds further complexity
with varying degrees of success. Another approach attempts to locate the global optimum by
running several LMS algorithms in parallel, initialized with different initial coefficients. The
notion is that a larger, concurrent sampling of the error surface will increase the likelihood that
one process will be initialized in the global optimum valley. This technique does have potential,
but it is inefficient and may still suffer the fate of a standard gradient technique in that it will be
unable to locate the global optimum if none of the initial estimates is located in the basin of
attraction of the global optimum. By using a similar congregational scheme, but one in which
information is collectively exchanged between estimates and intelligent randomization is
introduced, structured stochastic algorithms are able to hill-climb out of local minima. This
enables the algorithms to achieve better, more consistent results using fewer total estimates.
These types of algorithms provide the framework for the algorithms discussed in the
following sections.
2.3 DERIVATIVE FREE ALGORITHMS:
Since the beginning of the nineteenth century, a significant evolution in optimization theory has
been noticed. Classical linear programming and traditional non-linear optimization techniques
such as Lagrange's multipliers, Bellman's principle and Pontryagin's principle were prevalent
until this century. Unfortunately, these derivative based optimization techniques can no longer
be used to determine the optima on rough non-linear surfaces. One solution to this problem has
already been put forward by the evolutionary algorithms research community. The genetic
algorithm (GA), enunciated by Holland, is one such popular algorithm. This chapter presents a
more recent algorithm for evolutionary optimization known as differential evolution (DE). These
algorithms are inspired by biological and sociological motivations and can take care of
optimality on rough, discontinuous and multimodal surfaces. The chapter explores several
schemes for controlling the convergence behavior of DE by a judicious selection of its
parameters. Special emphasis is given to hybridization of DE algorithms with other soft
computing tools.
2.4 GENETIC ALGORITHM:
Genetic algorithms are a class of evolutionary computing techniques, which is a rapidly
growing area of artificial intelligence. Genetic algorithms are inspired by Darwin's theory of
evolution. Simply said, problems are solved by an evolutionary process resulting in a best
(fittest) solution (survivor); in other words, the solution is evolved. Evolutionary computing
was introduced in the 1960s by Rechenberg in his work "Evolution strategies"
("Evolutionsstrategie" in the original). His idea was then developed by other researchers. Genetic
algorithms (GAs) were invented by John Holland and developed by him and his students and
colleagues. This led to Holland's book "Adaptation in Natural and Artificial Systems",
published in 1975.
The algorithm begins with a set of solutions (represented by chromosomes) called the population.
Solutions from one population are taken and used to form a new population.
This is motivated by the hope that the new population will be better than the old one. Solutions
which are selected to form new solutions (offspring) are selected according to their fitness:
the more suitable they are, the more chances they have to reproduce. This is repeated until some
condition (for example the number of generations or improvement of the best solution) is satisfied.
2.4.1 OUTLINE OF BASIC GA:
1. [Start] Generate a random population of n chromosomes (suitable solutions for the
problem).
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population.
3. [New population] Create a new population by repeating the following steps until the new
population is complete:
a. [Selection] Select two parent chromosomes from the population according to their
fitness (the better the fitness, the bigger the chance to be selected).
b. [Crossover] With a crossover probability, cross over the parents to form new
offspring (children). If no crossover is performed, the offspring is an exact copy of
the parents.
c. [Mutation] With a mutation probability, mutate the new offspring at each locus
(position in the chromosome).
d. [Accepting] Place the new offspring in the new population.
4. [Replace] Use the newly generated population for a further run of the algorithm.
5. [Test] If the end condition is satisfied, stop, and return the best solution in the current
population.
6. [Loop] Go to step 2.
The outline of the basic GA provided above is very general. There are many parameters
and settings that can be implemented differently in various problems. Elitism is often
used as a method of selection, which means that at least one of a generation's best
solutions is copied without changes to the new population, so the best solution can survive
to the succeeding generation.
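The outline above maps directly to code. The following minimal Python sketch (an assumption-laden illustration, not the thesis implementation) uses binary chromosomes, tournament selection of size 2, one-point crossover, bit-flip mutation and single-member elitism; the population size and probabilities are placeholder values.

import random

def genetic_algorithm(fitness, n_bits, pop_size=20, pc=0.9, pm=0.01, generations=100):
    """Minimal binary GA following the outline above (maximizes `fitness`)."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]                            # elitism: keep the best
        while len(new_pop) < pop_size:
            # [Selection] tournament of size 2 for each parent
            p1 = max(random.sample(pop, 2), key=fitness)
            p2 = max(random.sample(pop, 2), key=fitness)
            # [Crossover] one-point crossover with probability pc
            if random.random() < pc:
                cut = random.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # [Mutation] flip each locus with probability pm
            child = [b ^ 1 if random.random() < pm else b for b in child]
            new_pop.append(child)                      # [Accepting]
        pop = new_pop                                  # [Replace]
        best = max(pop, key=fitness)
    return best

# Example: maximize the number of ones in a 16-bit chromosome
print(genetic_algorithm(sum, 16))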
2.4.2 OPERATORS OF GA:
OVERVIEW:
The crossover and mutation are the most important parts of the genetic algorithm. The
performance is influenced mainly by these two operators.
ENCODING OF A CHROMOSOME:
A chromosome should in some way contain information about the solution that it represents. The
most commonly used way of encoding is a binary string. A chromosome could then look like
this:
Table-2.1 (Encoding of a chromosome)

Chromosome 1    1101100100110110
Chromosome 2    1101111000011110

Each chromosome is represented by a binary string. Each bit in the string can represent some
characteristic of the solution. There are many other ways of encoding. The encoding depends
mainly on the problem to be solved. For example, one can encode integer or real numbers
directly; sometimes it is useful to encode permutations, and so on.
CROSSOVER:
Crossover operates on selected genes from parent chromosomes and creates new offspring. The
simplest way to do this is to choose a crossover point at random and copy everything
before this point from the first parent and everything after the crossover point from
the other parent. Crossover is illustrated in the following (| is the crossover point):

Table-2.2 (Crossover of chromosomes)

Chromosome 1    11011 | 00100110110
Chromosome 2    11011 | 11000011110
Chromosome 3    11011 | 11000011110
Chromosome 4    11011 | 00100110110
There are other ways to perform crossover; for example, we can choose more crossover points.
MUTATION:
Mutation is intended to prevent all solutions in the population from falling into a local optimum of
the solved problem. The mutation operation randomly changes the offspring resulting from
crossover. In the case of binary encoding we can switch a few randomly chosen bits from 1 to 0 or
from 0 to 1. Mutation can be illustrated as follows:



Table-2.3 (Mutation operation)

Original offspring 1    1101111000011110
Original offspring 2    1101100100110110
Mutated offspring 1     1100111000011110
Mutated offspring 2     1101101100110110
The technique of mutation (as well as crossover) depends mainly on the encoding of
chromosomes. For example when we are encoding by permutations, mutation could be
performed as an exchange of two genes.
2.4.3 PARAMETERS OF GA:
There are two basic parameters of GA - crossover probability and mutation probability.
CROSSOVER PROBABILITY:
It indicates how often crossover will be performed. If there is no crossover, offspring are exact
copies of parents. If there is crossover, offspring are made from parts of both parent's
chromosome. If crossover probability is 100%, then all
offspring are made by crossover. If it is 0%, whole new generation is made from exact copies of
chromosomes from old population (but this does not mean that the new generation is the same!).
Crossover is made in hope that new chromosomes will contain good parts of old chromosomes
and therefore the new chromosomes will be better. However, it is good to leave some part of old
population survives to next generation.
MUTATION PROBABILITY:
This signifies how often parts of a chromosome will be mutated. If there is no mutation, offspring
are generated immediately after crossover (or directly copied) without any change. If mutation
is performed, one or more parts of a chromosome are changed. If the mutation probability is 100%,
the whole chromosome is changed; if it is 0%, nothing is changed. Mutation generally prevents the
GA from falling into local extremes. Mutation should not occur very often, because then the GA
would in fact change to random search.
OTHER PARAMETERS:
There are also some other parameters of GA. One particularly important parameter is the
population size.
POPULATION SIZE:
It signifies how many chromosomes are present in the population (in one generation). If there are
too few chromosomes, the GA has few possibilities to perform crossover and only a small part
of the search space is explored. On the other hand, if there are too many chromosomes, the GA
slows down.
SELECTION:
Chromosomes are selected from the population to be parents for crossover. The problem is
how to select these chromosomes. According to Darwin's theory of evolution, the best ones
survive to create new offspring. There are many methods of selecting the best chromosomes,
for example roulette wheel selection, Boltzmann selection, tournament selection, rank
selection and steady state selection. In this thesis we have used tournament
selection as it performs better than the others.
TOURNAMENT SELECTION:
A selection strategy in GA is simply a process that favors the selection of better individuals in
the population for the mating pool. There are two important issues in the evolution process of
genetic search: population diversity and selective pressure. Population diversity means that the
genes from the already discovered good individuals are exploited while promising new areas
of the search space continue to be explored. Selective pressure is the degree to which the better
individuals are favored. The tournament selection strategy provides selective pressure by
holding a tournament competition among individuals.
2.5 DIFFERENTIAL EVOLUTION:
The aim of optimization is to determine the best-suited solution to a problem under a given set
of constraints. Several researchers over the decades have come up with different solutions to
linear and non-linear optimization problems. Mathematically an optimization problem involves
a fitness function describing the problem, under a set of constraints representing the solution
space for the problem. Unfortunately, most of the traditional optimization techniques are
centered around evaluating the first derivatives to locate the optima on a given constrained
surface. Because of the difficulty of evaluating first derivatives on rough and discontinuous
optimization surfaces, several derivative free optimization algorithms have emerged in recent
times. The optimization problem, nowadays, is represented as an intelligent search problem,
where one or more agents are employed to determine the optima on a search landscape,
representing the constrained surface for the optimization problem [20].
In the later quarter of the twentieth century, Holland pioneered a new concept on evolutionary
search algorithms, and came up with a solution to the so far open-ended problem to non-linear
optimization problems. Inspired by the natural adaptations of the biological species, Holland
echoed the Darwinian Theory through his most popular and well known algorithm, currently
known as genetic algorithms (GA) [21]. Holland and his coworkers including Goldberg and
Dejong popularized the theory of GA and demonstrated how biological crossovers and
mutations of chromosomes can be realized in the algorithm to improve the quality of the
solutions over successive iterations [22]. In mid 1990s Eberhart and Kennedy enunciated an
alternative solution to the complex non-linear optimization problem by emulating the collective
behavior of bird flocks, particles, the boids method of Craig Reynolds [23] and socio-cognition
and called their brainchild the particle swarm optimization (PSO)[23-27]. Around the same
time, Price and Storn took a serious attempt to replace the classical crossover and mutation
operators in GA by alternative operators, and consequently came up with a suitable deferential
operator to handle the problem. They proposed a new algorithm based on this operator, and
called it deferential evolution (DE) [28].
Neither algorithm requires any gradient information of the function to be optimized; both use
only primitive mathematical operators and are conceptually very simple. They can be
implemented in any computer language very easily and require minimal parameter tuning.
Algorithm performance does not deteriorate severely with the growth of the search space
dimensions either. These issues perhaps have a great role in the popularity of the algorithms
within the domain of machine intelligence and cybernetics.
2.5.1 CLASSICAL DE:
Like any other evolutionary algorithm, DE starts with a population of NP D-dimensional
search variable vectors. We will represent subsequent generations in DE by discrete time steps
like t = 0, 1, 2, ..., t, t+1, etc. Since the vectors are likely to be changed over different generations
we may adopt the following notation for representing the i-th vector of the population at the
current generation (i.e., at time t) as

X_i(t) = [x_{i,1}(t), x_{i,2}(t), x_{i,3}(t), ..., x_{i,D}(t)]        (2.20)
These vectors are referred to in the literature as "genomes" or "chromosomes". DE is a very simple
evolutionary algorithm. For each search variable, there may be a certain range within which the
value of the parameter should lie for better search results. At the very beginning of a DE run,
i.e., at t = 0, the problem parameters or independent variables are initialized somewhere in their
feasible numerical range. Therefore, if the j-th parameter of the given problem has its lower and
upper bounds as x_{L,j} and x_{U,j} respectively, then we may initialize the j-th component of
the i-th population member as

x_{i,j}(0) = x_{L,j} + rand(0, 1) · (x_{U,j} − x_{L,j})

where rand(0, 1) is a uniformly distributed random number lying between 0 and 1. Now in each
generation (or one iteration of the algorithm), to change each population member X_i(t) (say), a
donor vector V_i(t) is created. It is the method of creating this donor vector which demarcates
the various DE schemes. Here we discuss one specific mutation strategy known as DE/rand/1.
In this scheme, to create V_i(t) for the i-th member, three other parameter vectors (say the
r1-th, r2-th and r3-th vectors) are chosen at random from the current population. Next, a scalar
number F scales the difference of any two of the three vectors and the scaled difference is added
to the third one, whence we obtain the donor vector V_i(t). We can express the process for the
j-th component of each vector as
v_{i,j}(t + 1) = x_{r1,j}(t) + F · (x_{r2,j}(t) − x_{r3,j}(t))        (2.21)
The process is illustrated in Fig. 1.1. Closed curves in Fig. 1.1 denote constant cost contours, i.e.,
for a given cost function f, a contour corresponds to f(X) = constant; here the constant cost
contours are drawn for the Ackley function. Next, to increase the potential diversity of the
population, a crossover scheme comes into play. DE can use two kinds of crossover schemes,
namely "exponential" and "binomial". Under these schemes the donor vector exchanges its
"body parts", i.e., components, with the target vector X_i(t). In "exponential" crossover, we first
choose an integer n randomly from among the numbers [0, D−1]. This integer acts as the starting
point in the target vector from where the crossover or exchange of components with the donor
vector starts. We also choose another integer L from the interval [1, D]. L denotes the number of
components the donor vector actually contributes to the target. After choosing n and L, the
trial vector
U_i(t) = [u_{i,1}(t), u_{i,2}(t), ..., u_{i,D}(t)]        (2.22)

is formed with

u_{i,j}(t) = v_{i,j}(t)    for j = ⟨n⟩_D, ⟨n+1⟩_D, ..., ⟨n+L−1⟩_D
           = x_{i,j}(t)    for all other j        (2.23)
where the angular brackets ⟨·⟩_D denote a modulo function with modulus D. The integer L is
drawn from [1, D] according to the following pseudo code:
Fig. 1.1. Illustrating creation of the donor vector in 2-D parameter space (The
constant cost contours are for two-dimensional Ackley Function)
L = 0;
do
{
    L = L + 1;
} while ((rand(0, 1) < CR) && (L < D));
Hence, in effect, probability(L ≥ m) = (CR)^(m−1) for any m > 0. CR is called the "crossover"
constant and it appears as a control parameter of DE just like F. For each donor vector V, a new
set of n and L must be chosen randomly as shown above. In the "binomial" crossover scheme, on
the other hand, crossover is performed on each of the D variables whenever a randomly picked
number between 0 and 1 is within the CR value. The scheme may be outlined as
u_{i,j}(t) = v_{i,j}(t)    if rand(0, 1) < CR
           = x_{i,j}(t)    otherwise        (2.26)
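Before moving on to selection, the two crossover schemes can be sketched in Python as follows; this is an illustrative rendering of (2.22)-(2.26) and the pseudo code above, not a reference implementation.

import random

def exponential_crossover(target, donor, CR):
    """Exponential crossover: copy L consecutive donor components (modulo D)."""
    D = len(target)
    trial = list(target)
    n = random.randrange(D)          # random starting point in [0, D-1]
    L = 0
    while True:                      # draws L so that P(L >= m) = CR**(m-1)
        trial[(n + L) % D] = donor[(n + L) % D]
        L += 1
        if not (random.random() < CR and L < D):
            break
    return trial

def binomial_crossover(target, donor, CR):
    """Binomial crossover, eq. (2.26): each component comes from the donor
    whenever a random number in (0, 1) falls below CR."""
    return [v if random.random() < CR else x for x, v in zip(target, donor)]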
In this way an offspring vector U_i(t) is created for each target vector X_i(t). To keep the
population size constant over subsequent generations, the next step of the algorithm calls for
"selection" to determine which of the target vector and the trial vector will survive in the next
generation, i.e., at time t = t + 1. DE actually involves the Darwinian principle of "survival of
the fittest" in its selection process, which may be outlined as

X_i(t + 1) = U_i(t)    if f(U_i(t)) ≤ f(X_i(t))
           = X_i(t)    if f(U_i(t)) > f(X_i(t))        (2.27)

where f(·) is the function to be minimized. So if the new trial vector yields a better value of the
fitness function, it replaces its target in the next generation; otherwise the target vector is
retained in the population. Hence the population either gets better (w.r.t. the fitness function) or
remains the same, but never deteriorates. The DE/rand/1 algorithm is outlined below.
2.5.2 PROCEDURE:
Input: Randomly initialized population vectors x_i(0)
Output: Position of the approximate global optimum X*
Begin
Initialize population;
Evaluate fitness;
For i = 0 to max-iteration do
Begin
Create Difference-Offspring;
Evaluate fitness;
If an offspring is better than its parent
Then replace the parent by offspring in the next generation;
End If;
End For;
End.
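Putting the pieces together, a compact Python sketch of the classical DE/rand/1/bin scheme is given below; the population size, F, CR and the iteration budget are assumed placeholder settings, and the forced donor gene in the crossover is the usual convention rather than something stated in the text above.

import numpy as np

def de_rand_1_bin(f, bounds, NP=30, F=0.8, CR=0.9, max_iter=200):
    """Classical DE/rand/1/bin minimizing f; bounds is a list of (low, high)."""
    rng = np.random.default_rng()
    D = len(bounds)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    X = lo + rng.random((NP, D)) * (hi - lo)     # initialization in feasible range
    fit = np.array([f(x) for x in X])
    for _ in range(max_iter):
        for i in range(NP):
            # three distinct random vectors, all different from the target
            r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
            V = X[r1] + F * (X[r2] - X[r3])      # donor vector, eq. (2.21)
            mask = rng.random(D) < CR            # binomial crossover
            mask[rng.integers(D)] = True         # force at least one donor gene
            U = np.where(mask, V, X[i])          # trial (offspring) vector
            fU = f(U)
            if fU <= fit[i]:                     # greedy selection, eq. (2.27)
                X[i], fit[i] = U, fU
    return X[np.argmin(fit)], fit.min()

# Example: minimize the 5-dimensional sphere function
best, val = de_rand_1_bin(lambda x: float(np.sum(x**2)), [(-5, 5)] * 5)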
2.5.3 THE COMPLETE DE FAMILY:
Actually, it is the process of mutation which demarcates one DE scheme from another. In the
previous section, we illustrated the basic steps of a simple DE. The mutation scheme in
(2.21) uses a randomly selected vector X_r1, and only one weighted difference vector
F · (X_r2 − X_r3) is used to perturb it. Hence, in the literature this particular mutation scheme is
referred to as DE/rand/1. We can now have an idea of how different DE schemes are named. The
general convention used is DE/x/y, where DE stands for differential evolution, x represents a
string denoting the type of the vector to be perturbed (whether it is randomly selected or it is the
best vector in the population with respect to fitness value) and y is the number of difference
vectors considered for perturbation of x. Below we outline the other four different mutation
schemes, suggested by Price et al.
SCHEME DE/RAND TO BEST/1
DE/rand to best/1 follows the same procedure as the simple DE scheme illustrated
earlier. The only difference is that now the donor vector, used to perturb each population
member, is created using any two randomly selected members of the population as well as the
best vector of the current generation (i.e., the vector yielding the best objective function
value at t = t). This can be expressed for the i-th donor vector at time t = t + 1 as

V_i(t + 1) = X_i(t) + λ · (X_best(t) − X_i(t)) + F · (X_r2(t) − X_r3(t))        (2.28)

where λ is another control parameter of DE in [0, 2], X_i(t) is the target vector and X_best(t) is
the best member of the population with respect to fitness at the current time step t. To reduce the
number of control parameters a usual choice is to put λ = F.
SCHEME DE/BEST/1
In this scheme everything is identical to DE/rand/1 except that the trial vector is formed as

V_i(t + 1) = X_best(t) + F · (X_r1(t) − X_r2(t))        (2.29)
here the vector to be perturbed is the best vector of the current population and the perturbation is
caused by using a single difference vector.
SCHEME DE/BEST/2
Under this method, the donor vector is formed by using two difference vectors as shown below:
V_i(t + 1) = X_best(t) + F · (X_r1(t) + X_r2(t) − X_r3(t) − X_r4(t))        (2.30)
Owing to the central limit theorem, the random variations in the parameter vector seem to shift
slightly toward the Gaussian direction, which seems to be beneficial for many functions.
SCHEME DE/RAND/2
Here the vector to be perturbed is selected randomly, and two weighted difference vectors are
added to it to produce the donor vector. Thus for each target vector, a total of five
other distinct vectors are selected from the rest of the population. The process can be expressed
in the form of an equation as

V_i(t + 1) = X_r1(t) + F_1 · (X_r2(t) − X_r3(t)) + F_2 · (X_r4(t) − X_r5(t))        (2.31)
Here F_1 and F_2 are two weighting factors selected in the range from 0 to 1. To reduce the
number of parameters we may choose F_1 = F_2 = F.
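For reference, the five mutation schemes can be collected into a single donor-vector routine. The following Python sketch is an illustration with assumed parameter defaults (F = 0.8, λ = 0.8); it is not drawn from the text above beyond the equations it cites.

import numpy as np

def donor_vector(X, i, best, scheme, F=0.8, lam=0.8, rng=None):
    """Build the donor vector for target i under the named mutation scheme.

    X is the (NP, D) population array; `best` is the index of the fittest member.
    """
    rng = rng or np.random.default_rng()
    r = rng.choice([j for j in range(len(X)) if j != i], 5, replace=False)
    if scheme == "rand/1":                                            # eq. (2.21)
        return X[r[0]] + F * (X[r[1]] - X[r[2]])
    if scheme == "rand-to-best/1":                                    # eq. (2.28)
        return X[i] + lam * (X[best] - X[i]) + F * (X[r[0]] - X[r[1]])
    if scheme == "best/1":                                            # eq. (2.29)
        return X[best] + F * (X[r[0]] - X[r[1]])
    if scheme == "best/2":                                            # eq. (2.30)
        return X[best] + F * (X[r[0]] + X[r[1]] - X[r[2]] - X[r[3]])
    if scheme == "rand/2":                                            # eq. (2.31)
        return X[r[0]] + F * (X[r[1]] - X[r[2]]) + F * (X[r[3]] - X[r[4]])
    raise ValueError(f"unknown scheme: {scheme}")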
SUMMARY OF ALL SCHEMES:
In 2001 Storn and Price [21] suggested a total of ten different working strategies of DE and some
guidelines for applying these strategies to any given problem. These strategies were derived from
the five different DE mutation schemes outlined above. Each mutation strategy was combined
with either the "exponential" type crossover or the "binomial" type crossover. This yielded 5 ×
2 = 10 DE strategies, which are listed below.
DE/best/1/exp
DE/rand/1/exp
DE/rand-to-best/1/exp
DE/best/2/exp
DE/rand/2/exp
DE/best/1/bin
DE/rand/1/bin
DE/rand-to-best/1/bin
DE/best/2/bin
DE/rand/2/bin
The general convention used above is again DE/x/y/z, where DE stands for differential evolution,
x represents a string denoting the vector to be perturbed, y is the number of difference vectors
considered for perturbation of x, and z stands for the type of crossover being used
(exp: exponential; bin: binomial).
2.5.4 MORE RECENT VARIANTS OF DE:
DE is a stochastic, population-based, evolutionary search algorithm. The strength of the
algorithm lies in its simplicity, speed (how fast the algorithm can find the optimal or suboptimal
points of the search space) and robustness (producing nearly the same results over repeated runs).
The rate of convergence of DE, as well as its accuracy, can be improved largely by applying
different mutation and selection strategies. A judicious control of the two key parameters,
namely the scale factor F and the crossover rate CR, can considerably alter the performance of
DE. In what follows we illustrate some recent modifications of DE to make it suitable for
tackling the most difficult optimization problems.
DE WITH TRIGONOMETRIC MUTATION:
Recently, Lampinen and Fan [29] have proposed a trigonometric mutation operator for DE to
speed up its performance. To implement the scheme, for each target vector three distinct
vectors are randomly selected from the DE population. Suppose for the i-th target vector X_i(t)
the selected population members are X_r1(t), X_r2(t) and X_r3(t). The indices r1, r2 and r3 are
mutually different and selected from [1, 2, ..., N], where N denotes the population size. Suppose
the objective function values of these three vectors are given by f(X_r1(t)), f(X_r2(t)) and
f(X_r3(t)). Now three weighting coefficients are formed according to the following equations:
p = f(X_r1) + f(X_r2) + f(X_r3)        (2.32)

p_1 = f(X_r1) / p        (2.33)

p_2 = f(X_r2) / p        (2.34)

p_3 = f(X_r3) / p        (2.35)
Let rand(0, 1) be a uniformly distributed random number in (0, 1) and Γ be the trigonometric
mutation rate in the same interval (0, 1). The trigonometric mutation scheme may now be
expressed as

V_i(t + 1) = (X_r1 + X_r2 + X_r3)/3 + (p_2 − p_1) · (X_r1 − X_r2)
             + (p_3 − p_2) · (X_r2 − X_r3) + (p_1 − p_3) · (X_r3 − X_r1)
             if rand(0, 1) < Γ        (2.36)

V_i(t + 1) = X_r1 + F · (X_r2 − X_r3)    otherwise        (2.37)

Thus, we find that the scheme proposed by Lampinen et al. uses trigonometric mutation with a
probability of Γ and the mutation scheme of DE/rand/1 with a probability of (1 − Γ).
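A sketch of this hybrid mutation in Python is given below; taking the magnitudes of the objective values when forming p, p_1, p_2 and p_3 is an assumption (it keeps the weights non-negative) rather than something stated explicitly above, and the default Γ is a placeholder.

import numpy as np

def trigonometric_mutation(X, f_vals, i, F=0.8, gamma=0.05, rng=None):
    """Donor vector via trigonometric mutation (2.36) with probability gamma,
    falling back to DE/rand/1 (2.37) otherwise."""
    rng = rng or np.random.default_rng()
    r1, r2, r3 = rng.choice([j for j in range(len(X)) if j != i], 3, replace=False)
    if rng.random() < gamma:
        p = abs(f_vals[r1]) + abs(f_vals[r2]) + abs(f_vals[r3])   # assumed |.|
        p1, p2, p3 = (abs(f_vals[r]) / p for r in (r1, r2, r3))
        return ((X[r1] + X[r2] + X[r3]) / 3.0
                + (p2 - p1) * (X[r1] - X[r2])
                + (p3 - p2) * (X[r2] - X[r3])
                + (p1 - p3) * (X[r3] - X[r1]))
    return X[r1] + F * (X[r2] - X[r3])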
DERANDSF (DE WITH RANDOM SCALE FACTOR)
In the original DE [28] the difference vector (X_r1(t) − X_r2(t)) is scaled by a constant factor F.
The usual choice for this control parameter is a number between 0.4 and 1. We propose to vary
this scale factor in a random manner in the range (0.5, 1) by using the relation

F = 0.5 ∗ (1 + rand(0, 1))        (2.38)
where rand(0, 1) is a uniformly distributed random number within the range [0, 1]. We call this
scheme DERANDSF (DE with Random Scale Factor). The mean value of the scale factor is
0.75. This allows for stochastic variations in the amplification of the difference vector and thus
helps retain population diversity as the search progresses. Even when the tips of most of the
population vectors point to locations clustered near a local optimum, due to the randomly scaled
difference vector a new trial vector has fair chances of pointing at an even better location on the
multimodal functional surface. Therefore, the fitness of the best vector in a population is much
less likely to stagnate until a truly global optimum is reached.
DETVSF (DE WITH TIME VARYING SCALE FACTOR)
Fig. 1.2. Illustrating the DETVSF scheme on two-dimensional cost contours of the Ackley
function

In most population-based optimization methods (except perhaps some hybrid global-local
methods) it is generally believed to be a good idea to encourage the individuals (here, the tips of
the trial vectors) to sample diverse zones of the search space during the early stages of the
search. During the later stages it is important to adjust the movements of trial solutions finely so
that they can explore the interior of a relatively small space in which the suspected global
optimum lies. To meet this objective we reduce the value of the scale factor linearly with time
from a (predetermined) maximum to a (predetermined) minimum value:
F = (F_max − F_min) ∗ (MAXIT − iter)/MAXIT + F_min        (2.39)

where F_max and F_min are the maximum and minimum values of the scale factor F, iter is the
current iteration number and MAXIT is the maximum number of allowable iterations. The locus
of the tip of the best vector in the population under this scheme may be illustrated as in Fig. 1.2.
The resulting algorithm is referred to as DETVSF (DE with a time varying scale factor).
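Both scale-factor schemes amount to one-line rules. The Python helpers below illustrate (2.38) and (2.39); F_max = 1.0 and F_min = 0.4 are assumed example values, not ones fixed by the text.

import random

def scale_factor_randsf():
    """DERANDSF, eq. (2.38): random scale factor in (0.5, 1), mean 0.75."""
    return 0.5 * (1 + random.random())

def scale_factor_tvsf(iter_, maxit, f_max=1.0, f_min=0.4):
    """DETVSF, eq. (2.39): scale factor falling linearly from f_max to f_min."""
    return (f_max - f_min) * (maxit - iter_) / maxit + f_min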
DE WITH LOCAL NEIGHBORHOOD:
In 2006, a new DE variant based on the neighborhood topology of the parameter vectors
was developed [30] to overcome some of the disadvantages of the classical DE versions. The
authors proposed a neighborhood-based local mutation operator that draws inspiration from
PSO. Suppose we have a DE population P = [X_1, X_2, ..., X_Np] where each X_i (i = 1, 2, ..., Np)
is a D-dimensional vector. Now for every vector X_i we define a neighborhood of radius k,
consisting of vectors X_{i−k}, ..., X_i, ..., X_{i+k}. We assume the vectors to be organized in a
circular fashion such that the two immediate neighbors of vector X_1 are X_Np and X_2. For each
member of the population a local mutation is created by employing the fittest vector in the
neighborhood; the model may be expressed as

L_i(t) = X_i(t) + λ · (X_nbest(t) − X_i(t)) + F · (X_p(t) − X_q(t))        (2.40)

where the subscript nbest indicates the best vector in the neighborhood of X_i and p, q ∈ (i − k,
i + k). Apart from this, we also use a global mutation expressed as
G_i(t) = X_i(t) + λ · (X_best(t) − X_i(t)) + F · (X_r(t) − X_s(t))        (2.41)

where the subscript best indicates the best vector in the entire population, and r, s ∈ (1, NP).
Global mutation encourages exploitation, since all members (vectors) of a population are biased
by the same individual (the population best); local mutation, in contrast, favors exploration,
since in general different members of the population are likely to be biased by different
individuals. Now we combine these two models using a time-varying scalar weight w ∈ (0, 1)
to form the actual mutation of the new DE as a weighted mean of the local and the global
components:

V_i(t) = w · G_i(t) + (1 − w) · L_i(t)        (2.42)
The weight factor varies linearly with time as follows:

w = w_min + (w_max − w_min) · (iter/MAXIT)        (2.43)

where iter is the current iteration number, MAXIT is the maximum number of iterations
allowed and w_max, w_min denote, respectively, the maximum and minimum values of the
weight, with w_max, w_min ∈ (0, 1). Thus the algorithm starts at iter = 0 with w = w_min, but as
iter increases towards MAXIT, w increases gradually and ultimately when iter = MAXIT, w
reaches w_max. Therefore at the beginning, emphasis is laid on the local mutation scheme, but
with time, the contribution from the global model increases. In the local model attraction towards
a single point of the search space is reduced, helping DE avoid local optima. This feature is
essential at the beginning of the search process when the candidate vectors are expected to
explore the search space vigorously. Clearly, a judicious choice of w_max and w_min is
necessary to strike a balance between the exploration and exploitation abilities of the algorithm.
After some experimentation, it was found that w_max = 0.8 and w_min = 0.4 seem to improve
the performance of the algorithm over a number of benchmark functions.
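A Python sketch of the combined local/global mutation of (2.40)-(2.43) is given below; the neighborhood radius and the weight bounds follow the values quoted above, while the remaining defaults are assumptions.

import numpy as np

def neighborhood_donor(X, fit, i, it, maxit, k=2, F=0.8, lam=0.8,
                       w_min=0.4, w_max=0.8, rng=None):
    """Donor vector combining local (2.40) and global (2.41) mutation
    through the time-varying weight of (2.42)-(2.43)."""
    rng = rng or np.random.default_rng()
    NP = len(X)
    hood = [(i + j) % NP for j in range(-k, k + 1)]      # ring neighborhood
    nbest = min(hood, key=lambda j: fit[j])              # fittest neighbor
    p, q = rng.choice([j for j in hood if j != i], 2, replace=False)
    L = X[i] + lam * (X[nbest] - X[i]) + F * (X[p] - X[q])   # local model
    gbest = int(np.argmin(fit))                          # fittest in population
    r, s = rng.choice([j for j in range(NP) if j != i], 2, replace=False)
    G = X[i] + lam * (X[gbest] - X[i]) + F * (X[r] - X[s])   # global model
    w = w_min + (w_max - w_min) * it / maxit             # weight, eq. (2.43)
    return w * G + (1 - w) * L                           # combination, eq. (2.42)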


CHAPTER -3
ADAPTIVE SYSTEM IDENTIFICATION
USING GA
3.1 INTRODUCTION:
Generally the identification of linear systems is performed using the LMS algorithm. But most
dynamic systems exhibit nonlinearity, and the LMS based technique [31] does not perform
satisfactorily in identifying nonlinear systems. To improve the identification performance for
nonlinear systems, various techniques such as artificial neural networks (ANN) [32], the
functional link artificial neural network (FLANN) [33] and radial basis functions (RBF) [34]
have been proposed.
In this chapter we propose a novel adaptive model based on the GA technique for identification of
nonlinear systems. To apply GAs to system identification, each individual in the population
must represent a model of the plant, and the objective becomes a quality measure of the model,
evaluated by its capacity to predict the evolution of the measured outputs. The measured
output predictions, inherent to each individual i, are compared with the measurements made on
the real plant. The obtained error is a function of the individual's quality: the smaller this error,
the better the individual performs. There are many ways in which GAs can be used to solve
system identification tasks.
3.2. BASIC PRINCIPLE OF ADAPTIVE SYSTEM
IDENTIFICATION:
An adaptive filter can be used in modeling, that is, imitating the behavior of physical dynamic
systems which may be regarded as unknown "black boxes" having one or more inputs and
outputs. Modeling a single input, single output dynamic system is shown in Fig. 3.1. Noise is
taken into consideration because in many practical cases the system to be modeled is noisy, that
is, it has internal random disturbing forces. Internal system noise appears at the system output
and is commonly represented there as an additive noise. This noise is generally uncorrelated
with the plant input. If this is the case, and if the adaptive model is an adaptive linear combiner
whose weights are adjusted to minimize mean square error, it can be shown that the least squares
solution will be unaffected by the presence of plant noise. This is not to say that the
convergence of the adaptive process will be unaffected by system noise, only that the expected
weight vector of the adaptive model after convergence will be unaffected. The least squares
solution will be determined primarily by the impulse response of the system to be modeled. It
could also be significantly affected by the statistical or spectral character of the system input
signal.







Fig.3.1 Modeling the single input, single output system.
The problem of determining a mathematical model for an unknown system by observing its
input-output data is known as system identification. It is performed by suitably
adjusting the parameters within a given model such that, for a particular input, the model output
matches the corresponding actual system output. After a system is identified, the output
can be predicted for a given input to the system, which is the goal of the system identification
problem. When the plant behavior is completely unknown, it may be characterized using a
certain adaptive model and then its identification task is carried out using adaptive algorithms
like the
LMS. The system identification task is at the heart of numerous adaptive filtering applications.
We list several of these applications here.
• Channel Identification
• Plant Identification.
• Echo Cancellation for long distance transmission.
• Acoustic Echo Cancellation
• Adaptive Noise Cancellation.
Fig. 3.2 represents a schematic diagram of system identification of a time-invariant, causal,
discrete-time dynamic plant. The output of the plant is given by y = p(x), where x is the input, a
uniformly bounded function of time; the operator p describes the dynamic plant. The objective
of the identification problem is to construct a model generating an output ŷ which approximates
the plant output y when subjected to the same input x, so that the squared error (e²) is minimized.

Fig.3.2 Schematic block diagram of a GA based adaptive identification system
In this chapter the modeling is done in an adaptive manner such that, after training the model
iteratively, y and ŷ become almost equal and the squared error becomes almost zero. The
minimization of error in an iterative manner is usually achieved by LMS or RLS methods, which
are basically derivative based. The shortcoming of these methods is that for certain types of
plants the squared error cannot be optimally minimized because the error surface falls into local
minima. In this chapter we propose a novel and elegant method which employs the genetic
algorithm to minimize the squared error in a derivative free manner. In essence, in this chapter
the system identification problem is viewed as a squared error minimization problem.
The adaptive modeling consists of two steps. In the first step the model is trained using the GA
based updating technique. After successful training of the model, performance evaluation is
carried out by feeding zero mean uniformly distributed random input. Before we proceed to the
identification task using GA, let us discuss the basics of GA based optimization.
3.3. DEVELOPMENT OF GA BASED ALGORITHM FOR SYSTEM IDENTIFICATION:
Referring to Fig. 3.2, let the system p(x) be an FIR system represented by the transfer function

p(z) = a_0 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3} + ... + a_n z^{-n}        (3.1)
where a_0, a_1, a_2, ..., a_n represent the impulse response (parameters) of the system. The
measurement noise of the system is given by n(k), which is assumed to be white and Gaussian
distributed. The input x is also uniformly distributed white noise lying between −2√3 and +2√3
and having a variance of unity. The GA based model consists of an equal-order FIR system with
unknown coefficients. The purpose of the adaptive identification model is to estimate the
unknown coefficients â_0, â_1, â_2, ..., â_n such that they match the corresponding
parameters a_0, a_1, a_2, ..., a_n of the actual system p(z). If the system is exactly
identified (theoretically), then in the case of a linear system (for example the FIR system) the
system parameters and the model parameters become equal, i.e., a_0 = â_0, a_1 = â_1,
a_2 = â_2, ..., a_n = â_n. Also the response of the actual system (y) coincides with the response
of the model (ŷ). However, in the case of a nonlinear dynamic system the parameters do not
match, but the responses of the systems will match.
The updating of the parameters of the model is carried out using the GA rule as outlined in the
following steps; a minimal code sketch of the overall procedure follows the list.
I. As shown in Fig. 3.2, the unknown system to be identified is connected in parallel
with an adaptive model to be developed using GA.
II. The coefficients (â) of the model are initially chosen from a population of M
chromosomes. Each chromosome constitutes NL random binary bits, where each
sequential group of L bits represents one coefficient of the adaptive model and N is
the number of parameters of the model.
III. Generate K (= 500) input signal samples, each of which has zero mean and is
uniformly distributed between −2√3 and +2√3 with a variance of unity.
IV. Each of the input samples is passed through the plant P(z) and contaminated with
additive noise of known strength. The resultant signal acts as the desired signal. In
this way K desired signals are produced by feeding all K input samples.
V. Each of the input samples is also passed through the model using each chromosome as
the model parameters, and M sets of K estimated outputs are obtained.
VI. Each desired output is compared with the corresponding estimated output and K
errors are produced. The mean square error (MSE) for a set of parameters (corresponding
to the m-th chromosome) is determined by using the relation

MSE(m) = (1/K) Σ_{i=1}^{K} e_i²        (3.2)
This is repeated M times.
VII. Since the objective is to minimize MSE(m), m = 1 to M, the GA based optimization is
used.
VIII. The tournament selection, crossover, mutation and selection operators are sequentially
carried out following the steps given in Section 2.4.
IX. In each generation the minimum MSE (MMSE) is obtained and plotted against
generation to show the learning characteristics.
X. The learning process is stopped when the MMSE reaches its minimum level.
XI. At this step all the chromosomes attain almost identical genes, which represent the
estimated parameters of the developed model.
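As referenced above, the following self-contained Python sketch illustrates steps I-XI for the plant of Experiment-1; the chromosome length, coefficient decoding range [−2, 2], GA settings and noise strength are all assumed values, so this is a sketch of the procedure rather than the thesis code.

import numpy as np
import random

# Steps I-XI for the plant of Experiment-1 (Section 3.4), with binary coding.
L_BITS, N_COEF, POP, GEN = 16, 3, 40, 100
rng = np.random.default_rng(1)
x = rng.uniform(-2 * np.sqrt(3), 2 * np.sqrt(3), 500)          # K = 500 samples
plant = np.array([0.2090, 0.9950, 0.2090])
d = np.convolve(x, plant)[:len(x)] + 0.001 * rng.standard_normal(len(x))

def decode(chrom):
    """Map each group of L_BITS bits to one model coefficient in [-2, 2]."""
    coefs = []
    for i in range(N_COEF):
        bits = chrom[i * L_BITS:(i + 1) * L_BITS]
        frac = int("".join(map(str, bits)), 2) / (2 ** L_BITS - 1)
        coefs.append(-2.0 + 4.0 * frac)
    return np.array(coefs)

def mse(chrom):
    """Step VI: mean square error between desired and estimated outputs."""
    y_hat = np.convolve(x, decode(chrom))[:len(x)]
    return np.mean((d - y_hat) ** 2)

pop = [[random.randint(0, 1) for _ in range(N_COEF * L_BITS)] for _ in range(POP)]
for g in range(GEN):                                            # steps VII-X
    new = [min(pop, key=mse)]                                   # elitism
    while len(new) < POP:
        p1 = min(random.sample(pop, 2), key=mse)                # tournament
        p2 = min(random.sample(pop, 2), key=mse)
        cut = random.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]                             # crossover
        child = [b ^ 1 if random.random() < 0.01 else b for b in child]  # mutation
        new.append(child)
    pop = new
print(decode(min(pop, key=mse)))    # step XI: estimated parameters of the model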
3.4. SIMULATION STUDIES:
To demonstrate the performance of the proposed GA based approach, numerous simulation
studies are carried out on several linear and nonlinear systems. The performance of the proposed
structure is compared with the corresponding LMS structure.
The block diagram shown in Fig. 3.2 is used for the simulation study.
Case-1 (Linear System)
A unit-variance, uniformly distributed random signal lying in the range -2√3 to +2√3 is applied to the known system having transfer function
Experiment-1: H(z) = 0.2090 + 0.9950 z^-1 + 0.2090 z^-2, and
Experiment-2: H(z) = 0.2600 + 0.9300 z^-1 + 0.2600 z^-2
The output of the system is contaminated with white Gaussian noise of different strengths, -20 dB and -30 dB. The resultant signal y is used as the desired or training signal. The same random input is also applied to the GA based adaptive model, which has the same linear combiner structure as H(z) but random initial weights. The coefficients or weights of the linear combiner are updated using the LMS algorithm as well as the proposed GA based algorithm. Training is complete when the MSE (in dB) curve becomes parallel to the x-axis. Under this condition, for a linear system, the parameters a_i match the corresponding estimated parameters â_i of the proposed model.
Table-3.1 presents the actual and estimated parameters of a 3-tap linear combiner obtained by the LMS and GA based models. From this table it is observed that the GA based model performs better than the LMS based model under different noise conditions.

Experiment | Actual parameter | LMS (NSR=-30dB) | LMS (NSR=-20dB) | GA (NSR=-30dB) | GA (NSR=-20dB)
01         | 0.2090           | 0.2092          | 0.2064          | 0.2100         | 0.2061
           | 0.9950           | 0.9941          | 1.0094          | 0.9943         | 0.9985
           | 0.2090           | 0.2071          | 0.2153          | 0.2077         | 0.2077
02         | 0.2600           | 0.2631          | 0.2705          | 0.2582         | 0.2566
           | 0.9300           | 0.9308          | 0.9289          | 0.9301         | 0.9342
           | 0.2600           | 0.2563          | 0.2624          | 0.2598         | 0.2598

Table-3.1 Comparison of actual and estimated parameters of LMS and GA based models
[Plot: MSE in dB vs. number of iterations (samples); CH:[0.2090,0.9950,0.2090], NL=0; NSR = -20 dB and -30 dB.]
Fig.3.3 Learning characteristics of LMS based linear system identification (Experiment-1)
[Plot: MSE in dB vs. number of iterations (samples); CH:[0.2600,0.9300,0.2600], NL=0; NSR = -30 dB and -20 dB.]
Fig.3.4 Learning characteristics of LMS based linear system identification (Experiment-2)
[Plot: mean square error in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL=0; NSR = -30 dB and -20 dB.]
Fig.3.5 Learning characteristics of GA based linear system identification (Experiment-1)
[Plot: mean square error in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL=0; NSR = -20 dB and -30 dB.]
Fig.3.6 Learning characteristics of GA based linear system identification (Experiment-2)
Case-2 (Non-Linear System)
In this simulation the actual system is assumed to be nonlinear in nature. Computer simulation results of two different nonlinear systems are presented. In this case the actual system is
Experiment-3: y_n(k) = tanh{y(k)}
Experiment-4: y_n(k) = y(k) + 0.2 y^2(k) − 0.1 y^3(k)
where y(k) is the output of the linear system and y_n(k) is the output of the nonlinear system. In the case of a nonlinear system the parameters of the two systems do not match; however, the responses of the actual system and the adaptive model match. To demonstrate this observation, training is carried out using both the LMS and the GA based algorithms.
[Plot: MSE in dB vs. number of iterations (samples); CH:[0.2090,0.9950,0.2090], NL: y = tanh(y); NSR = -20 dB and -30 dB.]
Fig.3.7 Learning characteristics of LMS based nonlinear system identification (Experiment-3)
[Plot: MSE in dB vs. number of iterations (samples); CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3; NSR = -30 dB and -20 dB.]
Fig.3.8 Learning characteristics of LMS based nonlinear system identification (Experiment-4)

[Plot: mean square error in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = tanh(y); NSR = -30 dB and -20 dB.]
Fig.3.9 Learning characteristics of GA based nonlinear system identification (Experiment-3)
[Plot: mean square error in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3; NSR = -30 dB and -20 dB.]
Fig.3.10 Learning characteristics of GA based nonlinear system identification (Experiment-4)
[Plot: output vs. sample index; CH:[0.2090,0.9950,0.2090], NL: y = tanh(y); curves: Actual, GA, LMS.]
Fig.3.11 Comparison of output responses (Experiment-3) at -30 dB NSR
[Plot: output vs. sample index; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3; curves: Actual, GA, LMS.]
Fig.3.12 Comparison of output responses (Experiment-4) at -30 dB NSR
[Plot: MSE in dB vs. number of iterations (samples); CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); NSR = -20 dB and -30 dB.]
Fig.3.13 Learning characteristics of LMS based nonlinear system identification (Experiment-3)

[Plot: MSE in dB vs. number of iterations (samples); CH:[0.2600,0.9300,0.2600], NL: y = y + 0.2y^2 − 0.1y^3; NSR = -30 dB and -20 dB.]
Fig.3.14 Learning characteristics of LMS based nonlinear system identification (Experiment-4)
[Plot: mean square error in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); NSR = -30 dB and -20 dB.]
Fig.3.15 Learning characteristics of GA based nonlinear system identification (Experiment-3)
[Plot: mean square error in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL: y = y + 0.2y^2 − 0.1y^3; NSR = -30 dB and -20 dB.]
Fig.3.16 Learning characteristics of GA based nonlinear system identification (Experiment-4)
[Plot: output vs. sample index; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); curves: Actual, GA, LMS.]
Fig.3.17 Comparison of output responses (Experiment-3) at -30 dB NSR
[Plot: output vs. sample index; CH:[0.2600,0.9300,0.2600], NL: y = y + 0.2y^2 − 0.1y^3; curves: Actual, GA, LMS.]
Fig.3.18 Comparison of output responses (Experiment-4) at -30 dB NSR
The MSE plots for Experiment-3 and Experiment-4 (nonlinearities following the linear system of Experiment-1) under two different noise conditions using the LMS based algorithm are obtained by simulation and shown in Fig.3.7 and 3.8 respectively. The corresponding plots for the same systems using the GA based model are shown in Fig.3.9 and 3.10 respectively. The comparison of output responses of the two nonlinear models using the LMS and GA techniques is shown in Fig.3.11 and 3.12 respectively. Similarly, the MSE plots for Experiment-3 and Experiment-4 (following the linear system of Experiment-2) under two different noise conditions using the LMS based algorithm are shown in Fig.3.13 and 3.14 respectively, and the corresponding GA based plots in Fig.3.15 and 3.16 respectively. The comparison of output responses of the two nonlinear models using the LMS and GA techniques is shown in Fig.3.17 and 3.18 respectively. Similar results are also observed for other nonlinear models and under various noise conditions.
3.5. RESULTS AND DISCUSSIONS:
Table-3.1 reveals that for the linear FIR system the coefficients of the LMS based adaptive model match the coefficients of the actual system more closely than those of the GA based model; hence for linear FIR systems LMS works well.
For nonlinear systems the learning characteristics of the LMS technique are poor (Fig.3.7 and 3.8) for both noise cases, but are much improved in the case of GA (Fig.3.9 and 3.10).
The output response of the nonlinear system (Experiment-3) using GA is better than its LMS counterpart because the GA response is closer to the desired response (Fig.3.11).
CHAPTER-4
ADAPTIVE CHANNEL EQUALIZATION
USING GENETIC ALGORITHM.
4.1 INTRODUCTION:
Digital communication systems suffer from the problem of ISI, which essentially deteriorates the accuracy of reception. The probability of error at the receiver can be minimized and reduced to an acceptable level by introducing an equalizer at the front end of the receiver. An adaptive digital channel equalizer is essentially an inverse system of the channel model which primarily combats the effect of ISI. Conventionally the LMS algorithm is employed to design and develop adaptive equalizers [35]. Such equalizers use a gradient based weight update algorithm, and therefore there is a possibility that during training the weights do not attain their optimal values because the MSE becomes trapped in a local minimum. On the other hand, GA and DE are derivative free techniques, and hence the local minima problem does not arise during the weight updates. The present chapter develops a novel GA based adaptive channel equalizer.
4.2 BASIC PRINCIPLE OF CHANNEL EQUALIZATION:
In an ideal communication channel, the received information is identical to that transmitted.
However, this is not the case for real communication channels, where signal distortions take
place. A channel can interfere with the transmitted data through three types of distorting effects: power degradation and fades, multi-path time dispersion, and background thermal noise [36].
Equalization is the process of recovering the data sequence from the corrupted channel samples.
A typical baseband transmission system is depicted in Fig.4.1, where an equalizer is
incorporated within the receiver.
Fig. 4.1. A Baseband Communication System
4.2.1 MULTIPATH PROPAGATION:
Within telecommunication channels multiple paths of propagation commonly occur. In practical
terms this is equivalent to transmitting the same signal through a number of separate channels,
each having a different attenuation and delay. Consider an open-air radio transmission channel that has three propagation paths, as illustrated in Fig.4.2; these could be direct, earth bound and sky bound.
Multipath interference between consecutively transmitted signals will take place if one signal is received whilst the previous signal is still being detected. In Fig.4.2 this would occur if the symbol transmission rate is greater than 1/τ, where τ represents the transmission delay. Because bandwidth efficiency demands high data rates, multi-path interference commonly occurs.
[Fig.4.1 block diagram: input → transmitter filter → channel medium → adder (noise) → receiver filter → equaliser → output.]
Fig.4.2 Impulse response of a transmitted signal in a channel which has 3 modes of propagation: (a) the signal transmission paths, (b) the received samples
4.2.2 MINIMUM & NON MINIMUM PHASE CHANNELS:
When all the roots of H(z) lie within the unit circle, the channel is termed minimum phase. The inverse of a minimum phase channel [37] is convergent, as illustrated by (4.1):
H(z) = 1.0 + 0.5 z^-1

1/H(z) = 1/(1.0 + 0.5 z^-1) = Σ_{i=0}^{∞} (−1/2)^i z^-i = 1 − 0.5 z^-1 + 0.25 z^-2 − 0.125 z^-3 + …        (4.1)
Whereas the inverse of a non-minimum phase channel is not convergent, as shown in (4.2):
H(z) = 0.5 + 1.0 z^-1

1/H(z) = 1/(0.5 + 1.0 z^-1) = z · Σ_{i=0}^{∞} (−1/2)^i z^i = z · [1 − 0.5 z + 0.25 z^2 − 0.125 z^3 + …]        (4.2)
Since equalizers are designed to invert the channel distortion process, they will in effect model the channel inverse. The minimum phase channel has a linear inverse model, therefore a linear equalization solution exists. However, limiting the inverse model to m dimensions only approximates the solution, and it has been shown that nonlinear solutions can provide a superior inverse model in the same dimension.
A linear inverse of a non-minimum phase channel does not exist without incorporating time delays. A time delay creates a convergent series for a non-minimum phase model, where longer delays are necessary to provide a reasonable equalizer. Equation (4.3) describes a non-minimum phase channel with a single-delay inverse and a four-sample-delay inverse; the latter is the more suitable form for a linear filter.

H(z) = 0.5 + 1.0 z^-1

z^-1/H(z) = 1 − 0.5 z + 0.25 z^2 − 0.125 z^3 + …        (noncausal)

z^-4/H(z) ≈ z^-3 − 0.5 z^-2 + 0.25 z^-1 − 0.125        (truncated and causal)        (4.3)
The three-tap non-minimum phase channel H(z) = 0.3410 + 0.8760 z^-1 + 0.3410 z^-2 is used throughout this thesis for simulation purposes. A channel delay, D, is included to assist in the classification, so that the desired output becomes u(n − D).
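To see the delay argument numerically, the short sketch below computes an m-tap delayed inverse g of the non-minimum phase channel of (4.3); finding g by least squares (rather than truncating the series by hand) is our own choice, and the values of m and D are illustrative.

```python
import numpy as np

h = np.array([0.5, 1.0])                  # non-minimum phase channel of (4.3)
m, D = 8, 4                               # equalizer taps and decision delay
rows = m + len(h) - 1
C = np.zeros((rows, m))                   # convolution matrix: C @ g equals h * g
for k in range(m):
    C[k:k + len(h), k] = h
target = np.zeros(rows); target[D] = 1.0  # want h * g ≈ a delta delayed by D samples
g, *_ = np.linalg.lstsq(C, target, rcond=None)
print("equalizer taps g:", np.round(g, 3))
print("residual ISI (h*g minus delayed delta):", np.round(C @ g - target, 3))
```

With D = 0 the residual ISI is large, while a delay of a few samples makes the truncated causal inverse accurate, which is exactly why the delayed form in (4.3) is preferred for a linear filter.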
4.2.3 INTERSYMBOL INTERFERENCE:
Inter-symbol interference (ISI) has already been described as the overlapping of the transmitted data. It is difficult to recover the original data from a one-dimensional channel sample because there is no statistical information about the multipath propagation. Increasing the dimensionality of the channel output vector helps characterize the multipath propagation. This has the effect of not only increasing the number of symbols but also increasing the Euclidean distance between the output classes.
Fig. 4.3 Interaction between two neighboring symbols
When additive Gaussian noise η is present within the channel, the input samples form Gaussian clusters around the symbol centers. These symbol clusters can be characterized by a probability density function (PDF) with noise variance σ_η², and the noise can cause the symbol clusters to interfere. Once this occurs, equalization filtering becomes inadequate to classify all of the input samples. Error control coding schemes can be employed in such cases, but these often require extra bandwidth.
4.2.4 SYMBOL OVERLAP:
The expected number of errors can be calculated by considering the amount of symbol interaction, assuming Gaussian noise. Taking any two neighboring symbols, the cumulative distribution function (CDF) can be used to describe the overlap between the two noise characteristics. The overlap is directly related to the probability of error between the two symbols, and if these two symbols belong to opposing classes, a class error will occur.
Fig.4.3 shows two Gaussian functions that could represent two symbol noise distributions. The Euclidean distance L between the symbol centers and the noise variance σ² can be used in the cumulative distribution function of (4.4) to calculate the area of overlap between the two symbol noise distributions, and therefore the probability of error, as in (4.5).
[Fig.4.3: two overlapping Gaussian densities with means µ1 and µ2 separated by distance L; the area of overlap equals the probability of error.]

CDF(x) = ∫_{x}^{∞} (1/(√(2π) σ)) exp(−u²/(2σ²)) du        (4.4)

P(c) = 2 · CDF(L/2)        (4.5)
Since each channel symbol is equally likely to occur, the probability of unrecoverable errors occurring in the equalization space can be calculated using the sum of all the CDF overlaps between each pair of opposing class symbols. The probability of error is more commonly described as the BER. Equation (4.6) describes the BER based upon the Gaussian noise overlap, where N_sp is the number of symbols in the positive class, N_m is the number of symbols in the negative class, and Δ_i is the distance between the i-th positive symbol and its closest neighboring symbol in the negative class.

BER(σ_n) = log_2 [ (1/(N_sp + N_m)) Σ_{i=1}^{N_sp} CDF(Δ_i / 2) ]        (4.6)
4.3 CHANNEL EQUALIZATION:
The inverse model of a system having an unknown transfer function is itself a system whose transfer function is, in some sense, a best fit to the reciprocal of the unknown transfer function. Sometimes the inverse model response contains a delay which is deliberately incorporated to improve the quality of the fit. In Fig.4.4, a source signal s(n) is fed into an unknown system that produces the input signal x(n) for the adaptive filter. The output of the adaptive filter is subtracted from a desired response signal that is a delayed version of the source signal, such that

d(n) = s(n − Δ)

where Δ is a positive integer value. The goal of the adaptive filter is to adjust its characteristics such that the output signal is an accurate representation of the delayed source signal.
There are many applications of the adaptive inverse model of a system. If the system is a communication channel, then the inverse model is an adaptive equalizer which compensates for the effects of inter symbol interference (ISI) caused by the restriction of channel bandwidth [38]. Similarly, if the system is the model of a high density recording medium, then its corresponding inverse model reconstructs the recorded data without distortion [39]. If the system represents a nonlinear sensor, then its inverse model represents a compensator of environmental as well as inherent nonlinearities [40]. The adaptive inverse model also finds applications in adaptive control [41] as well as in deconvolution in geophysics [42].
Fig. 4.4: Inverse Modeling
Channel equalization is a technique for decoding signals transmitted across non-ideal communication channels. The transmitter sends a sequence s(n) that is known to both the transmitter and the receiver. In equalization, the received signal is used as the input signal x(n) to an adaptive filter, which adjusts its characteristics so that its output closely matches a delayed version s(n − Δ) of the known transmitted signal. After a suitable adaptation period, the coefficients of the system either are fixed and used to decode future transmitted messages, or are adapted using a crude estimate of the desired response signal computed from y(n). This latter mode of operation is known as decision-directed adaptation.
Channel equalization is one of the first applications of adaptive filters and is described in the pioneering work of Lucky [9]. Today it remains one of the most popular uses of an adaptive filter. Practically every computer telephone modem transmitting at rates of 9600 bits per second or greater contains an adaptive equalizer. Adaptive equalization is also useful for wireless communication systems. Qureshi [43] has written an excellent tutorial on adaptive equalization. A related problem is deconvolution, which appears in the context of geophysical exploration.
[Fig.4.4: the system/plant/channel with additive noise η(n) produces x(n); the adaptive filter output y(n) is compared with the delayed source signal s(n − Δ), and the resulting error e(n) drives the update algorithm.]
In many control tasks, the frequency and phase characteristics of the plant hamper the convergence behavior and stability of the control system. We can use an adaptive filter, as shown in Fig.4.4, to compensate for the non-ideal characteristics of the plant and as a method for adaptive control. In this case, the signal s(n) is sent at the output of the controller, and the signal x(n) is measured at the output of the plant. The coefficients of the adaptive filter are then adjusted so that the cascade of the plant and adaptive filter can be nearly represented by the pure delay z^-Δ.
Transmission and storage of high density digital information play an important role in the present age of information technology. Digital information obtained from audio, video or text sources needs high density storage or transmission through communication channels. Communication channels and recording media are often modeled as band-limited channels for which the channel impulse response is that of an ideal low pass filter. When sequences of symbols are transmitted or recorded, the low pass filtering of the channel distorts the transmitted symbols over successive time intervals, causing symbols to spread and overlap with adjacent symbols. This resulting linear distortion is known as inter symbol interference. In addition, nonlinear distortion is caused by cross talk in the channel and the use of amplifiers. In a data storage channel, the binary data is stored in the form of tiny magnetized regions called bit cells, arranged along the recording track. At read back, noise and nonlinear distortions (ISI) corrupt the signal. An ANN based equalization technique has been proposed to alleviate the ISI present during read back from the magnetic storage channel. Recently, Sun et al. [44] have reported an improved Viterbi detector to compensate the nonlinearities and media noise. Thus adaptive channel equalizers play an important role in recovering digital information from digital communication channels and storage media. Preparata had suggested a simple and attractive scheme for dispersal recovery of digital information based on the discrete Fourier transform. Subsequently, Gibson et al. have reported an efficient nonlinear ANN structure for reconstructing digital signals which have passed through a dispersive channel and been corrupted by additive noise. In a recent publication the authors have proposed optimal preprocessing strategies for perfect reconstruction of binary signals from dispersive communication channels. Touri et al. have developed a deterministic worst case framework for perfect reconstruction of discrete data transmitted through a dispersive communication channel. In the recent past, new adaptive equalizers have been suggested using soft computing tools such as the artificial neural network
(ANN), the polynomial perceptron network (PPN) and the functional link artificial neural network (FLANN). It is reported that these methods are best suited for nonlinear and complex channels. Recently, the Chebyshev artificial neural network has also been proposed for nonlinear channel equalization [45]. The drawback of these methods is that the estimated weights may fall into local minima during training. For this reason the genetic algorithm (GA) [46] and differential evolution (DE) [19] have been suggested for training adaptive channel equalizers. The main attraction of GA lies in the fact that it does not rely on Newton-like gradient-descent methods, and hence there is no need to calculate derivatives. This makes it less likely to be trapped in local minima. However, only two operators of GA, crossover and mutation, help to avoid the local minima problem.
4.3.1 TRANSVERSAL EQUALIZER:
The transversal equalizer uses a time-delay vector Y(n) of channel output samples (4.7) to determine the symbol class. The {m} TE notation used to represent the transversal equalizer specifies m inputs. The equalizer filter output is classified through a threshold activation device (Fig.4.5), so that the equalizer decision belongs to one of the BPSK states u(n) ∈ {−1, +1}:

Y(n) = [y(n), y(n − 1), …, y(n − (m − 1))]        (4.7)

Considering the inverse of the channel H(z) = 1.0 + 0.5 z^-1 that was given in (4.1), this is an infinitely long convergent linear series:

1/H(z) = Σ_{i=0}^{∞} (−1/2)^i z^-i

Each coefficient of this inverse model can be used in a linear equalizer as an FIR tap weight. Each additional tap dimension improves the accuracy; however, high input dimensions leave the equalizer susceptible to noisy samples. If a noisy sample is received, it remains within the filter, affecting the output of each equalizer tap. Rather than designing a linear equalizer, a nonlinear filter can be used to provide the desired performance with a shorter input dimension; this reduces the sensitivity to noise.
Fig 4.5: Linear Transversal Equalizer. [Diagram: tap-delay line over the received samples, weights w0–w6, an adder, and a threshold decision device producing ū(n).]
4.3.2 DECISION FEEDBACK EQUALIZER:
The basic structure of the decision feedback equalizer (DFE) is shown in Fig.4.6. The DFE consists of a transversal feed forward filter and a feedback filter. When the communication channel causes severe ISI distortion, the LTE cannot provide satisfactory performance; instead, a DFE is required. The DFE feeds past corrected samples, w(n), from a decision device into the feedback filter and combines them with the feed forward filter output.
[Fig.4.6: a feed forward filter over the received samples y(n)…y(n−4) and a feedback filter over the past decisions c(n−1)…c(n−4) feed a common adder; the decision device maps the equalizer output û(n) to the symbol ū(n).]
Figure 4.6: Decision Feedback Equalizer
In effect, the function of the feedback filter is to subtract the ISI produced by previously detected symbols from the estimates of future samples. When the DFE is updated with a recursive algorithm, the feed forward filter weights and feedback filter weights can be jointly adapted by the LMS algorithm on a common error signal ê(n), as shown in (4.8):

W(n+1) = W(n) + µ ê(n) V(n)        (4.8)

where ê(n) = u(n) − y(n) and V(n) = [x(n), x(n−1), …, x(n−k₁−1), u(n−k₂−1), …, u(n)]^T. The feed forward and feedback filter weight vectors are written as a joint vector W(n) = [w₀(n), w₁(n), …, w_{k₁+k₂−1}(n)]^T, where k₁ and k₂ represent the feed forward and feedback filter tap lengths respectively. Suppose that the decision device causes an error in estimating the symbol u(n). This error can propagate into subsequent symbols until future input samples compensate for it; this is called error propagation, and it causes bursts of errors. The detrimental potential of error propagation is the most serious drawback of decision feedback equalization. Traditionally, the DFE is described as a nonlinear equalizer because the decision device is nonlinear. However, the DFE structure is still a linear combiner and the adaptation loop is also linear; it has therefore been described as a linear equalizer structure.
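A single adaptation step of (4.8) can be written compactly as below. This is only an illustrative sketch: the tap lengths, step size and signals are placeholders of our own choosing, and the decision device is taken to be the BPSK sign function.

```python
import numpy as np

rng = np.random.default_rng(1)
k1, k2, mu = 5, 3, 0.05                  # feed forward taps, feedback taps, step size
w = np.zeros(k1 + k2)                    # joint weight vector W(n)

x = rng.standard_normal(k1)              # received samples x(n), ..., x(n - k1 + 1)
u_past = rng.choice([-1.0, 1.0], k2)     # past decisions fed back to the feedback filter
v = np.concatenate([x, u_past])          # joint input vector V(n)
y = w @ v                                # equalizer output y(n)
u = 1.0                                  # training symbol u(n)
e_hat = u - y                            # common error signal ê(n)
w = w + mu * e_hat * v                   # W(n+1) = W(n) + µ ê(n) V(n), eq. (4.8)
```

In practice the update runs inside the receive loop, with u(n) replaced by the decision ū(n) once training ends (decision-directed mode).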
4.4. EQUALIZATION USING GA:
High speed data transmission over communication channels distorts the transmitted signals in both amplitude and phase due to the presence of inter symbol interference (ISI). Other impairments like thermal noise, impulse noise and cross talk cause further distortions to the received symbols. Adaptive equalization of the digital channel at the receiver removes or reduces the effects of such ISI and attempts to recover the transmitted symbols. Basically an equalizer is a filter, placed in cascade between the channel and the receiver, that aims to realize the inverse of the channel transfer function in order to improve the accuracy of reception. The Least-Mean-Square (LMS), Recursive-Least-Square (RLS) and multilayer perceptron (MLP) based equalizers aim to minimize the ISI present in the channels, particularly for nonlinear channels. However, they suffer from long training times and undesirable local minima during training. The drawbacks of these derivative based algorithms have been discussed in Chapter-3. In the present chapter we propose a new adaptive channel equalizer using the Genetic Algorithm (GA), an essentially derivative free optimization technique, which is suitably used to update the weights of the equalizer. The performance of the proposed equalizer is evaluated and compared with its LMS based counterpart. However, being a population based algorithm, the standard Genetic Algorithm (SGA) suffers from a slower convergence rate.
4.4.1 FORMULATION OF CHANNEL EQUALIZATION PROCESS AS AN OPTIMIZATION PROBLEM:
An adaptive channel equalizer is basically an adaptive tap-delay digital filter whose order is higher than that of the channel filter. A typical diagram of a channel equalizer is shown in Fig.4.7. At any k-th instant the equalizer output is given by

Y(k) = Σ_{n=0}^{N−1} x(n + k) · h_n(k)        (4.9)
where N is the order of the equalizer filter. The desired signal d(k) at the k-th instant is formed by delaying the input sequence x(n + k) by m samples. In practice m is usually taken as N/2 or (N+1)/2 depending upon whether N is even or odd; that is, d(k) = x(n + k − m).
At the beginning of training the initial weights h_n(0), n = 0, 1, …, N−1, are taken as random values within a certain bound. Subsequently these weights are updated by GA based adaptive rules. The error signal e(k) at the k-th instant is given by

e(k) = d(k) − Y(k)        (4.10)

In LMS type learning algorithms, e²(k) rather than e(k) is taken as the cost function for deriving the steepest descent algorithm, because e²(k) is always positive and represents the instantaneous power of the difference signal.
In adaptive equalizers the parameters which can vary during training are the filter weights. The objective of an adaptive algorithm is to change the filter weights iteratively so that e²(k) is minimized and subsequently reduced toward zero. In developing the GA based algorithm, a set of chromosomes within a bound is selected, each representing the weight vector of the equalizer.
Fig.4.7 An 8-tap adaptive digital channel equalizer.
[Fig.4.7: random binary input x(k) drives a tap-delay line x(7+k)…x(k) weighted by h₀(k)…h₇(k) and summed to give y(k); the input delayed by z^-m forms d(k), and e(k) = d(k) − y(k) drives the adaptive algorithm.]
Then the GA starts from the initial random strings and proceeds repeatedly from generation to generation through the three genetic operators. The selection procedure reproduces highly fitted individuals which provide minimum mean square error (MMSE) at the equalizer output. A flow chart of the GA based adaptive algorithm for channel equalization is shown in Fig.4.8.
Fig.4.8 Flow chart of the GA based adaptive algorithm for the channel equalizer.
[Flow chart: initialize the population (random sets of equalizer filter weights) → evaluate the fitness of the whole population (MSE of the equalizer) → apply selection (sort by decreasing MSE) → create a new generation through crossover and mutation → terminate and stop when the MMSE is reached.]
4.4.2. STEPWISE REPRESENTATION OF GA BASED CHANNEL EQUALIZATION ALGORITHM:
The updating of the weights of the GA based equalizer is carried out using the GA rule as outlined in the following steps:
1. As shown in Fig.4.7, a GA based adaptive equalizer is connected in series with the channel.
2. The structure of the equalizer is an FIR system whose coefficients are initially chosen from a population of M chromosomes. Each chromosome constitutes NL random binary bits, where each sequential group of L bits represents one coefficient of the adaptive model and N is the number of parameters of the model.
3. Generate K (≥1000) input signal samples which are random binary in nature.
4. Each input sample is passed through the channel and then contaminated with additive noise of known strength. The resultant signal is passed through the equalizer. In this way K channel outputs are produced by feeding all K input samples.
5. Each input signal is delayed by m samples and acts as the desired signal.
6. Each desired output is compared with the corresponding equalizer output and K errors are produced. The mean square error (MSE) for a given group of parameters (corresponding to the m-th chromosome) is determined using the relation

MSE(m) = (1/K) Σ_{i=1}^{K} e_i^2

This is repeated M times.
7. Since the objective is to minimize MSE(m), m = 1 to M, GA based optimization is used.
8. The crossover, mutation and selection operators are carried out sequentially.
9. In each generation the minimum MSE, MMSE (expressed in dB), is stored, which shows the learning behavior of the adaptive model from generation to generation.
10. When the MMSE reaches a pre-specified level, the optimization is stopped.
11. At this step all the chromosomes attain almost identical genes, which represent the desired filter coefficients of the equalizer. For reference, a sketch of the LMS counterpart used in the comparisons follows.
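The following is a minimal sketch of the LMS baseline against which the GA equalizer is compared, assuming the 8-tap structure of Fig.4.7, channel CH1 and delay m = N/2; the step size and random seed are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(2)
ch = np.array([0.2090, 0.9950, 0.2090])   # linear channel CH1
N, m, mu, K = 8, 4, 0.02, 5000            # taps, delay, step size, samples
s = rng.choice([-1.0, 1.0], K)            # random binary input sequence
r = np.convolve(s, ch)[:K]                # channel output
r += np.sqrt(10 ** (-30 / 10)) * rng.standard_normal(K)  # -30 dB additive noise

h = np.zeros(N)                           # equalizer weights h_n(0)
for k in range(N - 1, K):
    xk = r[k - N + 1:k + 1][::-1]         # tap-delay input vector
    y = h @ xk                            # equalizer output Y(k), as in (4.9)
    e = s[k - m] - y                      # e(k) = d(k) - Y(k), with d(k) = s(k - m)
    h += mu * e * xk                      # LMS weight update
print("trained equalizer weights:", np.round(h, 3))
```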
4.5. SIMULATIONS:
In this section we carry out the simulation study of the new channel equalizer. The block diagram of Fig.4.7 is simulated, where the equalizer coefficients are adapted using LMS and GA. The algorithm proposed in Section-4.4 is used in the simulation for GA. Four different channels (two linear and two nonlinear) with additive noise strengths of -30 dB and -20 dB are used for the simulation.
The following channel models are used:
a. Linear channel coefficients
(i) CH1: [0.2090, 0.9950, 0.2090]
(ii) CH2: [0.3040, 0.9030, 0.3040]
b. Nonlinear channels
(i) NCH1: b(k) = a(k) + 0.2 a^2(k) − 0.1 a^3(k)
(ii) NCH2: b(k) = a(k) + 0.2 a^2(k) − 0.1 a^3(k) + 0.5 cos(π a(k))
where a(k) is the output of the linear channel and b(k) is the output of the nonlinear channel.
The desired signal is generated by delaying the input binary sequence by m samples, where m = N/2 or (N+1)/2 depending upon whether N is even or odd. In the simulation study N = 8 has been taken. The convergence characteristics and bit error rate (BER) plots obtained from simulation for the different channels under different noise conditions using LMS and GA are shown in the following figures.
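For reference, the channel models above translate directly into the following helper functions; only the formulas come from the thesis, while the function names are ours.

```python
import numpy as np

def linear_channel(s, coeffs):
    """a(k): FIR channel output for the input symbol sequence s."""
    return np.convolve(s, coeffs)[:len(s)]

def nch1(a):
    """b(k) = a(k) + 0.2 a^2(k) - 0.1 a^3(k)."""
    return a + 0.2 * a**2 - 0.1 * a**3

def nch2(a):
    """b(k) = a(k) + 0.2 a^2(k) - 0.1 a^3(k) + 0.5 cos(pi a(k))."""
    return a + 0.2 * a**2 - 0.1 * a**3 + 0.5 * np.cos(np.pi * a)

CH1 = np.array([0.2090, 0.9950, 0.2090])
CH2 = np.array([0.3040, 0.9030, 0.3040])
```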
[Plot: MSE in dB vs. number of iterations; CH:[0.2090,0.9950,0.2090], NL=0.]
Fig.4.9 Convergence characteristic of linear channel CH1 at -30 dB using LMS
[Plot: MSE in dB vs. number of iterations; CH:[0.3040,0.9030,0.3040], NL=0.]
Fig.4.10 Convergence characteristic of linear channel CH2 at -30 dB using LMS
[Plot: MSE in dB vs. number of generations; CH:[0.2090,0.9950,0.2090], NL=0; NSR = -20 dB and -30 dB.]
Fig.4.11 Convergence characteristic of linear channel CH1 using GA
[Plot: MSE in dB vs. number of generations; CH:[0.3040,0.9030,0.3040], NL=0; NSR = -20 dB and -30 dB.]
Fig.4.12 Convergence characteristic of linear channel CH2 using GA
[Plot: probability of error vs. SNR in dB; CH:[0.2090,0.9950,0.2090], NL=0; curves: GA, LMS.]
Fig.4.13 Comparison of BER of linear channel CH1 between LMS and GA based equalizers at -30 dB noise
[Plot: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL=0; curves: LMS, GA.]
Fig.4.14 Comparison of BER of linear channel CH2 between LMS and GA based equalizers at -30 dB noise
[Plot: MSE in dB vs. number of iterations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3.]
Fig.4.15 Convergence characteristic of nonlinear channel NCH1 for CH1 using LMS
[Plot: MSE in dB vs. number of iterations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3.]
Fig.4.16 Convergence characteristic of nonlinear channel NCH1 for CH2 using LMS
[Plot: MSE in dB vs. number of generations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3; NSR = -20 dB and -30 dB.]
Fig.4.17 Convergence characteristic of nonlinear channel NCH1 for CH1 using GA
[Plot: MSE in dB vs. number of generations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3; NSR = -20 dB and -30 dB.]
Fig.4.18 Convergence characteristic of nonlinear channel NCH1 for CH2 using GA
[Plot: probability of error vs. SNR in dB; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3; curves: LMS, GA.]
Fig.4.19 Comparison of BER of nonlinear channel NCH1 for CH1 between LMS and GA based equalizers at -30 dB noise
[Plot: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3; curves: LMS, GA.]
Fig.4.20 Comparison of BER of nonlinear channel NCH1 for CH2 between LMS and GA based equalizers at -30 dB noise
[Plot: MSE in dB vs. number of iterations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3 + 0.5cos(πy).]
Fig.4.21 Convergence characteristic of nonlinear channel NCH2 for CH1 using LMS
[Plot: MSE in dB vs. number of iterations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3 + 0.5cos(πy).]
Fig.4.22 Convergence characteristic of nonlinear channel NCH2 for CH2 using LMS
[Plot: MSE in dB vs. number of generations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3 + 0.5cos(πy); NSR = -30 dB and -20 dB.]
Fig.4.23 Convergence characteristic of nonlinear channel NCH2 for CH1 using GA
[Plot: MSE in dB vs. number of generations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3 + 0.5cos(πy); NSR = -30 dB and -20 dB.]
Fig.4.24 Convergence characteristic of nonlinear channel NCH2 for CH2 using GA
[Plot: probability of error vs. SNR in dB; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3 + 0.5cos(πy); curves: LMS, GA.]
Fig.4.25 Comparison of BER of nonlinear channel NCH2 for CH1 between LMS and GA based equalizers at -30 dB noise
[Plot: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3 + 0.5cos(πy); curves: GA, LMS.]
Fig.4.26 Comparison of BER of nonlinear channel NCH2 for CH2 between LMS and GA based equalizers at -30 dB noise
4.6 RESULTS AND DISCUSSIONS:
The convergence characteristics obtained from simulation are shown in Fig.4.9 and 4.11 using LMS and GA respectively for linear channel a(i), and in Fig.4.10 and 4.12 using LMS and GA respectively for linear channel a(ii). Similarly, the bit error rate (BER) plot for channel a(i) is shown in Fig.4.13 and that for channel a(ii) in Fig.4.14.
The convergence characteristics for channels a(i and ii) with nonlinearity b(i) are shown in Fig.4.15 and 4.16 using LMS and in Fig.4.17 and 4.18 using GA. The corresponding BER plots are shown in Fig.4.19 and 4.20.
The convergence characteristics for channels a(i and ii) with nonlinearity b(ii) are shown in Fig.4.21 and 4.22 using LMS and in Fig.4.23 and 4.24 using GA. The corresponding BER plots are shown in Fig.4.25 and 4.26.
It is observed from the convergence characteristics and BER plots that the GA based equalizers outperform their LMS counterparts. This is true for both linear and nonlinear channels. Under high noise conditions the results of the GA based equalizers are distinctly better.
CHAPTER-5
ADAPTIVE SYSTEM IDENTIFICATION
USING DIFFERENTIAL EVOLUTION
5.1. INTRODUCTION:
The identification of linear and nonlinear systems was performed in Chapter-3 using the LMS and GA techniques. From that chapter we conclude that linear systems are identified well by the LMS technique and nonlinear systems by the GA technique. However, GA has a longer convergence time and higher computational complexity, and requires binary encoding. To improve the identification performance for nonlinear systems, various techniques such as ANN, FLANN and RBF are used.
In this chapter we propose a novel identification model based on the DE technique. DE is an efficient and powerful population based stochastic search technique for solving optimization problems over continuous spaces, and it has been widely applied in many scientific and engineering fields. However, the success of DE in solving a specific problem crucially depends on appropriately choosing the trial vector generation strategy and the associated control parameter values.
5.2. DE BASED OPTIMIZATION
The DE is based on the mechanics of natural selection and the evolutionary behavior of biological systems. It has been successfully applied to diverse fields such as mechanical engineering, communication and pattern recognition. In DE there exist many trial vector generation strategies, of which only a few may be suitable for solving a particular problem. The three crucial control parameters involved in DE, the population size (NP), the scaling factor (F) and the crossover rate (CR), may significantly influence the optimization performance of DE. Fig.5.1 shows the basic operation of DE.
Fig.5.1 Block Diagram of Differential Evolution Algorithm cycle.
5.2.1 OPERATORS OF DE:
(a) Population:
Parameter vectors in a population are randomly initialized and evaluated using the fitness function. The initial population consists of NP D-dimensional parameter vectors, the so-called individuals,

X_{i,G} = {x^1_{i,G}, x^2_{i,G}, …, x^D_{i,G}},   i = 1, 2, …, NP

where NP is the number of population vectors, D is the dimension and G is the generation.

(b) Mutation:
The mutation operation produces a mutant or noisy vector V_{i,G} with respect to each individual X_{i,G}, the so-called target vector, in the current population. For each target vector X_{i,G} at generation G, the associated mutant vector V_{i,G} = {v^1_{i,G}, v^2_{i,G}, …, v^D_{i,G}} can be generated via a mutation strategy such as

V_{i,G} = X_{r1,G} + F · (X_{r2,G} − X_{r3,G})        (5.1)

where r1, r2 and r3 are mutually exclusive integers randomly generated within the range [1, NP], which are also different from the index i, and F is a positive control parameter for scaling the difference vector, called the scaling factor.
(c) Crossover operation:
After mutation, a crossover operation is applied to each pair of target vector X_{i,G} and its corresponding mutant vector V_{i,G} to generate a trial vector

U_{i,G} = (u^1_{i,G}, u^2_{i,G}, …, u^D_{i,G})        (5.2)

In the basic version, DE employs the binomial (uniform) crossover defined as

u^j_{i,G} = v^j_{i,G}  if (rand[0,1) ≤ CR or j = j_rand);  u^j_{i,G} = x^j_{i,G} otherwise        (5.3)

where j = 1, 2, …, D. The crossover rate (CR) is a user-specified constant within the range [0, 1) which controls the fraction of parameter values copied from the mutant vector, and j_rand is a randomly chosen integer in the range [1, D].

(d) Selection operation:
In the selection operation the objective function value of each trial vector, f(U_{i,G}), is compared with that of its corresponding target vector, f(X_{i,G}), in the current population. If the trial vector has an objective function value less than or equal to that of the target vector, the trial vector will
replace the target vector and enter the population of the next generation. Otherwise, the target
vector will remain in the population for the next generation. The selection operation can be
expressed as follows:
X_{i,G+1} = U_{i,G}  if f(U_{i,G}) ≤ f(X_{i,G});  X_{i,G+1} = X_{i,G} otherwise        (5.4)
Steps (b), (c) and (d) are repeated generation after generation until a specified termination criterion is satisfied.
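Putting operations (a)–(d) together, a bare-bones DE/rand/1/bin loop looks like the sketch below. Here the fitness is taken as the identification MSE used later in Section 5.3, and the values of NP, F and CR are illustrative choices of ours.

```python
import numpy as np

rng = np.random.default_rng(3)
a = np.array([0.2090, 0.9950, 0.2090])             # plant to identify
K = 500
x = rng.uniform(-2*np.sqrt(3), 2*np.sqrt(3), K)    # unit-variance uniform input
d = np.convolve(x, a)[:K] + 0.03 * rng.standard_normal(K)  # noisy desired signal

def f(w):                                          # fitness: model MSE
    return np.mean((d - np.convolve(x, w)[:K]) ** 2)

NP, D, F, CR = 20, 3, 0.5, 0.9
X = rng.uniform(-2, 2, (NP, D))                    # (a) initial population
cost = np.array([f(xi) for xi in X])
for G in range(200):
    for i in range(NP):
        r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
        V = X[r1] + F * (X[r2] - X[r3])            # (b) mutation, eq. (5.1)
        jrand = rng.integers(D)
        mask = rng.random(D) < CR; mask[jrand] = True
        U = np.where(mask, V, X[i])                # (c) binomial crossover, eq. (5.3)
        fu = f(U)
        if fu <= cost[i]:                          # (d) selection, eq. (5.4)
            X[i], cost[i] = U, fu
print("best parameters:", np.round(X[np.argmin(cost)], 4))
```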
5.3. STEPWISE REPRESENTATION OF DE BASED ADAPTIVE SYSTEM IDENTIFICATION ALGORITHM:
i. As shown in Fig.3.2, the unknown static/dynamic system to be identified is connected in parallel with an adaptive model to be developed using DE.
ii. The coefficients (â) of the model are initially chosen from a population of NP target vectors. Each target vector constitutes D random numbers, each representing one coefficient of the adaptive model, where D is the number of parameters of the model.
iii. Generate K (=500) input signal samples, each having zero mean, uniformly distributed between -2√3 and +2√3, and having unit variance.
iv. Each input sample is passed through the plant P(z) and then contaminated with additive noise of known strength. The resultant signal acts as the desired signal; in this way K desired signals are produced by feeding all K input samples.
v. Each input sample is also passed through the model using each target vector as the model parameters, and NP sets of K estimated outputs are obtained.
vi. Each desired output is compared with the corresponding estimated output and K errors are produced. The mean square error (MSE) for a set of parameters (corresponding to the n-th target vector) is determined using the relation

MSE(n) = (1/K) Σ_{i=1}^{K} e_i^2        (5.5)

This is repeated NP times.
vii. Since the objective is to minimize MSE(n), n = 1 to NP, DE based optimization is used.
viii. The mutation, crossover and selection operations are carried out sequentially following the steps given in Section-5.2.
ix. In each generation the minimum MSE (MMSE) is obtained and plotted against the generation number to show the learning characteristics.
x. The learning process is stopped when the MMSE reaches its minimum level.
xi. At this step all the individuals attain almost identical parameters, which represent the estimated parameters of the developed model.
5.4. SIMULATION STUDIES:
To demonstrate the performance of the proposed DE based approach, numerous simulation studies are carried out on several linear and nonlinear systems. The performance of the proposed structure is compared with the corresponding LMS and GA structures. The block diagram shown in Fig.3.2 is used for the simulation study.
Case-1 (Linear System)
A unit-variance, uniformly distributed random signal lying in the range -2√3 to +2√3 is applied to the known system having transfer function
Experiment-1: H(z) = 0.2090 + 0.9950 z^-1 + 0.2090 z^-2, and
Experiment-2: H(z) = 0.2600 + 0.9300 z^-1 + 0.2600 z^-2
The output of the system is contaminated with white Gaussian noise of different strengths, -20 dB and -30 dB. The resultant signal y is used as the desired or training signal. The same random input is also applied to the DE based adaptive model, which has the same linear combiner structure as H(z) but random initial weights. By adjusting the scaling factor (F) and the crossover rate (CR), it is seen that for the linear system the actual and estimated parameters are the same.
Case-2 (Non-Linear System)
In this simulation the actual system is assumed to be nonlinear in nature. Computer simulation results of two different nonlinear systems are presented. In this case the actual system is
Experiment-3: y_n(k) = tanh{y(k)}
Experiment-4: y_n(k) = y(k) + 0.2 y^2(k) − 0.1 y^3(k)
where y(k) is the output of the linear system and y_n(k) is the output of the nonlinear system. In the case of a nonlinear system the parameters of the two systems do not match; however, the responses of the actual system and the adaptive model match. To demonstrate this observation, training is carried out using the DE based algorithm.

[Plot: MSE in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = tanh(y); NSR = -30 dB and -20 dB.]
Fig.5.2 Learning characteristics of DE based nonlinear system identification at -20 dB and -30 dB NSR (Experiment-3)
[Plot: output vs. sample index; CH:[0.2090,0.9950,0.2090]; curves: Actual, DE, LMS.]
Fig.5.3 Comparison of output responses (Experiment-3) at -30 dB NSR
[Plot: MSE in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); NSR = -30 dB and -20 dB.]
Fig.5.4 Learning characteristics of DE based nonlinear system identification at -20 dB and -30 dB NSR (Experiment-3)
[Plot: output vs. sample index; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); curves: Actual, DE, LMS.]
Fig.5.5 Comparison of output responses (Experiment-3) at -30 dB NSR
5.5. RESULTS AND DISCUSSIONS:
Fig.5.2 and 5.4 show the learning characteristics for the nonlinear system of Experiment-3 following Experiment-1 and Experiment-2 respectively.
Fig.5.3 and 5.5 show the corresponding output responses.
The output response of the nonlinear system (Experiment-3) using the DE based model is better than the LMS and GA based ones, because the DE response is closer to the desired response, as seen by comparing Fig.3.11 with Fig.5.3 and Fig.3.17 with Fig.5.5.

CHAPTER-6
ADAPTIVE CHANNEL EQUALIZATION
USING DIFFERENTIAL EVOLUTION
6.1. INTRODUCTION:
The equalization of linear and nonlinear channels was performed in Chapter-4 using the LMS and GA techniques. From that chapter we conclude that linear channels are equalized well by the LMS technique and nonlinear channels by the GA technique. However, GA has a longer convergence time and higher computational complexity, and requires binary encoding. To improve the equalization performance for nonlinear channels, various techniques such as ANN, FLANN, RBF, etc. are used.
In this chapter we propose a novel equalizer model based on the DE technique. DE is an efficient and powerful population based stochastic search technique for solving optimization problems over continuous spaces, and it has been widely applied in many scientific and engineering fields. However, the success of DE in solving a specific problem crucially depends on appropriately choosing the trial vector generation strategy and the associated control parameter values.
6.2. STEPWISE PRESENTATION OF DE BASED CHANNEL EQUALIZATION ALGORITHM:
The updating of the weights of the DE based equalizer is carried out using the DE rule as outlined in the following steps:
i. As shown in Fig.4.7, a DE based adaptive equalizer is connected in series with the channel.
ii. The structure of the equalizer is an FIR system whose coefficients are initially chosen from a population of NP target vectors. Each target vector constitutes D random numbers, each representing one coefficient of the adaptive model, where D is the number of parameters of the model.
iii. Generate K (≥1000) input signal samples which are random binary in nature.
iv. Each input sample is passed through the channel and then contaminated with additive noise of known strength. The resultant signal is passed through the equalizer. In this way K channel outputs are produced by feeding all K input samples.
v. Each input signal is delayed by m samples and acts as the desired signal.
vi. Each desired output is compared with the corresponding equalizer output and K errors are produced. The mean square error (MSE) for a given group of parameters (corresponding to the n-th target vector) is determined using the relation

MSE(n) = (1/K) Σ_{i=1}^{K} e_i^2

This is repeated NP times.
vii. Since the objective is to minimize MSE(n), n = 1 to NP, DE based optimization is used.
viii. The mutation, crossover and selection operations are carried out sequentially following the steps given in Section-5.2.
ix. In each generation the minimum MSE, MMSE (expressed in dB), is stored, which shows the learning behavior of the adaptive model from generation to generation.
x. When the MMSE reaches a pre-specified level, the optimization is stopped.
xi. At this step all the target vectors attain almost identical parameters, which represent the desired filter coefficients of the equalizer. A sketch of the fitness evaluation used in these steps follows.
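The fitness of step vi and the BER used in the plots below can be sketched as follows for NCH1 built on CH2. The function names equalizer_mse() and ber() are ours; the only subtlety is aligning the delayed desired signal with the equalizer output.

```python
import numpy as np

rng = np.random.default_rng(4)
ch = np.array([0.3040, 0.9030, 0.3040])     # linear part: CH2
K, N, m = 1000, 8, 4                        # samples, equalizer taps, delay m = N/2
s = rng.choice([-1.0, 1.0], K)              # random binary input
a = np.convolve(s, ch)[:K]                  # linear channel output a(k)
b = a + 0.2 * a**2 - 0.1 * a**3             # NCH1 nonlinearity b(k)
r = b + np.sqrt(10 ** (-30 / 10)) * rng.standard_normal(K)  # -30 dB noise

def equalizer_mse(h):
    """Fitness of step vi: MSE between the delayed input and the equalizer output."""
    y = np.convolve(r, h)[:K]
    return np.mean((s[:K - m] - y[m:]) ** 2)

def ber(h):
    """Bit error rate of the hard decisions, as used for the BER plots."""
    y = np.sign(np.convolve(r, h)[:K])
    return np.mean(y[m:] != s[:K - m])

h0 = np.zeros(N); h0[0] = 1.0               # trivial pass-through equalizer
print("MSE:", equalizer_mse(h0), " BER:", ber(h0))
```

Passing equalizer_mse() to the DE loop of Chapter-5 in place of f() gives the DE based equalizer.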
6.3. SIMULATIONS:
In this section we carry out the simulation study of the new channel equalizer. The block diagram of Fig.4.7 is simulated, where the equalizer coefficients are adapted using LMS, GA and DE. The algorithm proposed in Section-6.2 is used in the simulation for DE. Four different channels (two linear and two nonlinear) with additive noise strengths of -30 dB and -20 dB are used for the simulation.
The following channel models are used:
a. Linear channel coefficients
(i) CH1: [0.2090, 0.9950, 0.2090]
(ii) CH2: [0.3040, 0.9030, 0.3040]
b. Nonlinear channels
(i) NCH1: b(k) = a(k) + 0.2 a^2(k) − 0.1 a^3(k)
(ii) NCH2: b(k) = a(k) + 0.2 a^2(k) − 0.1 a^3(k) + 0.5 cos(π a(k))
where a(k) is the output of the linear channel and b(k) is the output of the nonlinear channel.
The desired signal is generated by delaying the input binary sequence by m samples, where m = N/2 or (N+1)/2 depending upon whether N is even or odd. In the simulation study N = 8 has been taken. The convergence characteristics and bit error rate (BER) plots obtained from simulation for the different channels under different noise conditions using LMS, GA and DE are shown in the following figures.
[Plot: MSE in dB vs. generation; CH:[0.3040,0.9030,0.3040], NL=0.]
Fig.6.1 Convergence characteristic of linear channel CH2 at -30 dB NSR
[Plot: MSE in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y^2 − 0.1y^3.]
Fig.6.2 Convergence characteristic of nonlinear channel NCH1 using CH1 at -30 dB NSR
[Plot: MSE in dB vs. generation; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3; NSR = -20 dB and -30 dB.]
Fig.6.3 Convergence characteristic of nonlinear channel NCH1 using CH2 at -30 dB and -20 dB NSR
[Plot: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL=0; curves: DE, LMS, GA.]
Fig.6.4 Comparison of BER plots of linear channel CH2 between LMS, GA and DE at -30 dB NSR
[Plot: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040]; curves: LMS, GA, DE.]
Fig.6.5 Comparison of BER plots of nonlinear channel NCH2 using CH2 between LMS, GA and DE at -30 dB NSR
[Plot: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3; curves: DE, LMS, GA.]
Fig.6.6 Comparison of BER plots of nonlinear channel NCH1 using CH2 between LMS, GA and DE at -30 dB NSR
[Plot: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y^2 − 0.1y^3; curves: DE, GA, LMS.]
Fig.6.7 Comparison of BER plots of nonlinear channel NCH1 using CH2 between LMS, GA and DE at -20 dB NSR
6.4 RESULTS AND DISCUSSIONS:
Fig.6.1 shows the convergence characteristic of linear channel CH2 at -30 dB NSR. Fig.6.2 and 6.3 show the convergence characteristics of nonlinear channel NCH1 using CH1 at -30 dB and using CH2 at -30 dB and -20 dB respectively. Fig.6.4 and 6.5 show the comparison of BER plots between LMS, GA and DE for the linear channel CH2 and the nonlinear channel NCH2 using CH2 at -30 dB NSR respectively. Fig.6.6 and 6.7 show the comparison of BER plots between LMS, GA and DE for nonlinear channel NCH1 using CH2 at -30 dB and -20 dB respectively. Using the same channels and the same noise conditions, the corresponding results are obtained for the LMS and GA based equalizers and are used for comparison.
It is observed from the plots of Fig.6.4, 6.5, 6.6 and 6.7 that the DE based equalizers outperform the corresponding LMS and GA counterparts. This is true for both linear and nonlinear channels.
CHAPTER-7
CONCLUSIONS, REFERENCES AND SCOPE FOR FUTURE WORK
7.1 CONCLUSIONS
Chapter-3 presents a novel GA based adaptive model identification of dynamic nonlinear systems. The problem of nonlinear system identification has been formulated as an MSE minimization problem. The GA is then successfully used in an iterative manner to optimize the coefficients of linear and nonlinear adaptive models. It is demonstrated through simulations that the proposed approach exhibits superior performance compared to its LMS counterpart in identifying both linear and nonlinear systems under various additive Gaussian noise conditions. Thus GA is a useful alternative to the LMS algorithm for nonlinear system identification.
Chapter-4 proposes a novel adaptive digital channel equalizer using GA based optimization. Through computer simulation it is shown that the GA based equalizer yields superior performance compared to its LMS counterpart. This observation is true for both linear and nonlinear channels.
The chapter-5 presents a novel DE based adaptive model identification of linear and dynamic
non linear systems. The problem of non linear system identification has been formulated as an
MSE minimization problem. The DE is then successfully used in an iterative manner to
optimize the coefficients of linear and non linear adaptive models. It is demonstrated through
simulations that the proposed approach exbits superior performance compared to its LMS and
GA counterpart in identifying both linear and non linear systems under various additive
Gaussian noise conditions. Thus DE is an useful alternative to the LMS or GA algorithm for
non linear system identification.
Chapter-6 proposes a novel adaptive digital channel equalizer using DE based optimization. Through simulation it is shown that the DE based equalizers yield superior performance compared to their LMS and GA counterparts. This observation is true for both linear and nonlinear channels.
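Taken together, the sketches above suggest one plausible pipeline for such a DE based equalizer: evolve a population of equalizer weight vectors with the de_step routine using the training MSE as fitness, and then assess the best vector with the BER sweep sketched in Section 6.4. This coupling is an assumption made for illustration and not a description of the software used in the thesis.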
7.2 REFERENCES:
[1] K. S. Narendra and K. Parthasarathy, “Identification and control of dynamical systems using neural networks”, IEEE Trans. on Neural Networks, vol. 1, pp. 4-26, January 1990.
[2] J. C. Patra, A. C. Kot and G. Panda, “An intelligent pressure sensor using neural networks”, IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
[3] M. Pachter and O. R. Reynolds, “Identification of a discrete time dynamical system”, IEEE Trans. on Aerospace and Electronic Systems, vol. 36, issue 1, pp. 212-225, 2000.
[4] G. B. Giannakis and E. Serpedin, “A bibliography on nonlinear system identification”, Signal Processing, vol. 83, no. 3, pp. 533-580, 2001.
[5] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[6] D. P. Das and G. Panda, “Active mitigation of nonlinear noise processes using a novel filtered-s LMS algorithm”, IEEE Trans. on Speech and Audio Processing, vol. 12, issue 3, pp. 313-322, May 2004.
[7] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1985.
[8] G. J. Gibson, S. Siu and C. F. N. Cowan, “The application of nonlinear structures to the reconstruction of binary signals”, IEEE Trans. on Signal Processing, vol. 39, no. 8, pp. 1877-1884, Aug. 1991.
[9] R. W. Lucky, “Techniques for adaptive equalization of digital communication systems”, Bell Syst. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[10] H. Sun, G. Mathew and B. Farhang-Boroujeny, “Detection techniques for high density magnetic recording”, IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199, March 2005.
[11] L. J. Griffiths, F. R. Smolka and L. D. Trembly, “Adaptive deconvolution: a new technique for processing time-varying seismic data”, Geophysics, June 1977.
[12] B. Widrow, J. M. McCool, M. G. Larimore and C. R. Johnson, Jr., “Stationary and nonstationary learning characteristics of the LMS adaptive filter”, Proc. IEEE, vol. 64, no. 8, pp. 1151-1162, Aug. 1976.
[13] B. Friedlander and M. Morf, “Least-squares algorithms for adaptive linear phase filtering”, IEEE Trans. on Acoustics, Speech and Signal Processing, vol. ASSP-30, no. 3, pp. 381-390, June 1982.
[14] S. A. White, “An adaptive recursive digital filter”, Proc. 9th Asilomar Conf. on Circuits, Systems and Computers, p. 21, Nov. 1975.
[15] J. J. Shynk, “Adaptive IIR filtering”, IEEE ASSP Magazine, pp. 4-21, April 1989.
[16] A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing, Springer, 2003, ISBN 3-540-40184-9.
[17] A. Engelbrecht, Computational Intelligence: An Introduction, John Wiley & Sons, ISBN 0-470-84870-7.
[18] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
[19] A. K. Qin, V. L. Huang and P. N. Suganthan, “Differential evolution algorithm with strategy adaptation for global numerical optimization”, IEEE Trans. on Evolutionary Computation, vol. 13, no. 2, April 2009.
[20] A. Konar, Computational Intelligence: Principles, Techniques and Applications, Springer, Berlin Heidelberg New York, 2005.
[21] J. H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, 1975.
[22] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[23] J. Kennedy, R. Eberhart and Y. Shi, Swarm Intelligence, Morgan Kaufmann, Los Altos, CA, 2001.
[24] J. Kennedy and R. Eberhart, “Particle swarm optimization”, Proc. of IEEE International Conference on Neural Networks, pp. 1942-1948, 1995.
[25] R. Storn and K. Price, “Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces”, Journal of Global Optimization, vol. 11, no. 4, pp. 341-359, 1997.
[26] G. Venter and J. Sobieszczanski-Sobieski, “Particle swarm optimization”, AIAA Journal, vol. 41, no. 8, pp. 1583-1589, 2003.
[27] X. Yao, Y. Liu and G. Lin, “Evolutionary programming made faster”, IEEE Trans. on Evolutionary Computation, vol. 3, no. 2, pp. 82-102, 1999.
[28] Y. Shi and R. C. Eberhart, “Parameter selection in particle swarm optimization”, Evolutionary Programming VII, Lecture Notes in Computer Science 1447, Springer, pp. 591-600, 1998.
[29] S. Das, A. Konar and U. K. Chakraborty, “Particle swarm optimization with a differentially perturbed velocity”, ACM-SIGEVO Proc. of GECCO '05, Washington D.C., pp. 991-998, 2005.
[30] F. van den Bergh, “Particle swarm weight initialization in multi-layer perceptron artificial neural networks”, Development and Practice of Artificial Intelligence Techniques, Durban, South Africa, pp. 41-45, 1999.
[31] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Chapter 6, pp. 99-166, Second Edition, Pearson.
[32] S. Chen, S. A. Billings and P. M. Grant, “Nonlinear system identification using neural networks”, Int. J. of Control, vol. 51, no. 6, pp. 1191-1214, June 1990.
[33] J. C. Patra, R. N. Pal, B. N. Chatterji and G. Panda, “Identification of nonlinear dynamic systems using functional link artificial neural network”, IEEE Trans. on Systems, Man and Cybernetics - Part B: Cybernetics, vol. 29, no. 2, pp. 254-262, April 1999.
[34] S. V. T. Elanayar and Y. C. Shin, “Radial basis function neural network for approximation and estimation of nonlinear stochastic dynamic systems”, IEEE Trans. on Neural Networks, vol. 5, pp. 594-603, July 1994.
[35] C. A. Belfiore and J. H. Park, Jr., “Decision feedback equalization”, Proc. IEEE, vol. 67, pp. 1143-1156, Aug. 1979.
[36] S. Siu, “Non-linear adaptive equalization based on multi-layer perceptron architecture”, Ph.D. dissertation, University of Edinburgh, 1990.
[37] O. Macchi, Adaptive Processing: The Least Mean Squares Approach with Applications in Transmission, John Wiley and Sons, West Sussex, England, 1995.
[38] R. W. Lucky, “Techniques for adaptive equalization of digital communication systems”, Bell Syst. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[39] S. K. Nair and J. Moon, “A theoretical study of linear and nonlinear equalization in nonlinear magnetic storage channels”, IEEE Trans. on Neural Networks, vol. 8, no. 5, pp. 1106-1118, Sept. 1997.
[40] J. C. Patra, A. C. Kot and G. Panda, “An intelligent pressure sensor using neural networks”, IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
[41] B. Widrow and E. Walach, Adaptive Inverse Control, Prentice-Hall, Upper Saddle River, NJ, 1996.
[42] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[43] S. U. H. Qureshi, “Adaptive equalization”, Proc. IEEE, vol. 73, no. 9, pp. 1349-1387, Sept. 1985.
[44] H. Sun, G. Mathew and B. Farhang-Boroujeny, “Detection techniques for high density magnetic recording”, IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199, March 2005.
[45] J. C. Patra, W. B. Poh, N. S. Chaudhari and A. Das, “Nonlinear channel equalization with QAM signal using Chebyshev artificial neural network”, Proc. of International Joint Conference on Neural Networks, Montreal, Canada, pp. 3214-3219, August 2005.
[46] G. Panda, B. Majhi, D. Mohanty, A. Choubey and S. Mishra, “Development of Novel Digital Channel Equalizers using Genetic Algorithms”, Proc. of National Conference on Communication (NCC-2006), IIT Delhi, pp. 117-121, 27-29 January 2006.
7.3 FUTURE WORK:
The present work may be extended to adaptive system identification and channel equalization using IIR structures, together with a comparison of GA and DE based training for such models.
CHAPTER. 2. GA and DE algorithms are outlined in sequel. differential evolution (DE). The derivative based algorithms include least means square (LMS). [5] . Computational complexity involved and minimum mean square error achieved after training. training time. The algorithms for adjusting the coefficients of FIR filters are simpler in general than those for adjusting the coefficients of IIR filters. we only consider an adaptive FIR filter structure. The learning algorithms may be broadly classified into two categories (a) derivative based (b) derivative free.2 GENETIC ALGORITHM AND DIFFERENTIAL EVOLUTION 2. genetic algorithm (GA). bacterial foraging optimization (BFO) and artificial immune system (AIS) have been employed. back propagation (BP) and FLANN-LMS. In this section. we describe the general form of many adaptive FIR filtering algorithms and present a simple derivation of the LMS adaptive algorithm.2 GRADIENT BASED ADAPTIVE ALGORITHIM: An adaptive algorithm is a procedure for adjusting the parameters of an adaptive filter to minimize a cost function chosen for the task at hand. and 2. Under the derivative free algorithms. In our discussion. IIR LMS (ILMS). particle swarm optimization (PSO). The performance of these models depends on rate of convergence. Such systems are currently more popular than adaptive IIR filters because 1. The input-output stability of the FIR filter structure is guaranteed for any set of fixed coefficients.1 INTRODUCTION: There are many learning algorithms which are employed to train various adaptive models. In this section the details of LMS.

2. We now consider one particular cost function that yields a popular adaptive algorithm. and step size.2. input signal vector.2 THE MEAN-SQUARED ERROR COST FUNCTION: The form of G(-) depends on the cost function chosen for the given adaptive filtering task. µ(n) is a step size parameter.1 GENERAL FORM OF ADAPTIVE FIR ALGORITHMS: The general form of an adaptive FIR filtering algorithm is W(n+1)=W(n) + µ(n)G(e(n). In the simplest algorithms. and Φ(n) is a vector of states that store pertinent information about the characteristics of the input and error signals and/or the coefficients at previous time instants. Much research effort has been spent characterizing the role that µ (n) plays in the performance of adaptive filters in terms of the statistical or frequency characteristics of the input and desired response signals.3) [6] . respectively. and the only information needed to adjust the coefficients at time n are the error signal.2.1) Where G(-) is a particular vector-valued nonlinear function. 2. success or failure of an adaptive filtering application depends on how the value of µ(n) is chosen or calculated to obtain the best performance from the adaptive filter. Often. Define the mean-squared error (MSE) cost function as J mse (n) = 1 ∫ e2 ( n ) pn ( e(n) ) de(n) 2 −∞ = +∞ (2. Φ(n) is not used.2) 1 Ee2 n ( ) 2 (2.X(n). The step size is so called because it determines the magnitude of the change or ”step” that is taken by the algorithm in iteratively determining a useful coefficient vector.Φ(n)) (2. e(n) and X(n) are the error signal and input signal vector.

we obtain . WM SE (n) can be found from the solution to the system of equation ∂J mse (n) = 0. and E. we can use a result from optimization theory that states that the derivatives of a smooth cost function with respect to each of the parameters is zero at a minimizing point on the cost function error surface.3) and noting that e(n)=d(n) .4) y(n) = ∑ wi (n) x(n − i) = wT ( n ) X ( n ) [7] respectively. the coefficient values in W(n) that minimize JM SE (n) are welldefined if the statistics of the input and desired response signals are known.0 ≤ (i) ≤ ( L) ∂wi (n) Taking derivative of Jmse(n) in (2. The third point is important in that it enables us to determine both the optimum coefficient values given knowledge of the statistics of d(n) and x(w) as well as a simple iterative procedure for adjusting the parameters of an FIR filter.Where pn(e) represents the probability density function of the error at time n. Hence.y(n) and L−1 i =0 (2. The extension of Wiener’s analysis to the discrete-time case is attributed to Levinson .3).2. 2.is short hand for the expectation integral on the right hand side (2. The formulation of this problem for continuous-time signals and the resulting solution was first derived by Wiener. Thus. and the function is also differentiable. this optimum coefficient vector WM SE (n) is often called the Wiener solution to the adaptive filtering problem. Thus. To determine WM SE (n) we note that the function JM SE (n) in is quadratic in the parameters {wi(n)}. such that it is differentiable with respect to each of the parameters in W(n). indicating that y(n) has approached d(n) and Jmse(n) a smooth function of each of the parameters in W(n).3 THE WIENER SOLUTION: For the FIR filter structure. The MSE cost function is useful for adaptive FIR filters because • • • Jmse(n) has a well-defined minimum with respect to the parameters in W(n)· The coefficient values obtained at this minimum are the ones that minimize the power in the error signal e(n).

Thus.8). ∂J mse (n) ∂(e(n))  = E e(n)  ∂wi (n) ∂wi (n)      (2. so long as the matrix RXX (n) is invertible.6) (2.5) = − Ee(n)  ∂y(n)    ∂wi (n)    (2. By defining the matrix RXX (n) and vector Pdx (n) as RXX = E  X (n) X T (n)     (2. the optimum [8] .7) = − E[e(n) x(n − i)] L −1   E  d ( n) x ( n − i )  − =−  E  x(n − i ) x(n −     j =0  ∑   j ) w j ( n)    (2.8) Where we have used the dentitions of e(n) and of y(n) for the FIR filter structure in and respectively to expand the last result in (2.10) respectively. we can combine the above equations to obtain the system of equations in vector form as RXX (n)WMSE (n) − Pdx (n) = 0 Wiener solution vector for this problem is (2.11) Where 0 is the zero vector.9) Pdx (n) = E d (n) X (n)     (2.

While suitable estimates of the statistical quantities needed for (2. Collecting these equations in vector form. this steepest descent procedure depends on the statistical quantities E{d(n)x(n − i)} and E{x(n − i)x(n − j)} contained in Pdx(n) and Rxx(n) respectively. Substituting these results into yields the update equation for W(n) as W (n + 1) = W (n) + µ (n) ( Pdx (n) − RXX (n)W (n) ) (2.4 THE METHOD OF STEEPEST DESCENT: The method of steepest descent is a celebrated optimization procedure for minimizing the value of a cost function J(n) with respect to a set of adjustable pa-rameters W(n).12) 2. we only have measurements of both d(n) and x(n) to be used within the adaptation procedure. the Ith parameter of the system is altered according to the derivative of the cost function with respect to the Ith parameter.15) could be determined from the signals x(n) and d(n) we instead develop an approximate version of the method of steepest descent that depends on the signal values themselves.−1 WMSE (n) = RXX (n) Pdx (n) (2. This procedure is known as the LMS algorithm.2. This procedure adjusts each parameter of the system according to wi (n +1) = wi (n) − µ (n) ∂J (n) ∂wi (n) (2. we have W (n + 1) = W (n) − µ (n) Where ∂J (n) ∂W (n) (2.15) However.13) In other words.14) ∂J (n) ∂J (n) be the vector form of ∂wi (n) ∂W (n) For an FIR adaptive filter that minimizes the MSE cost function. In practice. we can use the result in to explicitly give the form of the steepest descent procedure in this problem. [9] .

Although it might not appear to be useful. although efficient recursive methods for its minimization can be developed. Since we typically only have measurements of d(n) and of x(n) available to us. Alternatively. the resulting algorithm depends on the statistics of x(n) and d(n) because of the expectation operation that defines this cost function.2.16) Where a ( k ) is a suitable weighting sequence for the terms within the summation. we substitute an alternative cost function that depends only on these measurements. If the MSE cost function in is chosen.2.17) function can be thought of as an instantaneous estimate of the MSE cost function. the resulting algorithm obtained when JLMS (n) is used for J(n) in (2. is complicated by the fact that it requires numerous computations to calculate its value as well as its derivatives with respect to each W(n). we can propose the simplified cost function JLM S (n ) Given by J LMS (n) = 1 e2 (n) 2 (2. however. we obtain the LMS adaptive algorithm given by W (n +1) = W (n) + µ (n)e(n) X (n) [10] (2. This cost function.18) .13). as JMSE(n)=EJLMS (n).13) is extremely useful for practical applications. Taking derivatives of JLMS (n) with respect to the elements of W(n) and substituting the result into (2. One such cost function is the least-squares cost function given by jLMS (n) = ∑ α (k ) k =0 n ( d (k ) −W T (n) X (k ) ) 2 (2.5 THE LMS ALGORITHM: The cost function J(n) chosen for the steepest descent algorithm of determines the coefficient solution obtained by the adaptive filter.

For now. In effect. which is one of the reasons for the algorithm’s popularity. we indicate its useful behavior by noting that the solution obtained by the LMS algorithm near its convergent point is related to the Wiener solution. the number and type of operations needed for the LMS algorithm is nearly the same as that of the FIR filter structure with fixed coefficient values. analyses of the LMS algorithm under certain statistical assumptions about the input and desired response signals show that n→∞ lim E[W (n)] = WMSE (2. which is limited because it is unable to converge to the global optimum on a multimodal error surface if the algorithm is not initialized in the basin of attraction of the global optimum. The problem with these approaches is that the resulting minimum [11] . and numerous results concerning its adaptation characteristics under different situations have been developed. the iterative nature of the LMS coefficient updates is a form of time-averaging that smoothes the errors in the instantaneous gradient calculations to obtain a more reasonable estimate of the true gradient.Note that this algorithm is of the general form in. In fact. Several medications' exist for gradient based algorithms in attempt to enable them to overcome local optima. Other approaches attempt to transform the error surface to eliminate or diminish the presence of local minima . This approach is only likely to be successful when the error surface is relatively smooth with minor local minima. or some information can be inferred about the topology of the surface such that the additional gradient parameters can be assigned accordingly. which would ideally result in a unimodal error surface. the average behavior of the LMS algorithm is quite similar to that of the steepest descent algorithm in that depends explicitly on the statistics of the input and desired response signals. It also requires only multiplications and additions to implement. The problem is that gradient descent is a local optimization technique. One approach is to simply add noise or a momentum term to the gradient computation of the gradient descent algorithm to enable it to be more likely to escape from a local minimum.19) When the Wiener solution WM SE (n) is a fixed vector. In fact. The behavior of the LMS algorithm has been widely studied. Moreover.

This technique does have potential. enunciated by Holland. Unfortunately. Some work has been done with regard to removing the bias of equation error LMS and Steiglitz-McBride adaptive IIR filters. Special emphasis is given on the hybridizations DE algorithms with other soft computing tools. 2. which add further complexity with varying degrees of success. Genetic algorithm (GA). these derivative based optimization techniques can no longer be used to determine the optima on rough non-linear surfaces. By using a similar congregational scheme.3 DERIVATIVE FREE BASED ALGORITHIM: Since the beginning of the nineteenth century. more consistent results using a fewer number of total estimates. initialized with different initial coefficients. but it is inefficient and may still suffer the fate of a standard gradient technique in that it will be unable to locate the global optimum if none of the initial estimates is located in the basin of attraction of the global optimum. These types of algorithms provide the framework for the algorithms discussed in the following sections. Another approach. Bellman’s principle and Pontyagrin’s principle were prevalent until this century. This chapter provides recent algorithms for evolutionary optimization known as deferential evolution (DE). structured stochastic algorithms are able to hill-climb out of local minima. Classical linear programming and traditional non-linear optimization techniques such as Lagrange’s Multiplier. One solution to this problem has already been put forward by the evolutionary algorithms research community. These algorithms also tend to be complex. The notion is that a larger. but one in which information is collectively exchanged between estimates and intelligent randomization is introduced. and may not be guaranteed to emerge from a local minimum. concurrent sampling of the error surface will increase the likelihood that one process will be initialized in the global optimum valley. attempts to locate the global optimum by running several LMS algorithms in parallel. The chapter explores several schemes for controlling the convergence behaviors DE by a judicious selection of their parameters. slow to converge. a significant evolution in optimization theory has been noticed. discontinuous and multimodal surfaces.transformed error used to update the adaptive filter can be biased from the true minimum output error and the algorithm may not be able to converge to the desired minimum error condition. The algorithms are inspired by biological and sociological motivations and can take care of optimality on rough. is one such popular algorithm. This enables the algorithms to achieve better. [12] .

2. [Loop] Go to step 2 9. a [Selection] Select two parent chromosomes from a population according to their fitness (the better fitness. problems are solved by an evolutionary process resulting in a best (fittest) solution (survivor) . There are many parameters [13] . that the new population will be better than the old one. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population 3. Genetic algorithms are inspired by Darwin's theory of evolution.1 OUTLINE OF BASIC GA: 1.4. Solutions which are then selected to form new solutions (offspring) are selected according to their fitness the more suitable they are. and return the best solution in current 7. [Replace] Use new generated population for a further run of the algorithm 6. His idea was then developed by other researchers. the bigger chance to be selected) 5. Solutions from one population are taken and used to form a new population. This is repeated until some condition (for example number of populations or improvement of the best solution) is satisfied. population 8.in other words. the more chances they have to reproduce. The algorithm begins with a set of solutions (represented by chromosomes) called population. The outline of the Basic GA provided above is very general. the solution is evolved. stop. This is motivated by a hope. 2.4 GENETIC ALGORITHM: Genetic algorithms are a class of evolutionary computing techniques. Genetic Algorithms (GAs) were invented by John Holland and developed by him and his students and colleagues . [Test] If the end condition is satisfied. Evolutionary computing was introduced in the 1960s by Rechenberg in his work "Evolution strategies" (Evolutions strategies' in original). [Start] Generate random population of n chromosomes (suitable solutions for the problem) 2. [New population] Create a new population by repeating following steps until the new population is complete 4. This led to Holland's book "Adaption in Natural and Artificial Systems" published in 1975. Simply said. which is a rapidly growing area of artificial intelligence.

sometimes it is useful to encode some permutations and so on. Elitism is often used as a method of selection.2 OPERATORS OF GA: OVERVIEW: The crossover and mutation are the most important parts of the genetic algorithm.4. A chromosome then could look like this: Table-2. The performance is influenced mainly by these two operators. [Crossover] With a crossover probability cross over the parents to form new offspring (children). ENCODING OF A CHROMOSOME: A chromosome should in some way contain information about solution that it represents. that at least one of a generation's best solution is copied without changes to a new population. For example. one can encode directly integer or real numbers. offspring is the exact copy of parents. There are many other ways of encoding.and settings that can be implemented differently in various problems. [Accepting] Place new offspring in the new population 2. CROSSOVER: [14] . Each bit in the string can represent some characteristics of the solution. If no crossover was performed. Which means. The most commonly used way of encoding is a binary string. so the best solution can survive to the succeeding generation a. [Mutation] With a mutation probability mutate new offspring at each locus (position in chromosome). c. b.1 (Encoding of a chromosome) Chromosome 1 Chromosome 2 1101100100110110 1101111000011110 Each chromosome is represented by a binary string. The encoding depends mainly on the problem to be solved.

MUTATION: Mutation is intended to prevent falling of all solutions in the population into a local optimum of the solved problem.Crossover operates on selected genes from parent chromosomes and creates new offspring.2 (crossover of Chromosome) Chromosome 1 Chromosome 2 Chromosome 3 Chromosome 4 11011 І 00100110110 11011 І 11000011110 11011 І 11000011110 11011 І 00100110110 There are other ways how to make crossover. The simplest way how to do that is to choose randomly some crossover point and copy everything before this point from the first parent and then copy everything after the crossover point from the other parent.3(Mutation operation) [15] . Mutation can be then illustrated as follows Table-2. for example we can choose more crossover points. Crossover is illustrated in the following (| is the Crossover point) Table-2. Mutation operation randomly changes the offspring resulted from crossover. In case of binary encoding we can switch a few randomly chosen bits from 1 to 0 or from 0 to 1.

offspring are made from parts of both parent's chromosome. offspring are exact copies of parents.Original offspring 1 1101111000011110 Original offspring 2 Original offspring 3 Original offspring 4 1101100100110110 1100111000011110 1101101100110110 The technique of mutation (as well as crossover) depends mainly on the encoding of chromosomes. For example when we are encoding by permutations. If there is no mutation.crossover probability and mutation probability. it is good to leave some part of old population survives to next generation. 2. If it is 0%. If there is no crossover. mutation could be performed as an exchange of two genes. CROSSOVER PROBABILITY: It indicates how often crossover will be performed. If crossover probability is 100%. If there is crossover.4. MUTATION PROBABILITY: This signifies how often parts of chromosome will be mutated. then all offspring are made by crossover. whole new generation is made from exact copies of chromosomes from old population (but this does not mean that the new generation is the same!). offspring [16] . However. Crossover is made in hope that new chromosomes will contain good parts of old chromosomes and therefore the new chromosomes will be better.3 PARAMETERS OF GA: There are two basic parameters of GA .

nothing is changed. tournament selection. rank selection. Population diversity means that the genes from the already discovered good individuals are exploited while promising the new areas of the search space continue to be explored. steady state selection and some others. TOURNAMENT SELECTION: A selection strategy in GA is simply a process that favors the selection of better individuals in the population for the mating pool. The tournament selection strategy provides selective pressure by [17] . the best ones survive to create new offspring.are generated immediately after crossover (or directly copied) without any change. If mutation probability is 100%. In this thesis we have used the tournament selection as it performs better than the others. then GA has few possibilities to perform crossover and only a small part of search space is explored. One another particularly important parameter is population size. then GA slows down. If mutation is performed. The problem is how to select these chromosomes. population diversity and selective pressure. whole chromosome is changed. Selective pressure is the degree to which the better individuals are favored. POPULATION SIZE: It signifies how many chromosomes are present in population (in one generation). If there are too few chromosomes. Boltzmann selection. Mutation should not occur very often. On the other hand. Mutation generally prevents the GA from falling into local extremes. There are two important issues in the evolution process of genetic search. There are many methods in selecting the best chromosomes. Examples are roulette wheel selection. According to Darwin's theory of evolution. if there are too many chromosomes. one or more parts of a chromosome are changed. SELECTION: The chromosomes are selected from the population to be parents for crossover. because then GA will in fact change to random search. OTHER PARAMETERS: There are also some other parameters of GA. if it is 0%.

where one or more agents are employed to determine the optima on a search landscape. Holland echoed the Darwinian Theory through his most popular and well known algorithm. The optimization problem. Because of the difficulties in evaluating the first Derivatives. Around the same time. Inspired by the natural adaptations of the biological species. In mid 1990s Eberhart and Kennedy enunciated an alternative solution to the complex non-linear optimization problem by emulating the collective behavior of bird flocks. now-a-days. under a set of constraints representing the solution space for the problem. Holland pioneered a new concept on evolutionary search algorithms. Holland and his coworkers including Goldberg and Dejong popularized the theory of GA and demonstrated how biological crossovers and mutations of chromosomes can be realized in the algorithm to improve the quality of the solutions over successive iterations [22]. most of the traditional optimization techniques are centered around evaluating the first derivatives to locate the optima on a given constrained surface. They proposed a new algorithm based on this operator. and consequently came up with a suitable deferential operator to handle the problem.holding a tournament competition among individuals. and came up with a solution to the so far open-ended problem to non-linear optimization problems. They can be [18] . in recent times.5 DIFFERENTIAL EVALUATION: The aim of optimization is to determine the best-suited solution to a problem under a given set of constraints. and called it deferential evolution (DE) [28]. Price and Storn took a serious attempt to replace the classical crossover and mutation operators in GA by alternative operators. several derivative free optimization algorithms have emerged. currently known as genetic algorithms (GA) [21]. the boids method of Craig Reynolds [23] and socio-cognition and called their brainchild the particle swarm optimization (PSO)[23-27]. Mathematically an optimization problem involves a fitness function describing the problem. representing the constrained surface for the optimization problem [20]. 2. is represented as an intelligent search problem. Several researchers over the decades have come up with different solutions to linear and non-linear optimization problems. to locate the optima for many rough and discontinuous optimization surfaces. particles. Unfortunately. Both algorithms do not require any gradient information of the function to be optimized uses only primitive mathematical operators and are conceptually very simple. In the later quarter of the twentieth century.

1. xi. where rand (0. three other parameter vectors (say the r1. . here we discuss one such specific mutation strategy known as DE/rand/1.3(t) . a scalar number F scales the deference of any two of the three vectors and the scaled deference is added to the third one whence we obtain the donor vector Vi(t). . t. which demarcates between the various DE schemes. Now in each generation (or one iteration of the algorithm) to change each population member Xi(t) (say). In this scheme. r2. . there may be a certain range within which value of the parameter should lie for better search results. Therefore. These issues perhaps have a great role in the popularity of the algorithms within the domain of machine intelligence and cybernetics. .2(t). It is the method of creating this donor vector. if the jth parameter of the given problem has its lower and upper bound as xLj and xUj respectively. at time t = t) as Xi(t)= [xi. and r3th vectors) are chosen in a random fashion from the current population. . Since the vectors are likely to be changed over different generations we may adopt the following notation for representing the ith vector of the population at the current generation (i.j (0) = xLj + rand (0. etc. At the very beginning of a DE run or at t = 0. Next. DE is a very simple evolutionary algorithm. Algorithm performance does not deteriorate severely with the growth of the search space dimensions as well. 1) · (xUj − xLj). then we may initialize the jth component of the ith population members as xi. For each search-variable. to create Vi(t) for each ith member.implemented in any computer language very easily and requires minimal parameter tuning.D (t)] (2.e. 2. .1) is a uniformly distributed random number lying between 0 and 1. t+1. problem parameters or independent variables are Initialized somewhere in their feasible numerical range. We will represent subsequent generations in DE by discrete time steps like t = 0. a Donor vector Vi(t) is created. However.. We can express the process for the jth component of each vector as [19] .20) These vectors are referred in literature as “genomes” or “chromosomes”. xi.1 CLASSICAL DE: Like any other evolutionary algorithm.5.1(t). xi. 2. DE also starts with a population of NP D-dimensional search variable vectors.

( xr 2.< n –L+1 >D x (t ) = ij drawn from [1. We also choose another integer L from the interval [1. After a choice of n and L the trial vector U i (t ) = [ui .. < n+1 > D. for a given cost function f. In “Exponential” crossover. (2... Next.. j (t )).22) is formed with ui.. we first choose an integer n randomly among the numbers [0.. j (t ) + F . j (t ) = vi .1 (t ). the donor vector actually contributes to the target. The integer L is [20] .. components with the target vector Xi(t) under this scheme. 2denote constant cost contours.. DE can use two kinds of cross over schemes namely “Exponential” and “Binomial”.. i.. D] according to the following pseudo code. ui . i. j (t ) for j= < n > D. Here the constant cost contours are drawn for the Ackley Function. The donor vector exchanges its “body parts”.. j (t + 1) = xr1.. (2.21) The process is illustrated in Fig..2 (t ). j (t ) − xr 3. L denotes the number of components.. a contour corresponds to f (X) = constant. D–1]...... Closed curves in Fig. from where the crossover or exchange of components with the donor vector starts.. This integer acts as starting point in the target vector.ui .e..Vi.23) Where the angular brackets <>D denote a modulo function with modulus D..e. D]. to increase the potential diversity of the population a crossover scheme comes to play.... 2.……. D (t )] (2.

However.j (t) = vi.1. in “Binomial” crossover scheme. 1) < CR. Illustrating creation of the donor vector in 2-D parameter space (The constant cost contours are for two-dimensional Ackley Function) L=0.j (t) if rand (0. 1. The scheme may be outlined as ui.Fig. a new set of n and L must be chosen randomly as shown above. } While (rand (0. CR is called “Crossover” constant and it appears as a control parameter of DE just like F. the crossover is performed on each of the D variables whenever a randomly picked number between 0 and 1 is within the CR value. Do { L=L+1. For each donor vector V. 1) < CR) AND (L<D). Hence in effect probability (L > m) = (CR)m−1 for any m > 0. [21] .

To keep the population size constant over subsequent generations. End For. DE actually involves the Darwinian principle of “Survival of the fittest” in its selection process which may be outlined as Xi(t + 1) =Ui(t) = Xi(t) if (Ui(t)) ≤ f (Xi(t)).= xi.27) Where f () is the function to be minimized.e. the next step of the algorithm calls for “selection” to determine which one of the target vector and the trial vector will survive in the next generation. (2. For i = 0 to max-iteration do Begin Create Difference-Offspring.t. otherwise the target vector is retained in the population.5.. it replaces its target in the next generation. i. Hence the population either gets better (w. So if the new trial vector yields a better value of the fitness function. End If. Evaluate fitness. Evaluate fitness. at time t = t + 1. [22] . If an offspring is better than its parent Then replace the parent by offspring in the next generation. The DE/rand/1 algorithm is outlined below 2. End.j (t) else……….2 PROCEDURE: Input: Randomly initialized position and velocity of the particles: xi(0) Output: Position of the approximate global optima X ∗ Begin Initialize population.r. f if f (Ui(t)) ≤ f (Xi(t)) (2. the fitness function) or remains constant but never deteriorates.26) In this way for each trial vector Xi(t) an offspring vector Ui(t) is created.

28) Where λ is another control parameter of DE in [0. the vector yielding best suited objective function value at t = t). The general convention used. used to perturb each population member.29) .. To reduce the number of control parameters a usual choice is to put λ = F SCHEME DE/BEST/1 In this scheme everything is identical to DE/rand/1 except the fact that the trial vector is formed as Vi(t + 1) = Xbest(t) + F · (Xr1(t) − Xr2(t)) [23] (2. is created using any two randomly selected member of the population as well as the best vector of the current generation (i. The only difference being that. 2].e. in literature the particular mutation scheme is referred to as DE/rand/1. which demarcates one DE scheme from another. This can be expressed for the ith donor vector at time t = t + 1 as Vi(t + 1) = Xi(t) + λ · (Xbest(t) − Xi(t)) + F · (Xr2 (t) − Xr3(t)) (2. In the former section.5. now the donor vector. we have illustrated the basic steps of a simple DE. Below we outline the other four different mutation schemes. We can now have an idea of how different DE schemes are named. suggested by Price et al. x represents a string denoting the type of the vector to be perturbed (whether it is randomly selected or it is the best vector in the population with respect to fitness value) and y is the number of difference vectors considered for perturbation of x. is DE/x/y.3 THE COMPLETE DE FAMILY: Actually.21) uses a randomly selected vector Xr1 and only one weighted difference vector F · (Xr2 − Xr3) is used to perturb it. Xi(t) is the target vector and Xbest(t) is the best member of the population regarding fitness at current time step t = t. it is the process of mutation.2. Hence. The mutation scheme in (2. SCHEME DE/RAND TO BEST/1 DE/rand to best/1 follows the same procedure as that of the simple DE scheme illustrated earlier. DE stands for DE.

These strategies were derived from the five different DE mutation schemes outlined above. The process can be expressed in the form of an equation as Vi(t + 1) = Xr1(t) + F1 · (Xr2(t) − Xr3(t)) + F2 · (Xr4(t) − X (t)) (2. the donor vector is formed by using two difference vectors as shown below: Vi(t + 1) = Xbest(t) + F · (Xr1(t) + Xr2(t) − Xr3(t) − Xr4(t)) (2.30) Owing to the central limit theorem the random variations in the parameter vector seems to shift slightly into the Gaussian direction which seems to be beneficial for many functions. which are listed below. To reduce the number of parameters we may choose F1 = F2 = F.here the vector to be perturbed is the best vector of the current population and the perturbation is caused by using a single difference vector. DE/best/1/exp [24] . This yielded 5 × 2 = 10 DE strategies. SUMMARY OF ALL SCHEMES: In 2001 Storn and Price [21] suggested total ten different working strategies of DE and some guidelines in applying these strategies to any given problem. SCHEME DE/RAND/2 Here the vector to be perturbed is selected randomly and two weighted difference vectors are added to the same to produce the donor vector. a totality of five other distinct vectors are selected from the rest of the population. SCHEME DE/BEST/2 Under this method. Each mutation strategy was combined with either the “exponential” type crossover or the “binomial” type crossover.31) Here F1 and F2 are two weighing factors selected in the range from 0 to 1. Thus for each target vector.

4 MORE RECENT VARIANTS OF DE: DE is a stochastic.5. To implement the scheme. Xr2(t) and Xr3(t). DE WITH TRIGONOMETRIC MUTATION: Recently. the selected population members are Xr1(t). . Lampinen and Fan [29] has proposed a trigonometric mutation operator for DE to speed up its performance. for each target vector. In what follows we will illustrate some recent medications in DE to make it suitable for tackling the most difficult optimization problems. The strength of the algorithm lies in its simplicity. 2. x represents a string denoting the vector to be perturbed. speed (how fast an algorithm can find the optimal or suboptimal points of the search space) and robustness (producing nearly same results over repeated runs). bin: binomial) 2. N] Where N denotes the population size. The indices r1. where DE stands for DE. three distinct vectors are randomly selected from the DE population. The rate of convergence of DE as well as its accuracy can be improved largely by applying different mutation and selection strategies. evolutionary search algorithm.DE/rand/1/exp DE/rand-to-best/1/exp DE/best/2/exp DE/rand/2/exp DE/best/1/bin DE/rand/1/bin DE/rand-to-best/1/bin DE/best/2/bin DE/rand/2/ The general convention used above is again DE/x/y/z. Suppose the [25] . A judicious control of the two key parameters namely the scale factor F and the crossover rate CR can considerably alter the performance of DE. population-based. and z stands for the type of crossover being used (exp: exponential. r2 and r3 are mutually different and selected from [1. Suppose for the ith target vector Xi(t). y is the number of difference vectors considered for perturbation of x. .

we find that the scheme proposed by Lampinen et al.5. f (Xr2(t)) and f (Xr3(t)).37) Thus.35) Let rand (0. uses trigonometric mutation with a probability of Γ and the mutation scheme of DE/rand/1 with a probability of (1 − Γ).34) (2. The usual choice for this control parameter is a number between 0.4 and 1. 1) and Γ be the trigonometric mutation rate in the same interval (0. Now three weighing coefficients are formed according to the following equations: p = f (Xr1) + f (Xr2) + f (Xr3) p1 = f (Xr1) p p2 = f (Xr2) p p3 = f (Xr3) p (2.objective function values of these three vectors are given by. 1) be a uniformly distributed random number in (0. f (Xr1(t)).5∗ (1 + rand (0.32) (2.38) [26] . 1) by using the relation F = 0.36) (2. The trigonometric mutation scheme may now be expressed as Vi(t + 1) = (Xr1 + Xr2 + Xr3)/3 + (p2 − p1) · (Xr1 − Xr2) + (p3 − p2) · (Xr2 − Xr3) + (p1 − p3) · (Xr3 − Xr1) if rand (0. DERANDSF (DE WITH RANDOM SCALE FACTOR) In the original DE [28] the deference vector (Xr1(t) − Xr2(t)) is scaled by a constant factor “F ”. 1) < Γ Vi(t + 1) = Xr1 + F · (Xr2 + Xr3) else (2. 1). 1)) (2. We propose to vary this scale factor in a random manner in the range (0.33) (2.

This allows for stochastic variations in the amplification of the difference vector and thus helps retain population diversity as the search progresses. a new trial vector has fair chances of pointing at an even better location on the multimodal functional surface. During the later stages it is important to adjust the movements of trial solutions finely so that they can explore the interior of a relatively small space in which the suspected global optimum lies. DETVSF (DE WITH TIME VARYING SCALE FACTOR) In most population-based optimization methods (except perhaps some hybrid global-local methods) it is generally believed to be a good idea to encourage Fig.2.75. Therefore. the tips of the trial vectors) to sample diverse zones of the search space during the early stages of the search. Even when the tips of most of the population vectors point to locations clustered near a local optimum due to the randomly scaled difference vector. 1.where rand (0. Illustrating DETVSF scheme on two-dimensional cost contours of Ackley Function the individuals (here. 1) is a uniformly distributed random number within the range [0. The mean value of the scale factor is 0. 1]. To meet this objective we reduce the value of the scale factor linearly with time from a (predetermined) maximum to a (predetermined) [27] . the fitness of the best vector in a population is much less likely to get stagnant until a truly global optimum is reached. We call this scheme DERANDSF (DE with Random Scale Factor) .

. 2. and r. Np) is a D-dimensional vector. Now we combine these two models using a time-varying scalar weight w ∈ (0. . Apart from this. X2. . Global mutation encourages exploitation. a new DE-variant.39) where Fmax and Fmin are the maximum and minimum values of scale factor F. .minimum value: R = (Rmax − Rmin)∗(MAXIT − iter)/MAXIT (2. The resulting algorithm is referred as DETVSF (DE with a time varying scale factor). . . . For each member of the population a local mutation is created by employing the fittest vector in the neighborhood of the model may be expressed as: Li(t)=Xi(t)+ λ · (Xnbest(t) − Xi(t)) + F · (Xp(t) − Xq (t)) i + k). local mutation. The authors in proposed a neighborhood-based local mutation operator that draws inspiration from PSO.Xi+k. based on the neighborhood topology of the parameter vectors was developed [30] to overcome some of the disadvantages of the classical DE versions. 1) [28] . iter is the current iteration number and MAXIT is the maximum number of allowable iterations. since in general different members of the population are likely to be biased by different individuals. 2. in contrast.41) (2. Xi . XNp ] where each Xi (i = 1. since all members (vectors) of a population are biased by the same individual (the population best). NP). We assume the vectors to be organized in a circular fashion such that two immediate neighbors of vector X1 are XNp and X2. DE WITH LOCAL NEIGHBORHOOD: Only in 2006. The locus of the tip of the best vector in the population under this scheme may be illustrated as in Fig. q ∈ (i − k. where the subscript best indicates the best vector in the entire population. favors exploration.40) where the subscript nbest indicates the best vector in the neighborhood of X i and p. we also use a global mutation expressed as: Gi(t) = Xi(t) + λ · (Xbest(t) − Xi(t)) + F · (Xr(t) − Xs(t)) (2. consisting of vectors Xi−k . Suppose we have a DE population P = [X1. s ∈ (1. . Now for every vector Xi we define a neighborhood of radius k.

42) Where iter is the current iteration number. Thus the algorithm starts at iter = 0 with w = wmin but as iter increases towards MAXIT. it was found that wmax = 0. contribution from the global model increases. emphasis is laid on the local mutation scheme. respectively. This feature is essential at the beginning of the search process when the candidate vectors are expected to explore the search space vigorously. a judicious choice of wmax and wmin is necessary to strike a balance between the exploration and exploitation abilities of the algorithm. wmin denotes.4 seem to improve the performance of the algorithm over a number of benchmark function [29] . wmin ∈ (0. helping DE avoid local optima. Therefore at the beginning. After some experimenting.to form the actual mutation of the new DE as a weighted mean of the local and the global components: Vi(t) = w · Gi(t) + (1 − w) · Li(t). with wmax. MAXIT is the maximum number of iterations allowed and wmax. but with time. In the local model attraction towards a single point of the search space is reduced. The weight factor varies linearly with time as follows: w = wmin + (wmax − wmin) ·iter (2.43) (2. w increases gradually and ultimately when iter = MAXIT w reaches wmax. the maximum and minimum value of the weight. Clearly. 1).8 and wmin = 0.

single output dynamic system is shown in fig(3).Noise is taken into consideration because in many practical cases the system to be modeled is noisy. In this chapter we propose a novel adaptive model based on GA technique for identification of nonlinear systems. There are many ways in which the GAs can be used to solve system identification tasks. Modeling a single input. that is. each individual in the population must represent a model of the plant and the objective becomes a quality measure of the model. The obtained error is a function of the individual’s quality. But most of the dynamic systems exhibit nonlinearity. BASIC PRINCIPLE OF ADAPTIVE SYSTEM IDENTIFICATION: An adaptive filter can be used in modeling that is. is compared with the measurements made on the real plant. The measured output predictions. If this is the case and if the adaptive model is an adaptive linear combiner whose [30] . inherent to each individual i. As less is this error. imitating the behavior of physical dynamic systems which may be regarded as unknown “black boxes” having one or more inputs and outputs. This noise is generally uncorrelated with the plant input. etc. Internal system noise appears at the system output and is commonly represented there as an additive noise.1 INTRODUCTION: Generally the identification of linear system is performed by using LMS algorithm. Radial Basis Function (RBF) [34].CHAPTER -3 ADAPTIVE SYSTEM IDENTIFICATION USING GA 3. To apply GAs in systems identification. 3. The LMS based technique [31] does not perform satisfactory to identify nonlinear system. Functional Link Artificial Neural Network (FLANN) [33]. has internal random disturbing forces. To improve the identification performance of nonlinear systems various techniques such as Artificial Neural Network (ANN) [32].2. by evaluating its capacity of predicting the evolution of the measured outputs. as more performing the individual is.

single output System.. The least square solution will be determined primarily by the impulse response of the system to be modeled.3. the output can be predicted for a given input to the system which is the goal of system identification problem. such that for a particular input. This is not to say that the convergence of the adaptive process will be unaffected by system noise. noise y + e Unknown System x ŷ Adaptive model Σ Σ - Adaptive Algorithm Fig. The problem of determining a mathematical model for an unknown system by observing its input-output data is known as system identification.weights are adjusted to minimize mean square error. the model output matches with the corresponding actual system output . it can be shown that the least squares solution will be unaffected by the presence of plant noise. Which is performed by suitably adjusting the parameters within a given model. It could also be significantly affected by the statistical or spectral character of the system input signal. When the plant behavior is completely unknown it may be characterized using certain adaptive model and then its identification task is carried out using adaptive algorithms like the [31] .After a system is identified.1 Modeling the single input. only that the expected weight vector of the adaptive model after convergence will be unaffected.

2 schematic block diagram of a GA based adaptive identification system In this chapter the modeling is done in an adaptive manner such that after training the model iteratively y and ŷ become almost equal and the squared error becomes almost zero. The minimization of error in an iterative manner is usually achieved by LMS or RLS methods which [32] .3. Fig .LMS. noise y + System P(x) x ŷ Adaptive model Σ Σ - e GA Based Adaptive Algorithm Fig.4 represents a schematic diagram of system identification of time invariant. The objective of identification problem is to construct model generating an output ŷ which approximate the plant output y when subjected to the same input x so that the squared error (e2) is minimum . We list several of these applications here. The system identification task is at the heart of numerous adaptive filtering applications. Acoustic Echo Cancellation Adaptive Noise Cancellation.the operator p describes the dynamic plant . causal discrete time dynamic plant The output of the plant is given by y = p(x) where x is the input which is uniformly bounded function of time . Echo Cancellation for long distance transmission. • • • • • Channel Identification Plant Identification.

The shortcoming of these methods is that for certain types of plants the squared error cannot be optimally minimized, because the error surface falls into local minima. In this chapter we propose a novel and elegant method which employs the Genetic Algorithm for minimizing the squared error in a derivative free manner. In essence, in this chapter the system identification problem is viewed as a squared error minimization problem. Before we proceed to the identification task using GA, let us discuss the basics of GA based optimization.

3.3 DEVELOPMENT OF GA BASED ALGORITHM FOR SYSTEM IDENTIFICATION:
Referring to Fig. 3.2, let the system p(x) be an FIR system represented by the transfer function

p(z) = a0 + a1 z^-1 + a2 z^-2 + a3 z^-3 + ... + an z^-n        (3.1)

where a0, a1, a2, ..., an represent the impulse response (parameters) of the system. The measurement noise of the system is given by n(k), which is assumed to be white and Gaussian distributed. The input x is also uniformly distributed white noise lying between -2√3 and +2√3 and having unity variance. The purpose of the adaptive identification model is to estimate the unknown coefficients â0, â1, â2, ..., ân such that they match the corresponding parameters a0, a1, ..., an of the actual system p(z). The GA based model consists of an equal order FIR system with unknown coefficients. The adaptive modeling constitutes two steps. In the first step the model is trained using the GA based updating technique. After successful training of the model, the performance evaluation is carried out by feeding a zero mean uniformly distributed random input. If the system is exactly identified (theoretically), then in the case of a linear system (for example the FIR system) the system parameters and the model parameters become equal, i.e. a0 = â0, a1 = â1, a2 = â2, ..., an = ân, and the response of the actual system (y) coincides with the response of the model system (ŷ). However, in the case of a nonlinear dynamic system the system parameters do not match, but the responses of the two systems will match. The updating of the parameters of the model is carried out using the GA rule as outlined in the following steps:

I. Generate K (=500) input signal samples, each of which has zero mean, is uniformly distributed between -2√3 and +2√3, and has unity variance.
II. As shown in Fig. 3.2, the unknown dynamic system to be identified is connected in parallel with the adaptive model to be developed using GA. Each of the input samples is passed through the plant P(z) and contaminated with additive noise of known strength. The resultant signal acts as the desired signal; in this way K desired signals are produced by feeding all the K input samples.
III. The coefficients (â) of the model are initially chosen from a population of M chromosomes. Each chromosome constitutes NL random binary bits, where each sequential group of L bits represents one coefficient of the adaptive model and N is the number of parameters of the model.
IV. Each of the input samples is also passed through the model using each chromosome as the model parameters, and M sets of K estimated outputs are obtained.
V. Each of the desired outputs is compared with the corresponding estimated output and K errors are produced.
VI. The mean square error (MSE) for a set of parameters (corresponding to the mth chromosome) is determined by using the relation

MSE(m) = (1/K) Σ_{i=1}^{K} e_i^2        (3.2)

This is repeated M times.
VII. Since the objective is to minimize MSE(m), m = 1 to M, the GA based optimization is used.
VIII. The tournament selection, crossover and mutation operators are sequentially carried out following the GA based optimization procedure discussed above.

IX. In each generation the minimum MSE, MMSE, is obtained and plotted against generation to show the learning characteristics.
X. The learning process is stopped when the MMSE reaches its minimum level, i.e. when the MSE in dB becomes parallel to the x-axis. At this step all the chromosomes attain almost identical genes, which represent the estimated parameters of the developed model.
XI. Under this condition, for a linear system, the parameters a_i match the corresponding estimated parameters â_i of the proposed system.

3.4 SIMULATION STUDIES:
To demonstrate the performance of the proposed GA based approach, numerous simulation studies are carried out on several linear and nonlinear systems. The block diagram shown in Fig. 3.2 is used for the simulation study.

Case-1 (Linear System)
A unit variance uniformly distributed random signal lying in the range -2√3 to +2√3 is applied to a known system having the transfer function

Experiment-1: H(z) = 0.2090 + 0.9950z^-1 + 0.2090z^-2  and
Experiment-2: H(z) = 0.2600 + 0.9300z^-1 + 0.2600z^-2

The output of the system is contaminated with white Gaussian noise of different strengths of -20 dB and -30 dB. The resultant signal y is used as the desired or training signal. The same random input is also applied to the GA based adaptive model having the same linear combiner structure as that of H(z), but with random initial weights. The coefficients or weights of the linear combiner are updated using the LMS algorithm as well as the proposed GA based algorithm, and the training is complete when the MSE in dB becomes parallel to the x-axis. The performance of the proposed structure is compared with the corresponding LMS structure. To make steps I-VIII concrete, a compact simulation of this training loop is sketched below.
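The following is a compact sketch of the binary-coded GA training loop under stated assumptions: chromosomes decoded to an assumed search range of [-2, 2], tournament selection, single-point crossover and bit-flip mutation, with population size, bit length and mutation rate chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Steps I-II: training data from the Experiment-1 plant with -30 dB noise (assumed)
plant = np.array([0.2090, 0.9950, 0.2090])
N, K = len(plant), 500
x = rng.uniform(-np.sqrt(3), np.sqrt(3), K)
X = np.array([[x[k - i] if k - i >= 0 else 0.0 for i in range(N)] for k in range(K)])
d = X @ plant + np.sqrt(1e-3) * rng.standard_normal(K)

# Step III: M chromosomes of N*L random bits; each L-bit group encodes one coefficient
M, L, GEN = 40, 16, 100
LO, HI = -2.0, 2.0                      # assumed coefficient search range

def decode(bits):
    ints = bits.reshape(N, L) @ (2 ** np.arange(L - 1, -1, -1))
    return LO + (HI - LO) * ints / (2 ** L - 1)

def mse(bits):                          # steps IV-VI: fitness of one chromosome
    e = d - X @ decode(bits)
    return np.mean(e ** 2)

pop = rng.integers(0, 2, (M, N * L))
for g in range(GEN):
    fit = np.array([mse(c) for c in pop])
    # step VIII: tournament selection (better of two random chromosomes survives)
    a, b = rng.integers(0, M, M), rng.integers(0, M, M)
    new = pop[np.where(fit[a] < fit[b], a, b)].copy()
    # single-point crossover on consecutive pairs
    for i in range(0, M - 1, 2):
        cp = rng.integers(1, N * L)
        new[i, cp:], new[i + 1, cp:] = new[i + 1, cp:].copy(), new[i, cp:].copy()
    # bit-flip mutation
    new ^= (rng.random((M, N * L)) < 0.01).astype(new.dtype)
    pop = new

best = min(pop, key=mse)                # steps IX-XI: report the best individual
print("MMSE (dB):", 10 * np.log10(mse(best)), "estimated parameters:", decode(best))
```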

In Table 3.1 we present the actual and estimated parameters of the 3-tap linear combiner obtained by the LMS as well as the GA based models.

Table 3.1 Comparison of actual and estimated parameters of LMS and GA based models

Experiment | Actual Parameter | LMS Based (NSR = -30 dB) | LMS Based (NSR = -20 dB) | GA Based (NSR = -30 dB) | GA Based (NSR = -20 dB)
01         | 0.2090           | 0.2077                   | 0.2061                   | 0.2092                  | 0.2100
01         | 0.9950           | 0.9985                   | 1.0094                   | 0.9943                  | 0.9941
01         | 0.2090           | 0.2071                   | 0.2153                   | 0.2064                  | 0.2077
02         | 0.2600           | 0.2582                   | 0.2566                   | 0.2598                  | 0.2631
02         | 0.9300           | 0.9289                   | 0.9342                   | 0.9301                  | 0.9308
02         | 0.2600           | 0.2563                   | 0.2705                   | 0.2598                  | 0.2624

From this table it is observed that the GA based model performs better than the LMS based model under the different noise conditions.

Fig. 3.3 Learning characteristics of LMS based linear system identification (Experiment-1)
Fig. 3.4 Learning characteristics of LMS based linear system identification (Experiment-2)

Fig. 3.5 Learning characteristics of GA based linear system identification (Experiment-1)
Fig. 3.6 Learning characteristics of GA based linear system identification (Experiment-2)

Case-2 (Non-Linear System)
In this simulation the actual system is assumed to be nonlinear in nature. Computer simulation results of two different nonlinear systems are presented; the actual systems are

Experiment-3: yn(k) = tanh{y(k)}
Experiment-4: yn(k) = y(k) + 0.2y^2(k) - 0.1y^3(k)

where y(k) is the output of the linear system and yn(k) is the output of the nonlinear system. In the case of a nonlinear system the parameters of the two systems do not match; however, the responses of the actual system and the adaptive model do match. To demonstrate this observation, training is carried out using both the LMS and the GA based algorithms.

Fig. 3.7 Learning characteristics of LMS based nonlinear system identification (Experiment-3) (MSE in dB versus number of iterations; NSR = -20 dB and -30 dB)


Fig. 3.8 Learning characteristics of LMS based nonlinear system identification (Experiment-4) (MSE in dB versus number of iterations; NSR = -20 dB and -30 dB)

Fig. 3.9 Learning characteristics of GA based nonlinear system identification (Experiment-3) (MSE in dB versus generation; NSR = -20 dB and -30 dB)

Fig. 3.10 Learning characteristics of GA based nonlinear system identification (Experiment-4) (MSE in dB versus generation; NSR = -20 dB and -30 dB)
Fig. 3.11 Comparison of output response of Experiment-3 at -30 dB NSR (actual, GA and LMS)

Fig. 3.12 Comparison of output response of Experiment-4 at -30 dB NSR (actual, GA and LMS)
Fig. 3.13 Learning characteristics of LMS based nonlinear system identification (Experiment-3)

Fig. 3.14 Learning characteristics of LMS based nonlinear system identification (Experiment-4)
Fig. 3.15 Learning characteristics of GA based nonlinear system identification (Experiment-3)

Fig. 3.16 Learning characteristics of GA based nonlinear system identification (Experiment-4)
Fig. 3.17 Comparison of output response of Experiment-3 at -30 dB NSR (actual, GA and LMS)

Fig. 3.18 Comparison of output response of Experiment-4 at -30 dB NSR (actual, GA and LMS)

The MSE plots of Experiment-3 and Experiment-4 following Experiment-1, for two different noise conditions using the LMS based algorithm, are obtained by simulation and shown in Figs. 3.7 and 3.8 respectively. The corresponding plots for the same systems using the GA based model are shown in Figs. 3.9 and 3.10 respectively. The comparison of the output responses of the two nonlinear models using the LMS and GA techniques is shown in Figs. 3.11 and 3.12 respectively. Similarly, the MSE plots of Experiment-3 and Experiment-4 following Experiment-2 using the LMS based algorithm are shown in Figs. 3.13 and 3.14 respectively, and the corresponding plots for the GA based model are shown in Figs. 3.15 and 3.16 respectively. The comparison of the output responses of these two nonlinear models using the LMS and GA techniques is shown in Figs. 3.17 and 3.18 respectively. Similar results are also observed for other nonlinear models and under various noise conditions.

3.5 RESULTS AND DISCUSSIONS:
Table 3.1 reveals that for the FIR linear system the coefficients of the adaptive model obtained using LMS match the coefficients of the actual system more closely than those obtained using GA; hence for a linear FIR system LMS works well. For the nonlinear systems the learning characteristic of the LMS technique is poor (Fig. 3.13), but it is much improved in the case of GA (Fig. 3.9) for both noise cases. The output response of the nonlinear system (Experiment-3) obtained by GA is better than its LMS counterpart, because the GA response is closer to the desired response (Fig. 3.11).

CHAPTER-4
ADAPTIVE CHANNEL EQUALIZATION USING GENETIC ALGORITHM

4.1 INTRODUCTION:
Digital communication systems suffer from the problem of ISI, which essentially deteriorates the accuracy of reception. Equalization is the process of recovering the data sequence from the corrupted channel samples. An adaptive digital channel equalizer is essentially an inverse system of the channel model which primarily combats the effect of ISI. The probability of error at the receiver can be minimized and reduced to an acceptable level by introducing an equalizer at the front end of the receiver. Conventionally the LMS algorithm is employed to design and develop adaptive equalizers [35]. Such equalizers use gradient based weight update algorithms, and therefore there is a possibility that during training the equalizer weights do not attain their optimal values because the MSE is trapped in a local minimum. On the other hand, the GA and DE are derivative free techniques, and hence the local minima problem does not arise during weight updates. The present chapter develops a novel GA based adaptive channel equalizer.

4.2 BASIC PRINCIPLE OF CHANNEL EQUALIZATION:
In an ideal communication channel the received information is identical to that transmitted. However, this is not the case for real communication channels, where signal distortions take place. A channel can interfere with the transmitted data through three types of distorting effects: power degradation and fades, multi-path time dispersion, and background thermal noise [36]. A typical baseband transmission system is depicted in Fig. 4.1, where an equalizer is incorporated within the receiver.

Fig. 4.1 A baseband communication system

4.2.1 MULTIPATH PROPAGATION:
Within telecommunication channels multiple paths of propagation commonly occur. In practical terms this is equivalent to transmitting the same signal through a number of separate channels, each having a different attenuation and delay. Consider an open-air radio transmission channel that has three propagation paths: direct, earth bound and sky bound, as illustrated in Fig. 4.2. Multipath interference between consecutively transmitted signals will take place if one signal is received whilst the previous signal is still being detected. In Fig. 4.2(b) this would occur if the symbol transmission rate is greater than 1/τ, where τ represents the transmission delay. Because bandwidth efficiency leads to high data rates, multi-path interference commonly occurs.

Fig. 4.2 Impulse response of a transmitted signal in a channel which has 3 modes of propagation: (a) the signal transmission paths; (b) the received samples

4.2.2 MINIMUM & NON-MINIMUM PHASE CHANNELS:
When all the roots of H(z) lie within the unit circle, the channel is termed minimum phase. The inverse of a minimum phase channel [37] is convergent, as illustrated by (4.1).

However.25Z 2 ...3 + .1 å ( )i Z..0..2) Since equalizers are designed to invert the channel distortion process they will in effect model the channel inverse.5Z ¥ .5Z.125Z. as shown in (4.0 + 0..1) Whereas the inverse of non-minimum phase channels are not convergent.2). limiting the inverse model to m-dimensions will [50] .2 .i i =0 2 1 .5z.0 + 0..0..5Z + 0...0 + 0.1 H (z) 1 1.1 H (z) Z 1..ì ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï H (z) = ï í ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï î 1 1.. The minimum phase channel has a linear inverse model therefore a linear equalization solution exists. (4.5Z.0.[ å ( )i Z. ì ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï H (z) = í ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï ï î 1 1.1 Z.0.[1 ..1 + 0.1 ¥ .125Z 3] (4.i ] i =0 2 Z.25Z.0 + 0.5Z..

A linear inverse of a non-minimum phase channel does not exist without incorporating time delays. A time delay creates a convergent series for a non-minimum phase model, as shown in (4.3), where longer delays are necessary to provide a reasonable equalizer:

1/H(z) = 1/(0.5 + 1.0z^-1)
       = z - 0.5z^2 + 0.25z^3 - 0.125z^4 + ...      (non-causal)
       ≈ z^-3 - 0.5z^-2 + 0.25z^-1 - 0.125          (truncated and causal, four sample delay)        (4.3)

The latter of these is the more suitable form for a linear filter. The three-tap non-minimum phase channel H(z) = 0.3410 + 0.8760z^-1 + 0.3410z^-2 is used throughout this thesis for simulation purposes. A channel delay, D, is included to assist in the classification so that the desired output becomes u(n - D).
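The convergence or divergence of these inverse series can be checked numerically. The following sketch (illustrative only; the recursion, function name and number of terms are assumptions, not taken from the thesis) computes the leading terms of the causal expansion of 1/H(z) by polynomial long division, showing decaying terms for the minimum phase channel of (4.1) and growing terms for the non-minimum phase channel used in the thesis:

```python
import numpy as np

def inverse_series(h, n_terms):
    """Leading coefficients g of the causal power series of 1/H(z),
    from the long-division recursion sum_j h[j] * g[i-j] = delta[i]."""
    g = np.zeros(n_terms)
    g[0] = 1.0 / h[0]
    for i in range(1, n_terms):
        acc = sum(h[j] * g[i - j] for j in range(1, min(i, len(h) - 1) + 1))
        g[i] = -acc / h[0]
    return g

# Minimum phase channel of (4.1): terms decay, so the causal inverse converges
print(inverse_series([1.0, 0.5], 8))            # 1, -0.5, 0.25, -0.125, ...

# Non-minimum phase thesis channel: terms grow, so a delayed, truncated
# inverse must be used instead of the raw causal series
print(inverse_series([0.3410, 0.8760, 0.3410], 8))
```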

4.2.3 INTERSYMBOL INTERFERENCE:
Inter-symbol interference (ISI) has already been described as the overlapping of the transmitted data. It is difficult to recover the original data from one channel sample dimension, because there is no statistical information about the multipath propagation. Increasing the dimensionality of the channel output vector helps characterize the multipath propagation. This has the effect of not only increasing the number of symbols but also increasing the Euclidean distance between the output classes.

4.2.4 SYMBOL OVERLAP:
The expected number of errors can be calculated by considering the amount of symbol interaction. When additive Gaussian noise, η, is present within the channel, the input samples form Gaussian clusters around the symbol centers. These symbol clusters can be characterized by a probability density function (PDF) with a noise variance ση^2. Fig. 4.3 shows two Gaussian functions that could represent two symbol noise distributions. Taking any two neighboring symbols, where the noise can cause the symbol clusters to interfere, the cumulative distribution function (CDF) can be used to describe the overlap between the two noise characteristics, assuming Gaussian noise. The overlap is directly related to the probability of error between the two symbols, and if these two symbols belong to opposing classes, a class error will occur. Once this occurs, equalization filtering becomes inadequate to classify all of the input samples. Error control coding schemes can be employed in such cases, but these often require extra bandwidth.

Fig. 4.3 Interaction between two neighboring symbols (area of overlap = probability of error)

The Euclidean distance, L, between symbol centers and the noise variance ση^2 can be used in the cumulative distribution function of (4.4) to calculate the area of overlap between the two symbol noise distributions, and therefore the probability of error.

CDF(x) = ∫_x^∞ (1/(σ√(2π))) exp(-u^2/(2σ^2)) du        (4.4)

P(c) = 2 CDF(L/2)                                       (4.5)

Since each channel symbol is equally likely to occur, the probability of unrecoverable errors occurring in the equalization space can be calculated using the sum of all the CDF overlaps between each pair of opposing class symbols. The probability of error is more commonly described as the BER. Equation (4.6) describes the BER based upon the Gaussian noise overlap:

BER(σn) = log[ (2/(Nsp + Nm)) Σ_{i=1}^{Nsp} CDF(Di/(2σn)) ]        (4.6)

where Nsp is the number of symbols in the positive class, Nm is the number of symbols in the negative class, and Di is the distance between the ith positive symbol and its closest neighboring symbol in the negative class.
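As a quick illustration of (4.4)-(4.6), the Gaussian tail can be evaluated with the complementary error function; the symbol centres and noise levels below are made-up values, not results from the thesis:

```python
import math

def gaussian_tail(x, sigma):
    """P(X > x) for zero-mean Gaussian noise of std dev sigma, i.e. the CDF of (4.4)."""
    return 0.5 * math.erfc(x / (sigma * math.sqrt(2.0)))

def ber_estimate(pos_symbols, neg_symbols, sigma):
    """Log-scale BER of (4.6): tail overlap between each positive-class symbol
    centre and its closest negative-class neighbour."""
    total = sum(gaussian_tail(min(abs(p - n) for n in neg_symbols) / 2.0, sigma)
                for p in pos_symbols)
    return math.log10(2.0 * total / (len(pos_symbols) + len(neg_symbols)))

# BPSK-like symbol centres: widely separated vs. heavily overlapping noise
print(ber_estimate([+1.0], [-1.0], sigma=0.2))   # large separation, tiny BER
print(ber_estimate([+1.0], [-1.0], sigma=0.6))   # strong overlap, higher BER
```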

4.3 CHANNEL EQUALIZATION:
The inverse model of a system having an unknown transfer function is itself a system whose transfer function is, in some sense, a best fit to the reciprocal of the unknown transfer function. Sometimes the inverse model response contains a delay which is deliberately incorporated to improve the quality of the fit. There are many applications of the adaptive inverse model of a system. If the system is a communication channel, then the inverse model is an adaptive equalizer which compensates the effects of inter symbol interference (ISI) caused by the restriction of the channel bandwidth [38]. Similarly, if the system is the model of a high density recording medium, then its corresponding inverse model reconstructs the recorded data without distortion [39]. If the system represents a nonlinear sensor, then its inverse model represents a compensator of environmental as well as inherent nonlinearities [40]. The adaptive inverse model also finds applications in adaptive control [41] as well as in deconvolution in geophysics [42].

In Fig. 4.4, a source signal s(n) is fed into an unknown system that produces the input signal x(n) for the adaptive filter. The output of the adaptive filter is subtracted from a desired response signal that is a delayed version of the source signal, such that d(n) = s(n - Δ), where Δ is a positive integer value. The goal of the adaptive filter is to adjust its characteristics such that the output signal is an accurate representation of the delayed source signal.

Fig. 4.4 Inverse modeling

Channel equalization is a technique for decoding transmitted signals across non-ideal communication channels. The transmitter sends a sequence s(n) that is known to both the transmitter and receiver. In equalization, the received signal is used as the input signal x(n) to an adaptive filter, which adjusts its characteristics so that its output closely matches a delayed version s(n - D) of the known transmitted signal. After a suitable adaptation period, the coefficients of the system either are fixed and used to decode future transmitted messages, or are adapted using a crude estimate of the desired response signal computed from y(n); this latter mode of operation is known as decision-directed adaptation. Channel equalization was one of the first applications of adaptive filters and is described in the pioneering work of Lucky. Today it remains one of the most popular uses of an adaptive filter; practically every computer telephone modem transmitting at rates of 9600 bits per second or greater contains an adaptive equalizer. Adaptive equalization is also useful for wireless communication systems. Qureshi [43] has written an excellent tutorial on adaptive equalization. A related problem is deconvolution, a problem that appears in the context of geophysical exploration.

In many control tasks, the frequency and phase characteristics of the plant hamper the convergence behavior and stability of the control system. The adaptive filter shown in Fig. 4.4 can be used to compensate for the non-ideal characteristics of the plant and as a method for adaptive control. In this case, the signal s(n) is sent at the output of the controller, and the signal x(n) is the signal measured at the output of the plant. The coefficients of the adaptive filter are then adjusted so that the cascade of the plant and adaptive filter can be nearly represented by the pure delay z^-Δ.

Transmission and storage of high density digital information play an important role in the present age of information technology. Digital information obtained from audio, video or text sources needs high density storage or transmission through communication channels. Communication channels and recording media are often modeled as band-limited channels, for which the channel impulse response is that of an ideal low pass filter. When sequences of symbols are transmitted or recorded, the low pass filtering of the channel distorts the transmitted symbols over successive time intervals, causing symbols to spread and overlap with adjacent symbols. This resulting linear distortion is known as inter symbol interference. In addition, nonlinear distortion is caused by cross talk in the channel and the use of amplifiers. In the data storage channel, the binary data is stored in the form of tiny magnetized regions called bit cells, arranged along the recording track. At read back, noise and nonlinear distortions (ISI) corrupt the signal. Thus adaptive channel equalizers play an important role in recovering digital information from digital communication channels and storage media.

Preparata had suggested a simple and attractive scheme for dispersal recovery of digital information based on the discrete Fourier transform. Subsequently Gibson et al. have reported an efficient nonlinear ANN structure for reconstructing digital signals which have passed through a dispersive channel and been corrupted with additive noise. An ANN based equalization technique has been proposed to alleviate the ISI present during read back from the magnetic storage channel. Touri et al. have developed a deterministic worst case framework for perfect reconstruction of discrete data transmitted through a dispersive communication channel. In a recent publication the authors have proposed optimal preprocessing strategies for perfect reconstruction of binary signals from dispersive communication channels. Sun et al. [44] have reported an improved Viterbi detector to compensate the nonlinearities and media noise. In the recent past, new adaptive equalizers have been suggested using soft computing tools such as the artificial neural network (ANN), the polynomial perceptron network (PPN) and the functional link artificial neural network (FLANN).

It is reported that these methods are best suited for nonlinear and complex channels. Recently, the Chebyshev artificial neural network has also been proposed for nonlinear channel equalization [45]. The drawback of these methods is that the estimated weights may fall to local minima during training. For this reason the genetic algorithm (GA) [46] and differential evolution (DE) [19] have been suggested for training adaptive channel equalizers. The main attraction of the GA lies in the fact that it does not rely on Newton-like gradient-descent methods, and hence there is no need for the calculation of derivatives; this makes it less likely to be trapped in local minima. Only two parameters of the GA, the crossover and the mutation, help to avoid the local minima problem.

4.3.1 TRANSVERSAL EQUALIZER:
The transversal equalizer uses a time-delay vector, Y(n), of channel output samples to determine the symbol class:

Y(n) = [y(n), y(n - 1), ..., y(n - (m - 1))]        (4.7)

The {m} TE notation used to represent the transversal equalizer specifies m inputs. Considering the inverse of the channel H(z) = 1.0 + 0.5z^-1 that was given in (4.1), this is an infinitely long convergent linear series, 1/H(z) = Σ_{i=0}^{∞} (-1/2)^i z^-i. Each coefficient of this inverse model can be used in a linear equalizer as an FIR tap weight. Each additional tap dimension will improve the accuracy; however, high input dimensions leave the equalizer susceptible to noisy samples, since a noisy sample remains within the filter and affects the output from each equalizer tap. Rather than designing a long linear equalizer, a non-linear filter can be used to provide the desired performance with a shorter input dimension, which reduces the sensitivity to noise. The equalizer filter output is classified through a threshold activation device (Fig. 4.5), so that the equalizer decision belongs to one of the BPSK states u(n) ∈ {-1, +1}.
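The following is a small, self-contained illustration of the {m} TE structure, assuming the truncated inverse of (4.1) as tap weights and a hard threshold decision; the data and tap values are illustrative only:

```python
import numpy as np

def equalize(y, w):
    """{m} transversal equalizer: filter the received samples y with tap
    weights w, then slice to the nearest BPSK state (+1/-1)."""
    m = len(w)
    u_hat = np.empty(len(y))
    for n in range(len(y)):
        Y = np.array([y[n - i] if n - i >= 0 else 0.0 for i in range(m)])  # [y(n),...,y(n-m+1)]
        u_hat[n] = 1.0 if w @ Y >= 0 else -1.0   # threshold activation device
    return u_hat

# Truncated inverse of H(z) = 1 + 0.5 z^-1 from (4.1), used as tap weights
w = np.array([1.0, -0.5, 0.25, -0.125])
received = np.convolve(np.array([1, -1, 1, 1, -1.0]), [1.0, 0.5])[:5]
print(equalize(received, w))   # recovers the transmitted BPSK symbols
```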

Fig. 4.5 Linear transversal equalizer

4.3.2 DECISION FEEDBACK EQUALIZER:
A basic structure of the decision feedback equalizer (DFE) is shown in Fig. 4.6. When the communication channel causes severe ISI distortion, the LTE cannot provide satisfactory performance; instead, a DFE is required. The DFE consists of a transversal feed-forward filter and a feedback filter, and it feeds past corrected samples from a decision device to the feedback filter.

Fig. 4.6 Decision feedback equalizer

Combined with the feed-forward filter, the function of the feedback filter is to subtract the ISI produced by previously detected symbols from the estimates of future samples. Consider that the DFE is updated with a recursive algorithm: the feed-forward filter weights and the feedback filter weights can be jointly adapted by the LMS algorithm on a common error signal ê(n), as shown in (4.8):

W(n+1) = W(n) + µ ê(n) V(n)        (4.8)

where ê(n) = u(n) - y(n) and V(n) = [x(n), x(n - 1), ..., x(n - k1 - 1), u(n), ..., u(n - k2 - 1)]^T; k1 and k2 represent the feed-forward and feedback filter tap lengths respectively. The feed-forward and feedback filter weight vectors are written as a joint vector W(n) = [w0(n), w1(n), ..., w_{k1+k2-1}(n)]^T.

Suppose that the decision device causes an error in estimating the symbol u(n). This error can propagate into subsequent symbols until future input samples compensate for it. This is called error propagation, and it can cause a burst of errors. The detrimental potential of error propagation is the most serious drawback of decision feedback equalization. Traditionally the DFE is described as being a non-linear equalizer, because the decision device is non-linear. However, the DFE structure is still a linear combiner and the adaptation loop is also linear; it has therefore been described as a linear equalizer structure.
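A minimal sketch of this jointly adapted DFE follows; the tap lengths, step size, decision delay and noise level are assumed values chosen only to make the example run:

```python
import numpy as np

def dfe_lms(x, u, k1=7, k2=3, D=3, mu=0.01):
    """Jointly adapt the feed-forward (k1 taps) and feedback (k2 taps) weights
    with the LMS rule of (4.8), W(n+1) = W(n) + mu*e(n)*V(n). Training mode:
    the known symbols u supply the delayed desired response u(n - D)."""
    W = np.zeros(k1 + k2)
    fb = np.zeros(k2)                        # past detected symbols fed back
    detected = np.zeros(len(x))
    for n in range(len(x)):
        ff = np.array([x[n - i] if n - i >= 0 else 0.0 for i in range(k1)])
        V = np.concatenate([ff, fb])         # joint input vector V(n)
        y = W @ V
        u_hat = 1.0 if y >= 0 else -1.0      # decision device
        d = u[n - D] if n - D >= 0 else 0.0
        W += mu * (d - y) * V                # common error signal e(n) = d - y
        fb = np.roll(fb, 1)
        fb[0] = u_hat                        # decision feedback: errors here can propagate
        detected[n] = u_hat
    return W, detected

rng = np.random.default_rng(2)
u = rng.choice([-1.0, 1.0], 4000)
x = np.convolve(u, [0.3410, 0.8760, 0.3410])[:len(u)] + 0.05 * rng.standard_normal(4000)
W, u_hat = dfe_lms(x, u)
# compare the last 1000 decisions against u(n - D), D = 3
print("symbol errors:", int(np.sum(u_hat[-1000:] != u[-1003:-3])))
```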

4.4 EQUALIZATION USING GA:
High speed data transmission over communication channels distorts the transmitted signals in both amplitude and phase due to the presence of inter symbol interference (ISI). Other impairments like thermal noise, impulse noise and cross talk cause further distortions to the received symbols. Adaptive equalization of the digital channels at the receiver removes or reduces the effects of such ISI and attempts to recover the transmitted symbols. Basically an equalizer is a filter which is placed in cascade with the transmitter and receiver, with the aim of having an inverse transfer function to that of the channel, in order to improve the accuracy of reception. The Least-Mean-Square (LMS), Recursive-Least-Square (RLS) and multilayer perceptron (MLP) based equalizers aim to minimize the ISI present in the channels, particularly for nonlinear channels. However, they suffer from long training times and undesirable local minima during training; the drawbacks of these derivative based algorithms have been discussed in Chapter 3. In the present chapter we propose a new adaptive channel equalizer using the Genetic Algorithm (GA), which is essentially a derivative free optimization tool, although, being a population based algorithm, the standard Genetic Algorithm (SGA) suffers from a slower convergence rate. This algorithm has been suitably used to update the weights of the equalizer. The performance of the proposed equalizer has been evaluated and compared with its LMS based counterpart.

4.4.1 FORMULATION OF CHANNEL EQUALIZATION PROCESS AS AN OPTIMIZATION PROBLEM:
A typical diagram of a channel equalizer is shown in Fig. 4.7. An adaptive channel equalizer is basically an adaptive tap-delay digital filter with its order higher than that of the channel filter. At any kth instant the equalizer output is given by

y(k) = Σ_{n=0}^{N-1} x(n+k) h_n(k)

where N is the order of the equalizer filter and h_n(k) are the filter weights. The desired signal d(k) at the kth instant is formed by delaying the input sequence x(n+k) by m samples; in actual practice m is usually taken as (N+1)/2 or N/2 depending upon whether N is odd or even. That is, d(k) = x(n+k-m). The error signal at the kth instant is given by

e(k) = d(k) - y(k)        (4.9)

In LMS type learning algorithms e^2(k), instead of e(k), is taken as the cost function for deriving the steepest descent algorithm, because e^2(k) is always positive and represents the instantaneous power of the difference signal. The objective of an adaptive algorithm is to change the filter weights iteratively so that e^2(k) is minimized and subsequently reduced towards zero. In adaptive equalizers the parameters which can vary during training are the filter weights. At the beginning of training the initial weights h_n(0), n = 0, 1, ..., N-1, are taken as random values within a certain bound. Subsequently these weights are updated by the GA based adaptive rules. In developing the GA based algorithm, a set of chromosomes within a bound is selected, each representing the weight vector of the equalizer.

4.7 A 8-tap Adaptive digital channel equalizer.Random Binary input Z-1 x(7+k) h0(k) х x(6+k) Z-1 h1(k) х х х х х х e(k) x(5+k) ) x(4+k) h2(k) Z-1 h3(k) Z -1 x(3+k) h4(k) Σ y(k) + Σ - d(k) Z-1 x(2+k) h5(k) Z-1 x(1+k) Z-1 h6(k) x(k) h7(k) х Adaptive Algorithm Z-m Fig. [62] .

The GA then starts from the initial random strings and proceeds repeatedly from generation to generation through the three genetic operators. The selection procedure reproduces highly fitted individuals which provide minimum mean square error (MMSE) at the equalizer output. A flow chart of the genetic based adaptive algorithm for channel equalization is shown in Fig. 4.8: initialize the population (a random set of equalizer filter weights); create a new generation through the crossover and mutation operators; evaluate the fitness (MSE of the equalizer) for the whole population; apply selection (sort based on decreasing MSE); and terminate when the MMSE is reached.

Fig. 4.8 Flow chart of the GA based adaptive algorithm for channel equalization

STEPWISE REPRESENTATION OF GA BASED CHANNEL EQUALIZATION ALGORITHM:
The updating of the weights of the GA based equalizer is carried out using the GA rule as outlined in the following steps:
1. Generate K (≥1000) input signal samples which are random binary in nature.
2. As shown in Fig. 4.7, the GA based adaptive equalizer is connected in series with the channel. Each of the input samples is passed through the channel and then contaminated with additive noise of known strength.
3. Each input signal is delayed by m samples and acts as the desired signal; in this way K desired signals are produced by feeding all the K input samples.
4. The resultant channel output signal is passed through the equalizer.
5. The structure of the equalizer is an FIR system whose coefficients are initially chosen from a population of M chromosomes. Each chromosome constitutes NL random binary bits, where each sequential group of L bits represents one coefficient of the adaptive model and N is the number of parameters of the model.
6. Each desired output is compared with the corresponding equalizer output and K errors are produced. The mean square error (MSE) for a given group of parameters (corresponding to the nth chromosome) is determined by using the relation

MSE(n) = (1/K) Σ_{i=1}^{K} e_i^2

This is repeated M times.
7. Since the objective is to minimize MSE(n), n = 1 to M, the GA based optimization is used.
8. The crossover, mutation and selection operators are sequentially carried out.
9. In each generation the minimum MSE, MMSE (expressed in dB), is stored, which shows the learning behavior of the adaptive model from generation to generation.
10. When the MMSE has reached a pre-specified level, the optimization is stopped.
11. At this step all the chromosomes attain almost identical genes, which represent the desired filter coefficients of the equalizer.

The data generation and fitness evaluation used in these steps are sketched below.
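The following sketch illustrates steps 1-6 (the delayed desired signal and the MSE fitness of candidate weight vectors); the channel, noise level and population size are assumptions used only for illustration, and the GA loop of Chapter 3 then applies unchanged:

```python
import numpy as np

rng = np.random.default_rng(3)

# steps 1-3: binary input through the channel, plus the delayed desired signal
K, N = 1000, 8
m = N // 2                                    # decision delay (N even here)
s = rng.choice([-1.0, 1.0], K)                # random binary input
x = np.convolve(s, [0.2090, 0.9950, 0.2090])[:K]    # channel CH1 (assumed)
x += np.sqrt(1e-3) * rng.standard_normal(K)   # -30 dB additive noise (assumed)
d = np.concatenate([np.zeros(m), s[:-m]])     # desired signal: input delayed by m

# equalizer regressor matrix: row k holds [x(k), x(k-1), ..., x(k-N+1)]
X = np.array([[x[k - i] if k - i >= 0 else 0.0 for i in range(N)] for k in range(K)])

def fitness(h):
    """Step 6: MSE of one decoded chromosome h (the equalizer weight vector)."""
    e = d - X @ h
    return np.mean(e ** 2)

pop = rng.uniform(-1.0, 1.0, (30, N))         # step 5: random initial population
print("initial MMSE: %.2f dB" % (10 * np.log10(min(fitness(h) for h in pop))))
# from here, the selection/crossover/mutation loop of steps 7-11 applies
```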

4.5 SIMULATIONS:
In this section we carry out the simulation study of the new channel equalizer. The block diagram of Fig. 4.7 is simulated, where the equalizer coefficients are adapted based on the LMS and GA. Four different channels (2 linear and 2 nonlinear) with additive channel noise of -30 dB and -20 dB are used for the simulation. The algorithm proposed in section 4.4 is used in the simulation for GA. In the simulation study N = 8 has been taken. The desired signal is generated by delaying the input binary sequence by m samples, where m = N/2 or (N+1)/2 depending upon whether N is even or odd. The following channel models are used:
a. Linear channel coefficients
(i) CH1: [0.2090, 0.9950, 0.2090]
(ii) CH2: [0.3040, 0.9030, 0.3040]
b. Nonlinear channels
(i) NCH1: b(k) = a(k) + 0.2a^2(k) - 0.1a^3(k)
(ii) NCH2: b(k) = a(k) + 0.2a^2(k) - 0.1a^3(k) + 0.5cos(πa(k))
where a(k) is the output of the linear channel and b(k) is the output of the nonlinear channel. The convergence characteristics and bit error rate (BER) plots obtained from simulation for the different channels under different noise conditions using LMS and GA are shown in the following figures.

Fig. 4.9 Plot of convergence characteristics of linear channel CH1 at -30 dB using LMS
Fig. 4.10 Plot of convergence characteristics of linear channel CH2 at -30 dB using LMS

Fig. 4.11 Plot of convergence characteristics of linear channel CH1 using GA
Fig. 4.12 Plot of convergence characteristics of linear channel CH2 using GA

Fig. 4.13 Comparison of BER of linear channel CH1 between LMS and GA based equalizers at -30 dB noise
Fig. 4.14 Comparison of BER of linear channel CH2 between LMS and GA based equalizers at -30 dB noise

Fig. 4.15 Plot of convergence characteristics of nonlinear channel NCH1 for CH1 using LMS
Fig. 4.16 Plot of convergence characteristics of nonlinear channel NCH1 for CH2 using LMS

Fig. 4.17 Plot of convergence characteristics of nonlinear channel NCH1 for CH1 using GA
Fig. 4.18 Plot of convergence characteristics of nonlinear channel NCH1 for CH2 using GA

Fig. 4.19 Comparison of BER of nonlinear channel NCH1 for CH1 between LMS and GA based equalizers at -30 dB noise
Fig. 4.20 Comparison of BER of nonlinear channel NCH1 for CH2 between LMS and GA based equalizers at -30 dB noise

Fig. 4.21 Plot of convergence characteristics of nonlinear channel NCH2 for CH1 using LMS
Fig. 4.22 Plot of convergence characteristics of nonlinear channel NCH2 for CH2 using LMS

Fig. 4.23 Plot of convergence characteristics of nonlinear channel NCH2 for CH1 using GA
Fig. 4.24 Plot of convergence characteristics of nonlinear channel NCH2 for CH2 using GA

Fig. 4.25 Comparison of BER of nonlinear channel NCH2 for CH1 between LMS and GA based equalizers at -30 dB noise
Fig. 4.26 Comparison of BER of nonlinear channel NCH2 for CH2 between LMS and GA based equalizers at -30 dB noise

4.6 RESULTS AND DISCUSSIONS:
The convergence characteristics obtained from simulation are shown in Figs. 4.9 and 4.11 using LMS and GA respectively for linear channel a(i), and in Figs. 4.10 and 4.12 using LMS and GA respectively for linear channel a(ii). Similarly, the BER plot for channel a(i) is shown in Fig. 4.13 and for channel a(ii) in Fig. 4.14. The convergence characteristics for channels a(i & ii) with nonlinearity b(i) are shown in Figs. 4.15 and 4.16 using LMS, and in Figs. 4.17 and 4.18 using GA; the corresponding BER plots are shown in Figs. 4.19 and 4.20. The convergence characteristics for channels a(i & ii) with nonlinearity b(ii) are shown in Figs. 4.21 and 4.22 using LMS, and in Figs. 4.23 and 4.24 using GA; the corresponding BER plots are shown in Figs. 4.25 and 4.26. It is observed from the convergence characteristics and BER plots that the GA based equalizers outperform the corresponding LMS counterparts. This is true for both linear and nonlinear channels. Under high noise conditions the results of the GA based equalizers are distinctly better.

CHAPTER-5
ADAPTIVE SYSTEM IDENTIFICATION USING DIFFERENTIAL EVOLUTION

5.1 INTRODUCTION:
The identification of linear and nonlinear systems was performed in Chapter 3 using the LMS and GA techniques. From that chapter we conclude that linear systems are identified well using the LMS technique and nonlinear systems are identified well using the GA technique. But the GA has a longer convergence time, more computational complexity and requires binary bits. To improve the identification performance for nonlinear systems, various techniques such as ANN, FLANN and RBF are used. In this chapter we propose a novel model based on the DE technique for identification.

5.2 DE BASED OPTIMIZATION:
DE is an efficient and powerful population based stochastic search technique for solving optimization problems over continuous space, which has been widely applied in many scientific and engineering fields. The DE is based on the mechanics of natural selection and the evolutionary behavior of biological systems. It has been successfully applied to diverse fields such as mechanical engineering, communication and pattern recognition. However, the success of DE in solving a specific problem crucially depends on appropriately choosing a trial vector generation strategy and the associated control parameter values. In DE there exist many trial vector generation strategies, out of which only a few may be suitable for solving a particular problem. The three crucial control parameters involved in DE, the population size (NP), the scaling factor (F) and the crossover rate (CR), may significantly influence the optimization performance. Fig. 5.1 shows the basic operation of DE.

Fig. 5.1 Block diagram of the Differential Evolution algorithm (population initialisation, then a repeated cycle of mutation, crossover, fitness evaluation and selection)

5.2.1 OPERATORS OF DE:
(a) Population: Parameter vectors in a population are randomly initialized and evaluated using the fitness function. The initial NP D-dimensional parameter vectors, the so-called individuals, are

Xi,G = {xi1,G, xi2,G, ..., xiD,G},   i = 1, 2, ..., NP

where NP is the number of population vectors, D is the dimension and G is the generation.
(b) Mutation: The mutation operation produces a mutant or noisy vector Vi,G with respect to each individual Xi,G, called the target vector, in the current population.

For each target vector Xi,G, its associated mutant vector Vi,G = {vi1,G, vi2,G, ..., viD,G} can be generated via a mutation strategy such as

Vi,G = Xr1,G + F·(Xr2,G - Xr3,G)        (5.1)

where r1, r2 and r3 are mutually exclusive integers randomly generated within the range [1, NP], which are also different from the index i, and F is a positive control parameter for scaling the difference vector, called the scaling factor.
(c) Crossover: After mutation, a crossover operation is applied to each pair of the target vector Xi,G and its corresponding mutant vector Vi,G to generate a trial vector

Ui,G = {ui1,G, ui2,G, ..., uiD,G}        (5.2)

In the basic version, DE employs the binomial (uniform) crossover defined as follows:

uij,G = vij,G if (randj[0,1] ≤ CR) or (j = jrand); otherwise uij,G = xij,G,   j = 1, 2, ..., D        (5.3)

The crossover rate CR is a user-specified constant within the range [0, 1) which controls the fraction of parameter values copied from the mutant vector, and jrand is a randomly chosen integer in the range [1, D].
(d) Selection: In the selection operation, the objective function value of each trial vector f(Ui,G) is compared to that of its corresponding target vector f(Xi,G) in the current population. If the trial vector has a less than or equal objective function value to that of the corresponding target vector, the trial vector will replace the target vector and enter the population of the next generation; otherwise the target vector will remain in the population for the next generation.

The selection operation can be expressed as follows:

Xi,G+1 = Ui,G if f(Ui,G) ≤ f(Xi,G); otherwise Xi,G+1 = Xi,G        (5.4)

The steps (b)-(d) are repeated generation after generation until some specific termination criterion is satisfied.

5.3 STEPWISE REPRESENTATION OF DE BASED ADAPTIVE SYSTEM IDENTIFICATION ALGORITHM:
i. Generate K (=500) input signal samples, each of which has zero mean, is uniformly distributed between -2√3 and +2√3, and has unity variance.
ii. As shown in Fig. 3.2, the unknown dynamic system to be identified is connected in parallel with the adaptive model to be developed using DE. Each of the input samples is passed through the plant P(z) and contaminated with additive noise of known strength. The resultant signal acts as the desired signal; in this way K desired signals are produced by feeding all the K input samples.
iii. The coefficients (â) of the model are initially chosen from a population of NP target vectors. Each target vector constitutes D random numbers, where each random number represents one coefficient of the adaptive model and D is the number of parameters of the model.
iv. Each of the input samples is also passed through the model using each target vector as the model parameters, and NP sets of K estimated outputs are obtained.
v. Each of the desired outputs is compared with the corresponding estimated output and K errors are produced.

vi. The mean square error (MSE) for a set of parameters (corresponding to the mth target vector) is determined by using the relation

MSE(m) = (1/K) Σ_{i=1}^{K} e_i^2        (5.5)

This is repeated NP times.
vii. Since the objective is to minimize MSE(m), m = 1 to NP, the DE based optimization is used.
viii. The mutation, crossover and selection operations are sequentially carried out following the steps given in section 5.2.
ix. In each generation the minimum MSE, MMSE, is obtained and plotted against generation to show the learning characteristics.
x. The learning process is stopped when the MMSE reaches its minimum level. At this step all the individuals attain almost identical parameters, which represent the estimated parameters of the developed model.

5.4 SIMULATION STUDIES:
To demonstrate the performance of the proposed DE based approach, numerous simulation studies are carried out on several linear and nonlinear systems. The performance of the proposed structure is compared with the corresponding LMS and GA based structures. The block diagram shown in Fig. 3.2 is used for the simulation study; a compact sketch of the DE optimization loop is given below.
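The following sketch implements the DE loop above (DE/rand/1 mutation with binomial crossover); the control parameter values NP, F and CR, the noise level and the search bounds are assumed for illustration and are not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(4)

# Training data: Experiment-1 plant with -30 dB additive noise (assumed)
plant = np.array([0.2090, 0.9950, 0.2090])
D, K = len(plant), 500
x = rng.uniform(-np.sqrt(3), np.sqrt(3), K)
X = np.array([[x[k - i] if k - i >= 0 else 0.0 for i in range(D)] for k in range(K)])
d = X @ plant + np.sqrt(1e-3) * rng.standard_normal(K)

def f(w):                                   # objective: the MSE of (5.5)
    return np.mean((d - X @ w) ** 2)

NP, F, CR, GEN = 20, 0.8, 0.9, 100          # assumed control parameter values
pop = rng.uniform(-2.0, 2.0, (NP, D))       # step iii: random initial target vectors
cost = np.array([f(w) for w in pop])

for g in range(GEN):
    for i in range(NP):
        r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])           # mutation, (5.1)
        jrand = rng.integers(D)
        cross = (rng.random(D) <= CR) | (np.arange(D) == jrand)
        trial = np.where(cross, v, pop[i])              # binomial crossover, (5.3)
        ft = f(trial)
        if ft <= cost[i]:                               # selection, (5.4)
            pop[i], cost[i] = trial, ft

best = np.argmin(cost)
print("estimated parameters:", np.round(pop[best], 4))
print("MMSE (dB): %.2f" % (10 * np.log10(cost[best])))
```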

Case-1 (Linear System)
A unit variance uniformly distributed random signal lying in the range -2√3 to +2√3 is applied to a known system having the transfer function

Experiment-1: H(z) = 0.2090 + 0.9950z^-1 + 0.2090z^-2  and
Experiment-2: H(z) = 0.2600 + 0.9300z^-1 + 0.2600z^-2

The output of the system is contaminated with white Gaussian noise of different strengths of -20 dB and -30 dB. The resultant signal y is used as the desired or training signal. The same random input is also applied to the DE based adaptive model having the same linear combiner structure as that of H(z), but with random initial weights. By adjusting the scaling factor (F) and the crossover rate (CR), it has been seen that for the linear system the actual and estimated parameters are the same.

Case-2 (Non-Linear System)
In this simulation the actual system is assumed to be nonlinear in nature. Computer simulation results of two different nonlinear systems are presented; the actual systems are

Experiment-3: yn(k) = tanh{y(k)}
Experiment-4: yn(k) = y(k) + 0.2y^2(k) - 0.1y^3(k)

where y(k) is the output of the linear system and yn(k) is the output of the nonlinear system. In the case of a nonlinear system the parameters of the two systems do not match; however, the responses of the actual system and the adaptive model do match. To demonstrate this observation, training is carried out using the DE based algorithm.

Fig. 5.2 Learning characteristics of DE based nonlinear system identification at -20 dB and -30 dB NSR (Experiment-3)

Fig. 5.3 Learning characteristics of DE based nonlinear system identification at -20 dB and -30 dB NSR (Experiment-3)
Fig. 5.4 Comparison of output response of Experiment-3 at -30 dB NSR (actual, DE and LMS)

Fig. 5.5 Comparison of output response of Experiment-3 at -30 dB NSR (actual, DE and LMS)

5.5 RESULTS AND DISCUSSIONS:
Figs. 5.2 and 5.3 show the learning characteristics of the nonlinear Experiment-3 following Experiment-1 and Experiment-2 respectively. Figs. 5.4 and 5.5 show the corresponding output responses of the nonlinear Experiment-3 following Experiment-1 and Experiment-2 respectively. The output response of the nonlinear system (Experiment-3) obtained with the DE based model is better than those of the LMS based and GA based models, the response of DE being closer to the desired response, as can be seen by comparing these plots with the corresponding GA and LMS plots of Chapter 3.

CHAPTER-6
ADAPTIVE CHANNEL EQUALIZATION USING DIFFERENTIAL EVOLUTION

6.1 INTRODUCTION:
The equalization of linear and nonlinear channels was performed in Chapter 4 using the LMS and GA techniques. From that chapter we conclude that linear channels are equalized well using the LMS technique and nonlinear channels are equalized well using the GA technique. But the GA has a longer convergence time, more computational complexity and requires binary bits. To improve the equalization performance for nonlinear systems, various techniques such as ANN, FLANN and RBF are used.

In this chapter we propose a novel model based on the DE technique for equalization. DE is an efficient and powerful population based stochastic search technique for solving optimization problems over continuous space, which has been widely applied in many scientific and engineering fields. However, the success of DE in solving a specific problem crucially depends on appropriately choosing a trial vector generation strategy and the associated control parameter values.

6.2 STEPWISE PRESENTATION OF DE BASED CHANNEL EQUALIZATION ALGORITHM:
The updating of the weights of the DE based equalizer is carried out as outlined in the following steps:
i. Generate K (≥1000) input signal samples which are random binary in nature.
ii. As shown in Fig. 4.7, the DE based adaptive equalizer is connected in series with the channel. Each of the input samples is passed through the channel and then contaminated with additive noise of known strength.
iii. Each input signal is delayed and acts as the desired signal; in this way K desired signals are produced by feeding all the K input samples.
iv. The resultant channel output signal is passed through the equalizer.
v. The structure of the equalizer is an FIR system whose coefficients are initially chosen from a population of NP target vectors. Each target vector constitutes D random numbers, where each random number represents one coefficient of the adaptive model and D is the number of parameters of the model.
vi. Each desired output is compared with the corresponding equalizer output and K errors are produced. The mean square error (MSE) for a given set of parameters (corresponding to the nth target vector) is determined by using the relation

MSE(n) = (1/K) Σ_{i=1}^{K} e_i^2

This is repeated NP times.
vii. Since the objective is to minimize MSE(n), n = 1 to NP, the DE based optimization is used.
viii. The mutation, crossover and selection operators are sequentially carried out following the steps given in section 5.2.
ix. In each generation the minimum MSE, MMSE (expressed in dB), is stored, which shows the learning behavior of the adaptive model from generation to generation.
x. When the MMSE has reached a pre-specified level, the optimization is stopped.
xi. At this step all the target vectors attain almost identical parameters, which represent the desired filter coefficients of the equalizer.

6.3 SIMULATIONS:
In this section we carry out the simulation study of the new channel equalizer. The block diagram of Fig. 4.7 is simulated, where the equalizer coefficients are adapted based on the LMS, GA and DE. Four different channels (2 linear and 2 nonlinear) with additive channel noise of -30 dB and -20 dB are used for the simulation. The algorithm proposed in section 6.2 is used in the simulation for DE. The following channel models are used:
a. Linear channel coefficients
(i) CH1: [0.2090, 0.9950, 0.2090]
(ii) CH2: [0.3040, 0.9030, 0.3040]
b. Nonlinear channels
(i) NCH1: b(k) = a(k) + 0.2a^2(k) - 0.1a^3(k)
(ii) NCH2: b(k) = a(k) + 0.2a^2(k) - 0.1a^3(k) + 0.5cos(πa(k))
where a(k) is the output of the linear channel and b(k) is the output of the nonlinear channel.

The desired signal is generated by delaying the input binary sequence by m samples, where m = (N+1)/2 or N/2 depending upon whether N is odd or even. In the simulation study N = 8 has been taken. The convergence characteristics and bit error rate (BER) plots obtained from simulation for the different channels under different noise conditions using LMS, GA and DE are shown in the following figures.

Fig. 6.1 Plot of convergence characteristics of linear channel CH2 at -30 dB NSR

Fig. 6.2 Plot of convergence characteristics of nonlinear channel NCH1 using CH1 at -30 dB NSR
Fig. 6.3 Plot of convergence characteristics of nonlinear channel NCH1 using CH2 at -30 dB and -20 dB NSR

Fig. 6.4 Comparison of BER plots of linear channel CH2 between LMS, GA and DE at -30 dB NSR
Fig. 6.5 Comparison of BER plots of nonlinear channel NCH2 using CH2 between LMS, GA and DE at -30 dB NSR

Fig. 6.6 Comparison of BER plots (probability of error versus SNR in dB) of nonlinear channel NCH1 using CH2 between LMS, GA and DE at -30 dB NSR. CH: [0.3040, 0.9030, 0.3040], NL: y = y + 0.2y^2 - 0.1y^3.

Fig. 6.7 Comparison of BER plots (probability of error versus SNR in dB) of nonlinear channel NCH1 using CH2 between LMS, GA and DE at -20 dB NSR. CH: [0.3040, 0.9030, 0.3040], NL: y = y + 0.2y^2 - 0.1y^3.

6.4 RESULTS AND DISCUSSIONS:
Fig. 6.1 shows the convergence characteristics of linear channel CH2 at -30 dB NSR. Figs. 6.2 and 6.3 show the convergence characteristics of nonlinear channel NCH1 using CH1 at -30 dB NSR and using CH2 at -30 dB and -20 dB NSR respectively. Using the same channels and the same noise conditions, the corresponding results are obtained for the LMS and GA based equalizers; these are used for comparison. Figs. 6.4 and 6.5 show the comparison of BER plots between the LMS, GA and DE for linear channel CH2 and for nonlinear channel NCH2 using CH2 at -30 dB NSR respectively. Figs. 6.6 and 6.7 show the comparison of BER plots between the LMS, GA and DE for nonlinear channel NCH1 using CH2 at -30 dB and -20 dB NSR respectively. It is observed from the plots of Figs. 6.4 to 6.7 that the DE based equalizers outperform the corresponding LMS and GA counterparts. This is true for both linear and nonlinear channels.

CHAPTER-7

CONCLUSIONS, REFERENCES AND SCOPE FOR FUTURE WORK:
7.1 CONCLUSIONS
Chapter-3 presents a novel GA based adaptive model identification of dynamic nonlinear systems. The problem of nonlinear system identification has been formulated as an MSE minimization problem. The GA is then successfully used in an iterative manner to optimize the coefficients of linear and nonlinear adaptive models. It is demonstrated through simulations that the proposed approach exhibits superior performance compared to its LMS counterpart in identifying both linear and nonlinear systems under various additive Gaussian noise conditions. Thus the GA is a useful alternative to the LMS algorithm for nonlinear system identification.
Chapter-4 proposes a novel adaptive digital channel equalizer using GA based optimization. Through computer simulation it is shown that the GA based equalizer yields superior performance compared to its LMS counterpart. This observation is true for both linear and nonlinear channels.
Chapter-5 presents a novel DE based adaptive model identification of linear and dynamic nonlinear systems. The problem of nonlinear system identification has been formulated as an MSE minimization problem. The DE is then successfully used in an iterative manner to optimize the coefficients of linear and nonlinear adaptive models. It is demonstrated through simulations that the proposed approach exhibits superior performance compared to its LMS and GA counterparts in identifying both linear and nonlinear systems under various additive Gaussian noise conditions. Thus the DE is a useful alternative to the LMS or GA algorithm for nonlinear system identification.
Chapter-6 proposes a novel adaptive digital channel equalizer using DE based optimization. Through simulation it is shown that the DE based equalizers yield superior performance compared to their LMS and GA counterparts. This observation is true for both linear and nonlinear channels.

7.2 REFERENCES:
[1] K. S. Narendra and K. Parthasarathy, “Identification and control of dynamical systems using neural networks”, IEEE Trans. on Neural Networks, vol. 1, no. 1, pp. 4-26, January 1990.
[2] J. C. Patra, A. C. Kot and G. Panda, “An intelligent pressure sensor using neural networks”, IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
[3] M. Pachter and O. R. Reynolds, “Identification of a discrete time dynamical system”, IEEE Trans. on Aerospace and Electronic Systems, vol. 36, issue 1, pp. 212-225, 2000.
[4] G. B. Giannakis and E. Serpedin, “A bibliography on nonlinear system identification”, Signal Processing, vol. 81, issue 3, pp. 533-580, 2001.
[5] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[6] D. P. Das and G. Panda, “Active mitigation of nonlinear noise processes using a novel filtered-s LMS algorithm”, IEEE Trans. on Speech and Audio Processing, vol. 12, issue 3, pp. 313-322, May 2004.
[7] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, New Jersey, 1985.
[8] G. J. Gibson, S. Siu and C. F. N. Cowan, “The application of nonlinear structures to the reconstruction of binary signals”, IEEE Trans. on Signal Processing, vol. 39, no. 8, pp. 1877-1884, Aug. 1991.

[9] R. W. Lucky, “Techniques for adaptive equalization of digital communication systems”, Bell Sys. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[10] H. Sun, G. Mathew and B. Farhang-Boroujeny, “Detection techniques for high density magnetic recording”, IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199, March 2005.
[11] L. J. Griffiths, F. R. Smolka and L. D. Trenbly, “Adaptive deconvolution: a new technique for processing time varying seismic data”, Geophysics, June 1977.
[12] B. Widrow, J. M. McCool, M. G. Larimore and C. R. Johnson, Jr., “Stationary and nonstationary learning characteristics of the LMS adaptive filter”, Proc. IEEE, vol. 64, no. 8, pp. 1151-1162, Aug. 1976.
[13] B. Friedlander and M. Morf, “Least-squares algorithms for adaptive linear phase filtering”, IEEE Trans., vol. ASSP-30, no. 3, pp. 381-390, June 1982.
[14] S. A. White, “An adaptive recursive digital filter”, Proc. 9th Asilomar Conf. Circuits Syst. Comput., p. 21, Nov. 1975.
[15] J. J. Shynk, “Adaptive IIR filtering”, IEEE ASSP Magazine, pp. 4-21, April 1989.
[16] A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing, Springer, 2003, ISBN 3-540-40184-9.
[17] A. Engelbrecht, Computational Intelligence: An Introduction, Wiley & Sons, ISBN 0-470-84870-7.
[18] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
[19] A. K. Qin, V. L. Huang and P. N. Suganthan, “Differential evolution algorithm with strategy adaptation for global numerical optimization”, IEEE Trans. on Evolutionary Computation, vol. 13, no. 2, April 2009.
[20] A. Konar, Computational Intelligence: Principles, Techniques and Applications, Springer, Berlin Heidelberg New York, 2005.
[21] J. H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, 1975.
[22] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[23] J. Kennedy, R. Eberhart and Y. Shi, Swarm Intelligence, Morgan Kaufmann, Los Altos, CA, 2001.

[24] J. Kennedy and R. Eberhart, “Particle swarm optimization”, Proc. of IEEE International Conference on Neural Networks, pp. 1942-1948, 1995.
[25] R. Storn and K. Price, “Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces”, Journal of Global Optimization, vol. 11, no. 4, pp. 341-359, 1997.
[26] G. Venter and J. Sobieszczanski-Sobieski, “Particle swarm optimization”, AIAA Journal, vol. 41, no. 8, pp. 1583-1589, 2003.
[27] X. Yao, Y. Liu and G. Lin, “Evolutionary programming made faster”, IEEE Transactions on Evolutionary Computation, vol. 3, no. 2, pp. 82-102, 1999.
[28] Y. Shi and R. C. Eberhart, “Parameter selection in particle swarm optimization”, Evolutionary Programming VII, Springer, Lecture Notes in Computer Science 1447, pp. 591-600, 1998.
[29] S. Das, A. Konar and U. K. Chakraborty, “Particle swarm optimization with a differentially perturbed velocity”, ACM-SIGEVO Proceedings of GECCO'05, Washington D.C., pp. 991-998, 2005.
[30] F. van den Bergh, “Particle swarm weight initialization in multi-layer perceptron artificial neural networks”, Development and Practice of Artificial Intelligence Techniques, Durban, South Africa, pp. 41-45, 1999.
[31] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Chapter 6, pp. 99-166, Second Edition, Pearson.
[32] S. Chen, S. A. Billings and P. M. Grant, “Nonlinear system identification using neural networks”, Int. J. Control, vol. 51, no. 6, pp. 1191-1214, June 1990.
[33] J. C. Patra, R. N. Pal, B. N. Chatterji and G. Panda, “Identification of nonlinear dynamic systems using functional link artificial neural network”, IEEE Trans. on Systems, Man and Cybernetics – Part B: Cybernetics, vol. 29, no. 2, pp. 254-262, April 1999.
[34] S. V. T. Elanayar and Y. C. Shin, “Radial basis function neural network for approximation and estimation of nonlinear stochastic dynamic systems”, IEEE Trans. on Neural Networks, vol. 5, pp. 594-603, July 1994.
[35] C. A. Belfiore and J. H. Park, Jr., “Decision feedback equalization”, Proc. IEEE, vol. 67, pp. 1143-1156, Aug. 1979.
[36] S. Siu, “Non-linear adaptive equalization based on multi-layer perceptron architecture”, Ph.D. dissertation, University of Edinburgh, 1990.


[37] O. Macchi, Adaptive Processing: The Least Mean Squares Approach with Applications in Transmission, John Wiley and Sons, West Sussex, England, 1995.
[38] R. W. Lucky, “Techniques for adaptive equalization of digital communication systems”, Bell Sys. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[39] S. K. Nair and J. Moon, “A theoretical study of linear and nonlinear equalization in nonlinear magnetic storage channels”, IEEE Trans. on Neural Networks, vol. 8, no. 5, pp. 1106-1118, Sept. 1997.
[40] J. C. Patra, A. C. Kot and G. Panda, “An intelligent pressure sensor using neural networks”, IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
[41] B. Widrow and E. Walach, Adaptive Inverse Control, Prentice-Hall, Upper Saddle River, NJ, 1996.
[42] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1986.

[43] S. U. H. Qureshi, “Adaptive equalization”, Proc. IEEE, vol. 73, no. 9, pp. 1349-1387, Sept. 1985.
[44] H. Sun, G. Mathew and B. Farhang-Boroujeny, “Detection techniques for high density magnetic recording”, IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199, March 2005.
[45] J. C. Patra, W. B. Poh, N. S. Chaudhari and A. Das, “Nonlinear channel equalization with QAM signal using Chebyshev artificial neural network”, Proc. of International Joint Conference on Neural Networks, Montreal, Canada, pp. 3214-3219, August 2005.
[46] G. Panda, B. Majhi, D. Mohanty, A. Choubey and S. Mishra, “Development of novel digital channel equalizers using genetic algorithms”, Proc. of National Conference on Communication (NCC-2006), IIT Delhi, pp. 117-121, 27-29 January 2006.

7.3 FUTURE WORK:
The present work can be extended to adaptive system identification and channel equalization using IIR model structures, with a comparison of GA and DE based training of such systems.
