CH 052
A signal is a phenomenon that conveys information. Biomedical signals are signals used in biomedical fields, mainly for extracting information about a biologic system under investigation. The complete process of information extraction may be as simple as a physician estimating a patient's mean heart rate by feeling the blood pressure pulse with the fingertips, or as complex as analyzing the structure of internal soft tissues by means of a CT machine.
Most often in biomedical applications (as in many other applications), the acquisition of the signal is not sufficient: the acquired signal must be processed to extract the relevant information "buried" in it. The signal may be noisy and thus must be "cleaned" (or, in more professional terminology, enhanced), or the relevant information may not be "visible" in the signal. In the latter case, we usually apply some transformation to enhance the required information.
Bioelectric signals:

  Signal                        Acquisition         Frequency range  Dynamic range  Comments
  Action potential              Microelectrodes     100 Hz–2 kHz     10 µV–100 mV   Invasive measurement of cell membrane potential
  Electroneurogram (ENG)        Needle electrode    100 Hz–1 kHz     5 µV–10 mV     Potential of a nerve bundle
  Electroretinogram (ERG)       Microelectrode      0.2–200 Hz       0.5 µV–1 mV    Evoked flash potential
  Electro-oculogram (EOG)       Surface electrodes  dc–100 Hz        10 µV–5 mV     Steady corneal-retinal potential
  Electroencephalogram (EEG)
    Surface                     Surface electrodes  0.5–100 Hz       2–100 µV       Multichannel (6–32) scalp potential
    Delta range                                     0.5–4 Hz                        Young children, deep sleep, and pathologies
    Theta range                                     4–8 Hz                          Temporal and central areas during alert states
    Alpha range                                     8–13 Hz                         Awake, relaxed, closed eyes
    Beta range                                      13–22 Hz
    Sleep spindles                                  6–15 Hz          50–100 µV      Bursts of about 0.2 to 0.6 s
    K-complexes                                     12–14 Hz         100–200 µV     Bursts during moderate and deep sleep
  Evoked potentials (EP)        Surface electrodes                   0.1–20 µV      Response of brain potential to stimulus
    Visual (VEP)                                    1–300 Hz         1–20 µV        Occipital lobe recordings, 200-ms duration
    Somatosensory (SEP)                             2 Hz–3 kHz                      Sensory cortex
    Auditory (AEP)                                  100 Hz–3 kHz     0.5–10 µV      Vertex recordings
  Electrocorticogram            Needle electrodes   100 Hz–5 kHz                    Recordings from exposed surface of brain
  Electromyography (EMG)
    Single-fiber (SFEMG)        Needle electrode    500 Hz–10 kHz    1–10 µV        Action potentials from single muscle fiber
    Motor unit action potential (MUAP)  Needle electrode  5 Hz–10 kHz  100 µV–2 mV
    Surface EMG (SEMG)          Surface electrodes
      Skeletal muscle                               2–500 Hz         50 µV–5 mV
      Smooth muscle                                 0.01–1 Hz
  Electrocardiogram (ECG)       Surface electrodes  0.05–100 Hz      1–10 mV
  High-frequency ECG            Surface electrodes  100 Hz–1 kHz     100 µV–2 mV    Notches and slurs superimposed on the ECG
s(m) = s(t)|_{t = mTs},   m = …, −1, 0, 1, …    (52.1)
where Ts is the sampling interval and ωs = 2π/Ts is the angular sampling frequency. A further characteristic classification, which applies to continuous as well as discrete signals, is described in Fig. 52.1.
We divide signals into two main groups: deterministic and stochastic signals. Deterministic signals are
signals that can be exactly described mathematically or graphically. If a signal is deterministic and its
mathematical description is given, it conveys no information. Real-world signals are never deterministic.
There is always some unknown and unpredictable noise added, some unpredictable change in the
parameters, and the underlying characteristics of the signal that render it nondeterministic. It is, however,
very often convenient to approximate or model the signal by means of a deterministic function.
An important family of deterministic signals is the periodic family. A periodic signal is a deterministic
signal that may be expressed by
s(t) = s(t + nT)    (52.2)
where n is an integer, and T is the period. The periodic signal consists of a basic wave shape with a
duration of T seconds. The basic wave shape repeats itself an infinite number of times on the time axis.
The simplest periodic signal is the sinusoidal signal. Complex periodic signals have more elaborate wave
shapes. Under some conditions, the blood pressure signal may be modeled by a complex periodic signal,
with the heart rate as its period and the blood pressure wave shape as its basic wave shape. This is, of
course, a very rough and inaccurate model.
Most deterministic functions are nonperiodic. It is sometimes worthwhile to consider an “almost
periodic” type of signal. The ECG signal can sometimes be considered “almost periodic.” The ECG’s RR
interval is never constant; in addition, the PQRST complex of one heartbeat is never exactly the same as
that of another beat. The signal is definitely nonperiodic. Under certain conditions, however, the RR
interval is almost constant, and one PQRST is almost the same as the other. The ECG may thus sometimes
be modeled as “almost periodic.”
distribution probabilities. Figure 52.2 depicts three sample functions of an ensemble. Note that at any
given time, the values of the sample functions are different.
Stochastic signals cannot be expressed exactly; they can be described only in terms of probabilities
which may be calculated over the ensemble.
Assuming a signal s(t), the Nth-order joint probability function
P[s(t1) ≤ s1, s(t2) ≤ s2, …, s(tN) ≤ sN] = P(s1, s2, …, sN)    (52.3)
is the joint probability that the signal at time ti will be less than or equal to si and at time tj will be less than or equal to sj, etc. This joint probability describes the statistical behavior and intradependence of the process.
It is very often useful to work with the derivative of the joint probability function; this derivative is
known as the joint probability density function (PDF):
p(s1, s2, …, sN) = ∂^N/(∂s1 ∂s2 ⋯ ∂sN) [P(s1, s2, …, sN)]    (52.4)
The expectation of the function s^n(t) is known as the nth-order moment. The first-order moment is thus the expectation of the process. The nth-order moment is given by

E{s^n(t)} = ∫_{−∞}^{∞} s^n p(s) ds    (52.6)

The nth-order central moment, taken about the mean ms, is

µn = E{(s − ms)^n} = ∫_{−∞}^{∞} (s − ms)^n p(s) ds    (52.7)
The second central moment is known as the variance (the square root of which is the standard deviation). The variance is denoted by σ²:

σ² = µ2 = E{(s − ms)²} = ∫_{−∞}^{∞} (s − ms)² p(s) ds    (52.8)
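As a numerical sketch (assuming ergodicity, so that time averages may stand in for the ensemble averages above; the generating parameters are chosen purely for illustration), the first moment and the variance can be estimated directly from samples of the process:

```python
import numpy as np

# Illustrative process: Gaussian samples with mean 2.0 and standard deviation 0.5.
rng = np.random.default_rng(0)
s = 2.0 + 0.5 * rng.standard_normal(100_000)

m_s = s.mean()                  # first-order moment (the expectation)
mu2 = np.mean((s - m_s) ** 2)   # second central moment: the variance, Eq. (52.8)
sigma = np.sqrt(mu2)            # standard deviation
```

With enough samples, m_s and sigma approach the generating values 2.0 and 0.5.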
The second-order joint moment is defined by the joint PDF. Of particular interest is the autocorrelation function rss:

rss(t1, t2) = E{s(t1) s(t2)}    (52.9)

The cross-correlation function is defined as the second joint moment of the signal s at time t1, s(t1), and the signal y at time t2, y(t2):

rsy(t1, t2) = E{s(t1) y(t2)}    (52.10)
Stationary stochastic processes are processes whose statistics do not change in time. The expectation
and the variance (as with any other statistical mean) of a stationary process will be time-independent. The
autocorrelation function, for example, of a stationary process will thus be a function of the time difference
τ = t2 – t1 (one-dimensional function) rather than a function of t2 and t1 (two-dimensional function).
Ergodic stationary processes possess an important characteristic: Their statistical probability distribu-
tions (along the ensemble) equal those of their time distributions (along the time axis of any one of its
sample functions). For example, the correlation function of an ergodic process may be calculated by its
definition (along the ensemble) or along the time axis of any one of its sample functions:
rss(τ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} s(t) s(t + τ) dt    (52.11)

The Fourier transform (FT) transforms a signal from the time domain into the frequency domain:

S(ω) = ∫_{−∞}^{∞} s(t) e^{−jωt} dt = F{s(t)}    (52.12)
where ω = 2πf is the angular frequency, and F{*} is the Fourier operator.
The inverse Fourier transform (IFT) is the operator that transforms a signal from the frequency domain
into the time domain:
s(t) = (1/2π) ∫_{−∞}^{∞} S(ω) e^{jωt} dω = F^{−1}{S(ω)}    (52.13)
The FT is, in general, a complex function that may be written in polar form:

S(ω) = |S(ω)| e^{jθ(ω)}    (52.14)
where |S(ω)|, the absolute value of the complex function, is the amplitude spectrum, and θ(ω), the phase of the complex function, is the phase spectrum. The square of the absolute value, |S(ω)|², is termed the power spectrum. The power spectrum of a signal describes the distribution of the signal's power on the
frequency axis. A signal in which the power is limited to a finite range of the frequency axis is called a
band-limited signal. Figure 52.3 depicts an example of such a signal.
The signal in Fig. 52.3 is a band-limited signal; its power spectrum is limited to the frequency range
–ωmax ≤ ω ≤ ωmax . It is easy to show that if s(t) is real (which is the case in almost all applications), the
amplitude spectrum is an even function and the phase spectrum is an odd function.
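This symmetry is easy to check numerically. The sketch below uses numpy's FFT as a discrete stand-in for the FT of a real sequence and verifies that the amplitude spectrum is even and the phase spectrum is odd (the sequence itself is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.standard_normal(64)      # a real signal

S = np.fft.fft(s)
amplitude = np.abs(S)
phase = np.angle(S)

# Negative frequencies -k correspond to bins N-k; skip dc (k=0) and Nyquist (k=32).
k = np.arange(1, 32)
even_ok = np.allclose(amplitude[k], amplitude[64 - k])   # |S(-w)| = |S(w)|
odd_ok = np.allclose(phase[k], -phase[64 - k])           # theta(-w) = -theta(w)
```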
Special attention must be given to stochastic signals. Applying the FT to a sample function would
provide a sample function on the frequency axis. The process may be described by the ensemble of
spectra. Another alternative to the frequency representation is to consider the correlation function of the
process. This function is deterministic. The FT may be applied to it, yielding a deterministic frequency
function. The FT of the correlation function is defined as the power spectral density function (PSD):
PSD[s(t)] = Sss(ω) = F{rss(τ)} = ∫_{−∞}^{∞} rss(τ) e^{−jωτ} dτ    (52.15)
The PSD is used to describe stochastic signals; it describes the density of power on the frequency axis.
Note that since the autocorrelation function is an even function, the PSD is real; hence no phase spectrum
is required.
The EEG signal may serve as an example of the importance of the PSD in signal processing. When
processing the EEG, it is very helpful to use the PSD. It turns out that the power distribution of the EEG
changes according to the physiologic and psychological states of the subject. The PSD may thus serve as
a tool for the analysis and recognition of such states.
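As an illustrative sketch of this use, the PSD can be reduced to the classic EEG band powers (the "EEG" here is synthetic: an alpha-band 10-Hz rhythm in noise, with the sampling rate and all other parameters assumed for the example):

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(11)
fs = 250.0                                   # assumed scalp-EEG sampling rate
t = np.arange(0, 30, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

f, S = welch(eeg, fs=fs, nperseg=1024)       # PSD estimate

bands = {'delta': (0.5, 4), 'theta': (4, 8), 'alpha': (8, 13), 'beta': (13, 22)}
powers = {name: S[(f >= lo) & (f < hi)].sum()   # relative power in each band
          for name, (lo, hi) in bands.items()}
dominant = max(powers, key=powers.get)       # 'alpha' for this synthetic signal
```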
Very often we are interested in the relationship between two processes. This may be the case, for
example, when two sides of the brain are investigated by means of EEG signals. The time-domain
expression of such relationships is given by the cross-correlation function (Eq. 52.10). The frequency-
domain representation of this is given by the FT of the cross-correlation function, which is called the
cross-power spectral density function (C-PSD) or the cross-spectrum:
Ssy(ω) = F{rsy(τ)} = |Ssy(ω)| e^{jθsy(ω)}    (52.16)
Note that we have assumed the signals s(t) and y(t) are stationary; hence the cross-correlation function
is not a function of time but of the time difference τ. Note also that unlike the autocorrelation function,
rsy(τ) is not even; hence its FT is not real. Both absolute value and phase are required.
It can be shown that the absolute value of the C-PSD is bounded:
|Ssy(ω)|² ≤ Sss(ω) Syy(ω)    (52.17)
The absolute value information of the C-PSD may thus be normalized to provide the coherence function:
γsy²(ω) = |Ssy(ω)|² / [Sss(ω) Syy(ω)] ≤ 1    (52.18)
The coherence function is used in a variety of biomedical applications. It has been used, for example, in
EEG analysis to investigate brain asymmetry.
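A minimal sketch of the coherence function, using scipy's Welch-based estimator on two synthetic channels that share a common 10-Hz rhythm (the sampling rate and noise levels are assumed for the example):

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(2)
fs = 250.0
t = np.arange(0, 40, 1 / fs)
common = np.sin(2 * np.pi * 10 * t)               # shared component
s = common + 0.5 * rng.standard_normal(t.size)
y = common + 0.5 * rng.standard_normal(t.size)

# Magnitude-squared coherence, an estimate of Eq. (52.18); bounded by 1.
f, gamma2 = coherence(s, y, fs=fs, nperseg=512)
peak_freq = f[np.argmax(gamma2)]                  # near the shared 10-Hz rhythm
```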
The FT of the sampled signal s(m) is

Ss(ω) = F{s(m)} = |Ss(ω)| e^{jθs(ω)}    (52.19)
The amplitude spectrum of the sampled signal is depicted in Fig. 52.4. It can easily be proven that the
spectrum of the sampled signal is the spectrum of the original signal repeated infinite times at frequencies
of nωs . The spectrum of a sampled signal is thus a periodic signal in the frequency domain. It can be
observed, in Fig. 52.4, that provided the sampling frequency is large enough, the wave shapes of the
spectrum do not overlap. In such a case, the original (continuous) signal may be extracted from the
sampled signal by low-pass filtering. A low-pass filter with a cutoff frequency of ωmax will yield at its
output only the first period of the spectrum, which is exactly the continuous signal. If, however, the
sampling frequency is low, the wave shapes overlap, and it will be impossible to regain the continuous
signal.
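The overlap effect can be demonstrated numerically (a sketch with assumed frequencies): a 60-Hz sine sampled at 100 Hz, below the required rate of 120 Hz, appears at the alias frequency 100 − 60 = 40 Hz, while at 500 Hz it appears where it should:

```python
import numpy as np

def dominant_freq(fs, f0, n=1000):
    """Strongest nonnegative frequency in the spectrum of a sampled sine."""
    t = np.arange(n) / fs
    spec = np.abs(np.fft.rfft(np.sin(2 * np.pi * f0 * t)))
    return np.fft.rfftfreq(n, 1 / fs)[np.argmax(spec)]

f_alias = dominant_freq(100.0, 60.0)   # undersampled: appears near 40 Hz
f_true = dominant_freq(500.0, 60.0)    # properly sampled: appears near 60 Hz
```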
The sampling frequency must obey the inequality

ωs ≥ 2ωmax    (52.20)
Equation (52.20) is known as the sampling theorem, and the lowest allowable sampling frequency is called
the Nyquist frequency. When overlapping does occur, there are errors (aliasing errors) between the spectra of the sampled and original signals, and the continuous signal can no longer be exactly recovered. For discrete signals, the discrete Fourier transform (DFT) is used:

S(k) = DFT{s(m)} = Σ_{m=0}^{N−1} s(m) e^{−j(2π/N)km}    (52.21)
An inverse operator, the inverse discrete Fourier transform (IDFT), is an operator that transforms the
sequence S(k) back into the sequence s(m). It is given by
s(m) = IDFT{S(k)} = (1/N) Σ_{k=0}^{N−1} S(k) e^{j(2π/N)km}    (52.22)
It can be shown that if the sequence s(m) represents the samples of the band-limited signal s(t), sampled
under Nyquist conditions with sampling interval of Ts , the DFT sequence S(k) (neglecting windowing
effects) represents the samples of the FT of the original signal:
S(k) = Ss(ω)|_{ω = kωs/N},   k = 0, 1, …, N − 1    (52.23)
Figure 52.5 depicts the DFT and its relations to the FT. Note that the N samples of the DFT span the
frequency range one period. Since the amplitude spectrum is even, only half the DFT samples carry the
information; the other half is composed of the complex conjugates of the first half.
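A quick sketch of Eqs. (52.21) and (52.22) via numpy's FFT (a fast DFT implementation): the IDFT recovers the original sequence, and for a real sequence the second half of the DFT holds the complex conjugates of the first half:

```python
import numpy as np

rng = np.random.default_rng(3)
s = rng.standard_normal(128)       # a real sequence of N = 128 samples

S = np.fft.fft(s)                  # DFT, Eq. (52.21)
s_back = np.fft.ifft(S)            # IDFT, Eq. (52.22)

roundtrip_ok = np.allclose(s, s_back.real)
k = np.arange(1, 128)
conjugate_ok = np.allclose(S[k], np.conj(S[128 - k]))   # S(N-k) = S*(k)
```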
∆f = 2πfs/N = 2π/T    (52.24)
where T is the duration of the data window. The resolution may be improved by using a longer window.
In cases where it is not possible to have a longer data window, e.g., because the signal is not stationary,
zero padding may be used. The sequence may be augmented with zeroes:
sA(m) = {s(0), s(1), …, s(N − 1), 0, …, 0}    (52.25)
The zero-padded sequence sA(m), m = 0, 1, …, L − 1, contains the N elements of the original sequence and L − N zeros. It can be shown that its DFT represents the samples of the FT with an increased resolution of ∆f = 2πfs/L.
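A sketch of Eq. (52.25) with assumed parameters: padding a 64-sample record to 1024 samples yields a much denser frequency grid, so the peak of a 123-Hz sine (which falls between the coarse bins) is located far more precisely. Note that zero padding interpolates the spectrum; it adds no new information:

```python
import numpy as np

fs, N, L = 1000.0, 64, 1024
t = np.arange(N) / fs
s = np.sin(2 * np.pi * 123.0 * t)         # 123 Hz lies between the N-point bins

s_padded = np.concatenate([s, np.zeros(L - N)])   # Eq. (52.25)

f_coarse = np.fft.rfftfreq(N, 1 / fs)[np.argmax(np.abs(np.fft.rfft(s)))]
f_fine = np.fft.rfftfreq(L, 1 / fs)[np.argmax(np.abs(np.fft.rfft(s_padded)))]
```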
Consider a window function w(t): a real, even function of finite duration, so that

w(t) = 0,   ∀ |t| > T/2

The FT of such a window, W(ω), is thus real and even and is not band-limited.
Multiplying a signal by a window will zero the signal outside the window duration (the observation
period) and will create a windowed, time-limited signal sw(t):
sw(t) = s(t) w(t)    (52.26)

Sw(ω) = S(ω) ∗ W(ω)    (52.27)
where (*) is the convolution operator. The effect of windowing on the spectrum of the signal is thus the
convolution with the FT of the window. A window with very narrow spectrum will cause low distortions.
A practical window has an FT with a main lobe, where most of its energy is located, and sidelobes, which
cover the frequency axis. The convolution of the sidelobes with the FT of the signal causes distortions
known as spectral leakage. Many windows have been suggested for a variety of applications.
The simplest window is the rectangular (Dirichlet) window; in its discrete form it is given by w(m) =
1, m = 0, 1, … , N – 1. A more useful window is the Hamming window, given by
w(m) = 0.54 − 0.46 cos(2πm/N),   m = 0, 1, …, N − 1    (52.28)
The Hamming window was designed to minimize the effects of the first sidelobe.
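The sidelobe behavior can be measured numerically. This sketch compares the highest sidelobe of the rectangular window (roughly −13 dB) with that of the Hamming window of Eq. (52.28) (roughly −40 dB or lower), using a zero-padded FFT of each window; the lengths are chosen for illustration:

```python
import numpy as np

N, L = 64, 4096                               # window length, padded FFT length

def peak_sidelobe_db(w):
    """Highest sidelobe relative to the main-lobe peak, in dB."""
    spec = np.abs(np.fft.rfft(w, n=L))
    spec /= spec[0]                           # normalize the main-lobe peak at dc
    first_null = np.argmax(np.diff(spec) > 0) # end of the main lobe
    return 20 * np.log10(spec[first_null:].max())

m = np.arange(N)
rect = np.ones(N)                             # rectangular (Dirichlet) window
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * m / N)   # Eq. (52.28)

rect_lobe = peak_sidelobe_db(rect)
hamming_lobe = peak_sidelobe_db(hamming)      # much lower: less spectral leakage
```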
The short-time Fourier transform (STFT) applies the FT to a windowed segment of the signal:

STFT(τ, ω) = ∫_{−∞}^{∞} s(t) w(t − τ) e^{−jωt} dt    (52.29)

The window is shifted on the time axis to t = τ so that the FT is performed on a windowed segment in the range τ − (T/2) ≤ t ≤ τ + (T/2). The STFT describes the amplitude and phase-frequency distributions of the signal in the vicinity of t = τ.
In general, the STFT is a two-dimensional, time-frequency function. The resolution of the STFT on
the time axis depends on the duration T of the window. The narrower the window, the better the time
resolution. Unfortunately, choosing a short-duration window means a wider-band window. The wider
the window in the frequency domain, the larger the spectral leakage and hence the deterioration of the
frequency resolution. One of the main drawbacks of the STFT method is the fact that the time and
frequency resolutions are linked together. Other methods, such as the wavelet transform, are able to better
deal with the problem.
In highly nonstationary signals, such as speech signals, equal-duration windows are used. Window
duration is on the order of 10 to 20 ms. In other signals, such as the EEG, variable-duration windows
are used. In the EEG, windows on the order of 5 to 30 s are often used.
A common way for representing the two-dimensional STFT function is by means of the spectrogram.
In the spectrogram, the time and frequency axes are plotted, and the STFT PSD value is given by the
gray-scale code or by a color code. Figure 52.6 depicts a simple spectrogram. The time axis is quantized
to the window duration T. The gray scale codes the PSD such that black denotes maximum power and
white denotes zero power. In Figure 52.6, the PSD is quantized into only four levels of gray. The
spectrogram shows a signal that is nonstationary in the time range 0 to 8T. In this time range, the PSD
possesses a peak that is shifted from about 0.6fs to about 0.1fs at time 0.7T. From time 0.8T, the signal
becomes stationary with a PSD peak power in the low-frequency range and the high-frequency range.
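A spectrogram of this kind can be sketched with scipy (the signal here is synthetic, with an assumed dominant frequency dropping from 300 Hz to 50 Hz, analyzed with a sliding Hamming window):

```python
import numpy as np
from scipy.signal import spectrogram

fs = 1000.0
t = np.arange(0, 4, 1 / fs)
inst_freq = np.where(t < 2, 300.0, 50.0)            # 300 Hz first, then 50 Hz
x = np.sin(2 * np.pi * np.cumsum(inst_freq) / fs)   # phase = integral of frequency

f, tt, Sxx = spectrogram(x, fs=fs, window='hamming', nperseg=256)
peak_track = f[np.argmax(Sxx, axis=0)]              # dominant frequency per window
```

Plotting Sxx on a gray or color scale over (tt, f) gives the spectrogram; peak_track follows the frequency shift.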
Ŝxx(ω) = Ts Σ_{m=−M}^{M} r̂xx(m) e^{−jωmTs}    (52.30)

with the biased autocorrelation estimate

r̂xx(m) = (1/N) Σ_{i=0}^{N−m−1} x(m + i) x(i)
where N is the number of samples used for the estimation of the correlation coefficients, and M is the
number of correlation coefficients used for estimation of the PSD. Note that a biased estimation of the
correlation is employed. Note also that once the correlations have been estimated, the PSD may be
calculated by applying the FFT to the correlation sequence.
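A plain numpy sketch of the correlation-based estimator of Eq. (52.30): biased autocorrelation estimates up to lag M, then the Fourier sum over lags −M..M, evaluated here on a dense frequency grid using the symmetry of r̂xx (the test signal and its parameters are assumed for the example):

```python
import numpy as np

def psd_via_correlation(x, fs, M, n_freq=512):
    N = len(x)
    # biased autocorrelation estimates r_xx(m), m = 0..M
    r = np.array([np.dot(x[m:], x[:N - m]) / N for m in range(M + 1)])
    freqs = np.linspace(0, fs / 2, n_freq)
    wTs = 2 * np.pi * freqs / fs                     # omega * Ts per lag step
    # Ts * sum_{m=-M}^{M} r(m) e^{-j w m Ts} = Ts * (r(0) + 2 sum r(m) cos(w m Ts))
    S = (1 / fs) * (r[0] + 2 * np.cos(np.outer(wTs, np.arange(1, M + 1))) @ r[1:])
    return freqs, S

rng = np.random.default_rng(4)
fs = 200.0
t = np.arange(0, 20, 1 / fs)
x = np.sin(2 * np.pi * 25 * t) + 0.5 * rng.standard_normal(t.size)

f, S = psd_via_correlation(x, fs, M=100)
f_peak = f[np.argmax(S)]                             # near the 25-Hz line
```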
The Periodogram
The periodogram estimates the PSD directly from the signal without the need to first estimate the
correlation. It can be shown that
Sxx(ω) = lim_{T→∞} E{ (1/2T) | ∫_{−T}^{T} x(t) e^{−jωt} dt |² }    (52.31)
The PSD presented in Eq. (52.31) requires infinite integration time. The periodogram estimates the PSD
from a finite observation time by dropping the lim operator. It can be shown that in its discrete form,
the periodogram estimator is given by
Ŝxx(ω) = (Ts/N) |DFT{x(m)}|²    (52.32)
The great advantage of the periodogram is that the DFT operator can very efficiently be calculated by
the FFT algorithm.
A modification to the periodogram is weighted overlapped segment averaging (WOSA). Rather than
using one segment of N samples, we divide the observation segment into shorter subsegments, perform
a periodogram for each one, and then average all periodograms. The WOSA method provides a smoother
estimate of the PSD.
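The variance reduction can be seen numerically. This sketch compares the raw periodogram of Eq. (52.32) with scipy's welch(), a standard WOSA implementation, on white noise (whose true PSD is flat, so all variation across bins is estimation error):

```python
import numpy as np
from scipy.signal import periodogram, welch

rng = np.random.default_rng(5)
fs = 100.0
x = rng.standard_normal(8192)                    # white noise: flat true PSD

f_raw, P_raw = periodogram(x, fs=fs)             # one segment of N samples
f_wosa, P_wosa = welch(x, fs=fs, nperseg=256)    # averaged overlapped segments

var_raw = np.var(P_raw[1:-1])                    # skip the dc and Nyquist bins
var_wosa = np.var(P_wosa[1:-1])                  # markedly smaller
```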
Model-based (parametric) spectral estimation assumes the signal is the output of a linear system, with transfer function H(ω), driven by white noise of unit PSD; then

Sss(ω) = |H(ω)|²    (52.33)
The PSD of the signal may thus be represented by the system’s transfer function. Consider a general
pole-zero system with p poles and q zeros [ARMA(p, q)]:
H(z) = [ Σ_{i=0}^{q} bi z^{−i} ] / [ 1 + Σ_{i=1}^{p} ai z^{−i} ]    (52.34)

|H(ω)|² = | Σ_{i=0}^{q} bi z^{−i} |² / | 1 + Σ_{i=1}^{p} ai z^{−i} |²,   z = e^{−jωTs}    (52.35)
Several algorithms are available for the estimation of the model’s coefficients. The estimation of the
ARMA model parameters requires the solution of a nonlinear set of equations. The special case of q =
0, namely, an all-pole model [AR(p)], may be estimated by means of linear equations. Efficient AR
estimation algorithms are available, making it a popular means for PSD estimation. Figure 52.8 shows
the estimation of EMG PSD using several estimation methods.
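A plain numpy sketch of the AR(p) case (the q = 0, all-pole form of Eq. 52.34): the coefficients follow from the linear Yule-Walker equations built on autocorrelation estimates. This is an illustrative sketch on an assumed test signal, not a production estimator:

```python
import numpy as np

def ar_psd(x, p, fs, n_freq=512):
    N = len(x)
    r = np.array([np.dot(x[m:], x[:N - m]) / N for m in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, -r[1:])               # Yule-Walker: R a = -[r(1)..r(p)]
    sigma2 = r[0] + a @ r[1:]                    # driving-noise variance
    freqs = np.linspace(0, fs / 2, n_freq)
    # denominator |1 + sum_i a_i e^{-j w i Ts}|^2, cf. Eq. (52.35) with q = 0
    A = 1 + sum(a[i] * np.exp(-2j * np.pi * freqs / fs * (i + 1)) for i in range(p))
    return freqs, (sigma2 / fs) / np.abs(A) ** 2

rng = np.random.default_rng(6)
fs = 100.0
t = np.arange(0, 50, 1 / fs)
x = np.sin(2 * np.pi * 12 * t) + rng.standard_normal(t.size)

f, S = ar_psd(x, p=8, fs=fs)
f_peak = f[np.argmax(S)]                         # near the 12-Hz component
```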
performing some exercise, the muscle noise may become dominant. Additional noise may enter the system
from electrodes motion, from the power lines, and from other sources. The first task of processing is
usually to enhance the signal by “cleaning” the noise without (if possible) distorting the signal.
Assume a simple case where the measured signal x(t) is given by
x(t) = s(t) + n(t),   X(ω) = S(ω) + N(ω)    (52.36)
where s(t) is the desired signal and n(t) is the additive noise. For simplicity, we assume that both the signal and noise are band-limited, namely, for the signal, S(ω) = 0 for ω ≥ ωmax and for ω ≤ ωmin. Figure 52.9
depicts the PSD of the signal in two cases, the first where the PSD of the signal and noise do not overlap
and the second where they do overlap (for the sake of simplicity, only the positive frequency axis was
plotted). We want to enhance the signal by means of linear filtering. The problem is to design the linear
filter that will provide best enhancement. Assuming we have the filter, its output, the enhanced signal,
is given by
y(t) = x(t) ∗ h(t),   Y(ω) = X(ω) H(ω)    (52.37)
where y(t) = ŝ(t) + no(t) is the enhanced output, and h(t) is the impulse response of the filter. The solution
for the first case is trivial; we need an ideal bandpass filter whose transfer function H(ω) is
H(ω) = 1,   ωmin < ω < ωmax
H(ω) = 0,   otherwise    (52.38)
In the second case, where the signal and noise spectra overlap, an ideal separation is impossible. We look for the filter that minimizes the mean square error between the (possibly shifted) desired signal and the filter output:

ε² = E{ [ s(t + ξ) − ∫_{−∞}^{∞} h(τ) x(t − τ) dτ ]² }    (52.39)

The integral term on the right side of Eq. (52.39) is the convolution integral expressing the output of the filter.
The minimization of Eq. (52.39) with respect to h(t) yields the optimal filter (in the sense of minimum squared error). The minimization yields the Wiener-Hopf equation:

Ssx(ω) e^{jωξ} = Hopt(ω) Sxx(ω)    (52.41)

from which

Hopt(ω) = [Ssx(ω) / Sxx(ω)] e^{jωξ} = [Ssx(ω) / (Sss(ω) + Snn(ω))] e^{jωξ}    (52.42)
If the signal and noise are uncorrelated and either the signal or the noise has zero mean, the last equation
becomes
Hopt(ω) = [Sss(ω) / (Sss(ω) + Snn(ω))] e^{jωξ}    (52.43)
The optimal filter requires a priori knowledge of the PSD of noise and signal. These are very often not
available and must be estimated from the available signal. The optimal filter given in Eqs. (52.42) and
(52.43) is not necessarily realizable. In performing the minimization, we have not introduced a constraint
that will ensure that the filter is causal. This can be done, yielding the realizable optimal filter.
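As a sketch of the noncausal Wiener filter of Eq. (52.43) with ξ = 0, applied in the frequency domain: here the signal and noise PSDs are taken as known (the idealized assumption behind the equation), with all parameters chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
fs, n = 512.0, 4096
t = np.arange(n) / fs
s = np.sin(2 * np.pi * 5 * t)                     # narrowband desired signal
noise = rng.standard_normal(n)                    # broadband noise
x = s + noise                                     # measured signal, Eq. (52.36)

S_ss = np.abs(np.fft.fft(s)) ** 2                 # "known" signal PSD (per bin)
S_nn = np.full(n, np.mean(np.abs(np.fft.fft(noise)) ** 2))  # flat noise PSD

H = S_ss / (S_ss + S_nn)                          # Eq. (52.43), xi = 0
s_hat = np.real(np.fft.ifft(H * np.fft.fft(x)))   # enhanced signal

mse_before = np.mean((x - s) ** 2)                # about the noise variance
mse_after = np.mean((s_hat - s) ** 2)             # much smaller
```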
Consider now the output signal-to-noise ratio

SNRo(t) = ŝ²(t) / E{no²(t)}    (52.44)
as the optimality criterion. The optimal filter will be the filter that maximizes the output SNR at a certain
given time t = to . The maximization yields the following integral equation:
∫_0^T h(u) rnn(τ − u) du = α s(to − τ),   0 ≤ τ ≤ T    (52.45)
where T is the observation time and α is any constant. This equation has to be solved for any given noise
and signal.
For the special case of white noise, whose PSD is the constant N, the solution becomes

h(τ) = (1/N) s(to − τ)    (52.46)
where N is the noise power. For this special case, the impulse response of the optimal filter has the form
of the signal run backward, shifted to the time to . This type of filter is called a matched filter.
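A sketch of Eq. (52.46) in discrete form: the matched filter's impulse response is the known wave shape run backward, and its output peaks where the wave occurs in the noisy record (the template, noise level, and onset are assumed for the example):

```python
import numpy as np

rng = np.random.default_rng(8)
template = np.sin(2 * np.pi * np.arange(50) / 12.5) * np.hanning(50)  # known shape

x = 0.3 * rng.standard_normal(1000)            # white background noise
onset = 600
x[onset:onset + 50] += template                # embed the wave shape at m = 600

h = template[::-1]                             # h(tau) proportional to s(to - tau)
y = np.convolve(x, h, mode='valid')            # matched-filter output
detected = int(np.argmax(np.abs(y)))           # near the true onset
```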
In adaptive noise cancellation, a noise estimate n̂(t), derived from a reference measurement correlated with the noise, is subtracted from the measured signal:

y(t) = x(t) − n̂(t) = s(t) + [n(t) − n̂(t)] = ŝ(t)    (52.47)
The output of the filter is the enhanced signal. Since the reference signal is correlated with the noise,
the following relationship exists:
NR(ω) = G(ω) N(ω)    (52.48)
which means that the reference noise may be represented as the output of an unknown filter G(ω). The
adaptive filter estimates the inverse of this unknown noise filter and from its estimates the noise:
n̂(t) = F^{−1}{Ĝ^{−1}(ω) NR(ω)}    (52.49)
The estimation of the inverse filter is done by the minimization of some performance criterion. There
are two dominant algorithms for the optimization: the recursive least squares (RLS) and the least mean
squares (LMS). The LMS algorithm will be discussed here.
Consider the mean square error of the canceler output:

E{y²(t)} = E{s²(t)} + E{[n(t) − n̂(t)]²}    (52.50)
The right side of Eq. (52.50) is correct, assuming that the signal and noise are uncorrelated. We are
searching for the estimate Ĝ–1(ω) that will minimize the mean square error: E{[n(t) – n̂(t)]2}. Since the
estimated filter affects only the estimated noise, the minimization of the noise error is equivalent to the
minimization of Eq. (52.50). The implementation of the LMS filter will be presented in its discrete form
(see Fig. 52.12).
The estimated noise is
n̂(m) = Σ_{i=0}^{p} vi wi = vmᵀ w    (52.51)

where

vmᵀ = [v0, v1, …, vp] = [1, nR(m − 1), …, nR(m − p)]

wᵀ = [w0, w1, …, wp]    (52.52)
The vector w represents the filter. The steepest descent minimization of Eq. (52.50) with respect to the
filter’s coefficients w yields the iterative algorithm
wj+1 = wj + 2µ ej vj    (52.53)

where ej = x(j) − n̂(j) is the error and µ is a scalar that controls the stability and convergence of the algorithm. In the evaluation of Eq. (52.53), the assumption
∂E{ej²}/∂wk ≅ ∂ej²/∂wk    (52.54)
was made. This is indeed a drastic approximation; the results, however, are very satisfactory. Figure 52.12 depicts the block diagram of the LMS adaptive noise canceler.
The LMS adaptive noise canceler has been applied to many biomedical problems, among them
cancellation of power-line interferences, elimination of electrosurgical interferences, enhancement of fetal
ECG, noise reduction for the hearing impaired, and enhancement of evoked potentials.
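A working sketch of the LMS canceler of Eqs. (52.51) to (52.53), written here without the bias term for simplicity. The reference noise passes through an assumed "unknown" filter G before reaching the sensor; LMS learns to predict and subtract the sensed noise (all signals and parameters are synthetic and illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 20000
s = np.sin(2 * np.pi * 0.01 * np.arange(n))          # desired signal
n_ref = rng.standard_normal(n)                       # reference noise n_R
g = np.array([0.6, 0.3, -0.2])                       # "unknown" filter G
noise = np.convolve(n_ref, g)[:n]                    # noise reaching the sensor
x = s + noise                                        # measured signal

p, mu = 8, 0.005                                     # filter order, step size
w = np.zeros(p)
y = np.zeros(n)
for m in range(p, n):
    v = n_ref[m - p + 1:m + 1][::-1]                 # recent reference samples
    e = x[m] - v @ w                                 # canceler output, Eq. (52.47)
    w += 2 * mu * e * v                              # LMS update, Eq. (52.53)
    y[m] = e

mse_out = np.mean((y[-2000:] - s[-2000:]) ** 2)      # residual after convergence
mse_in = np.mean((x[-2000:] - s[-2000:]) ** 2)       # noise power before
```

After convergence, w approximates g and the residual error is far below the input noise power.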
Several adaptive segmentation algorithms have been suggested. Figure 52.13 demonstrates the basic
idea of these algorithms. A fixed reference window is used to define an initial segment of the signal. The
duration of the reference window is determined such that it is long enough to allow a reliable PSD
estimate yet short enough so that the segment may still be considered stationary. Some a priori infor-
mation about the signal will help in determining the reference window duration. A second, sliding window
is shifted along the signal. The PSD of the segment defined by the sliding window is estimated at each
window position. The two spectra are compared using some spectral distance measure. As long as this
distance measure remains below a certain decision threshold, the reference segment and the sliding
segment are considered close enough and are related to the same stationary segment. Once the distance
measure exceeds the decision threshold, the sliding segment is considered to belong to a new stationary segment. A typical normalized spectral distance measure is

Dt = ∫_{−ωM}^{ωM} { [SR(ω) − St(ω)]² / SR(ω) } dω    (52.55)
where SR(ω) and St(ω) are the PSD estimates of the reference and sliding segments, respectively, and ωM
is the bandwidth of the signal. A normalized spectral measure was chosen, since we are interested in
differences in the shape of the PSD and not in the gain.
Some of the segmentation algorithms use growing reference windows rather than fixed ones. This is
depicted in the upper part of Fig. 52.13. The various segmentation methods differ in the way the PSDs
are estimated. Two of the better-known segmentation methods are the autocorrelation measure method (ACM) and the spectral error measure (SEM).
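The segmentation idea can be sketched as follows, with Eq. (52.55) in discrete form: a PSD estimated on a fixed reference window is compared with PSDs from a sliding window, and the distance jumps at a spectral change (the two-regime signal and all parameters here are synthetic and purely illustrative):

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(10)
fs = 200.0

def segment(f0, n):
    t = np.arange(n) / fs
    return np.sin(2 * np.pi * f0 * t) + 0.3 * rng.standard_normal(n)

x = np.concatenate([segment(10, 4000), segment(40, 4000)])  # change at n = 4000

win = 1000                                           # sliding-window length
_, S_ref = welch(x[:win], fs=fs, nperseg=256)        # reference-window PSD

def distance(seg):
    _, S_t = welch(seg, fs=fs, nperseg=256)
    return np.sum((S_ref - S_t) ** 2 / S_ref)        # discrete form of Eq. (52.55)

d_same = distance(x[1000:1000 + win])                # still the 10-Hz regime
d_diff = distance(x[5000:5000 + win])                # after the spectral change
```

Thresholding the distance between d_same and d_diff marks the segment boundary.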