
TERM PAPER
ON
QUANTIZATION EFFECT IN DIGITAL FILTERS

SUB: ANALOG COMMUNICATION SYSTEM

SUBMITTED TO: Mr. RAHUL SHARMA
SUBMITTED BY: RAVI RAJ
Roll No. B57
Regd. No. 10810483
Section: E6802

I would like to thank all those who encouraged me to do this project. I thank
Mr. RAHUL SHARMA, who helped me a great deal in editing the contents of this
report by suggesting the necessary corrections. I am also extremely thankful to
my friends for providing me with the latest knowledge regarding the report;
their immense help and suggestions for improving its contents are highly
appreciated. I also thank my parents and brother for the patience and support
they have extended to me at all times.

I also gratefully acknowledge the valuable contribution of many academics to the
editing and finalization of this report. The contribution of the publication
department in bringing out this report is also duly acknowledged.

RAVI RAJ
QUANTIZATION EFFECT IN DIGITAL FILTERS

Abstract-A classification is given of the various possible nonlinear effects that can occur in recursive digital filters due to signal quantization and adder overflow. The effects include limit cycles, overflow oscillations, and quantization noise. A review is given of recent literature on this subject. Alternative methods of avoiding some of these nonlinear phenomena are discussed.

INTRODUCTION

THE GROWTH of interest in the digital processing of signals in the past few years has led to an extensive study of the properties of digital filters, the basic elements of almost every digital signal processing system. Many of the papers that have recently appeared as a result of these studies deal with the effects of the finite word length available for the representation of the signals in digital filters. Because of this finite word length almost every digital filter is nonlinear, and for this reason the output of the digital filter deviates from what is actually desired.

The finite word length is a consequence of the encoding of the signals in a particular format (mostly binary) and of the fact that the signals must be stored in registers, which, of course, have a finite length. In itself this finite word length does not necessarily cause undesired effects in the filter. In every filter, however, arithmetical operations take place, such as additions and multiplications. Irrespective of the encoding of the signals (often referred to as arithmetic, e.g., fixed- or floating-point arithmetic), multiplications and additions generally lead to an increase in the word length required for the result of the operation. As long as the number of operations performed on a signal remains finite, the increasing word length can be accommodated by using larger registers for storing the outcomes of the arithmetical operations than for the original signals. In that case, however, very long registers may be needed, and for that reason it is common practice to reduce the word length.

In a nonrecursive digital filter the effect of such a word length reduction will be an additive error signal at the output, very similar to the quantization error that is made in an analog-to-digital converter. In a recursive digital filter the situation is much more complicated. First, in every closed loop in such a filter a word length reduction is necessary to prevent the signals from acquiring an ever increasing word length. The errors introduced by the nonlinear operation corresponding to this word length reduction may propagate in the loop, resulting in a number of undesired effects. Secondly, these filters are often used because they enable the realization of high Q values. A consequence of these high-Q values is that large gains in signal amplitudes are obtained. Scaling the signals such that no signal can ever become larger in magnitude than the largest possible number that can be represented by the available number of bits (given some arithmetic) will in general lead to an impractically large number of bits for these signals in view of the desired dynamic range. This means that in most filters overflow may occur. If an overflow occurs (which means that a signal becomes larger than the maximum representable number: the overflow level), then the most significant bits must be altered in order to produce a word that can again be stored in a register.

Effects of signal quantization and of overflow in recursive digital filters, such as quantization noise, limit cycles, and overflow oscillations, have been known for a long time. Since the early work of Gold and Rader, Kaiser, Jackson, and Ebert et al., in which these effects were mentioned for the first time, a large number of papers have appeared dealing with this subject. Methods have been reported for the analysis of these effects, and measures for suppressing the unwanted phenomena have been described, especially for second-order digital filter sections and for wave digital filters. Unfortunately these results are scattered throughout the literature and a general survey is not available. It is important, however, that the designer of a digital filter be aware of all unwanted effects that may occur in the filter to be implemented. Moreover, he should have available a number of solutions to control the effects that are most disturbing in his specific application.

The aims of this paper are the following. First, a classification is given of the various effects that can be caused by a word length reduction required in recursive digital filters. Secondly, a survey is presented of the existing literature describing the various results reported on this subject. We have tried to do this in such a way that someone who is interested in a specific problem or result can find a subset of papers that may suffice to give him a proper understanding of the subject. Thirdly, a number of results concerning finite word length effects are reviewed. These results mainly concern wave digital filters and second-order digital filters. Since higher order digital filters can be constructed as a cascade or a parallel configuration of second-order sections, some of these results can immediately be applied to such higher order filters too.

TYPES OF WORDLENGTH REDUCTION

Reduction must be applied in every closed loop in digital filters in which arithmetical operations take place. Most of the time this can be done by affecting the least significant bits only (quantization), but sometimes overflow will occur, which requires a change of the most significant bits as well. For both types of word length reduction there exist a number of alternative approaches. These are well described in the literature and will, therefore, only be indicated very briefly here.

Quantization can be performed by substituting the nearest possible word that can be represented by the limited number of bits. The characteristic of the nonlinear operation corresponding to this round off (RO) quantization is depicted in Fig. 1(a) for the case of fixed-point number representation.

Another possibility consists of merely discarding the least significant bits. In a representation of the signals by sign and magnitude this leads to magnitude truncation (MT) quantization with a characteristic as in Fig. 1(b). If the signals are represented in a twos complement format, the result is a twos complement or value truncation (VT) quantization [Fig. 1(c)]. Both MT and VT introduce larger errors than RO, but have advantages in the hardware realization.
Moreover, depending on the type of quantization used, the filter behavior can be very different, and therefore it is usually worth considering the various alternatives and their merits when designing a digital filter for a given application.

If an overflow occurs, a number of different measures can likewise be taken. The saturation characteristic depicted in Fig. 2(a) is obtained if the word that causes the overflow is replaced by a word having the same sign, but a magnitude corresponding to the overflow level. Another possibility consists in substituting the number zero in the case of overflow [zeroing arithmetic [23], see Fig. 2(b)]. Discarding the bits that cause the overflow has special advantages with twos complement arithmetic, since overflows in intermediate results do not then cause errors as long as the final result does not overflow. If the final result does overflow, an error will be introduced corresponding to the characteristic in Fig. 2(c). There are various other ways of dealing with overflow, such as the one corresponding to the characteristic in Fig. 2(d), which has been proposed in the literature.
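The quantization and overflow characteristics described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the step size q and overflow level p are passed in explicitly, the tie-breaking rule for round-off is an assumption (the text does not specify one), and the characteristic of Fig. 2(d) is not reproduced because its shape is not described in the text.

```python
import math

def quantize_ro(x, q):
    """Round-off (RO): nearest multiple of q (ties away from zero, an assumption)."""
    return q * math.copysign(math.floor(abs(x) / q + 0.5), x)

def quantize_mt(x, q):
    """Magnitude truncation (MT): discard LSBs of the magnitude (toward zero)."""
    return q * math.trunc(x / q)

def quantize_vt(x, q):
    """Value truncation (VT): twos complement truncation (toward minus infinity)."""
    return q * math.floor(x / q)

def overflow_saturate(x, p):
    """Saturation: keep the sign, clip the magnitude to the overflow level p."""
    return max(-p, min(p, x))

def overflow_zero(x, p):
    """Zeroing arithmetic: substitute zero whenever the level p is exceeded."""
    return x if abs(x) <= p else 0.0

def overflow_wrap(x, p):
    """Twos complement: discarding the overflow bits wraps x into [-p, p)."""
    return (x + p) % (2 * p) - p

# For a negative sample MT and VT differ: MT moves toward zero, VT downward.
mt_value = quantize_mt(-0.3, 0.25)   # -0.25
vt_value = quantize_vt(-0.3, 0.25)   # -0.5
wrapped = overflow_wrap(1.5, 1.0)    # -0.5
```

The wrap-around case shows why twos complement overflow is so harmful when it reaches the final result: a value slightly above the overflow level reappears with the opposite sign.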

It is, of course, possible to have different word lengths for the various signals in the filter, resulting in different quantization step sizes and/or different overflow levels. None of the results that will be reported in this paper depend explicitly on the possibility of having different quantization step sizes or overflow levels. Without loss of generality, we therefore assume that all quantizers (of which we assume K to be present) have quantization step size q and characteristics Qk(x), k = 1, ..., K. Similarly, the L overflow nonlinearities will have characteristics Pl(x), l = 1, ..., L, and overflow level p.

DESCRIPTION OF THE FILTER AND THE ERROR SIGNAL

To determine the degradation of the filter performance resulting from the nonlinear operations, it is natural to compare the output of the actual nonlinear digital filter (NLDF) with that of an ideal linear and, therefore, nonrealizable filter. This filter is obtained from the actual filter by replacing all quantization nonlinearities Qk(x) and all overflow nonlinearities Pl(x) by linear characteristics. This filter will be referred to as the associated linear filter (ALF). Both filters are schematically depicted in Fig. 3. They are excited by the same input signal u(n). The box labeled S is a linear memoryless (delay free) device which is the result of extracting all delay elements and all quantizers and overflow nonlinearities from the NLDF. The box S is the same in both the NLDF and the ALF. All arithmetical operations are performed inside it.

For describing the filters a state-space approach is most convenient, and to this end the signals in the registers (delay elements) can be used as a state vector. The signals in the NLDF will be denoted by Roman symbols, and those in the ALF by Greek symbols. Assuming a total of I delay elements in each filter, we have the following state vectors: x(n) = (x1(n), ..., xI(n))^T for the NLDF and ξ(n) = (ξ1(n), ..., ξI(n))^T for the ALF.
If we denote the output signal of the NLDF by y(n), and that of the ALF by η(n), then the two filters are described by relations of the form

NLDF: x(n + 1) = f(x(n), u(n)), y(n) = g(x(n), u(n))
ALF: ξ(n + 1) = A ξ(n) + B u(n), η(n) = C ξ(n) + D u(n)

where f and g are nonlinear functions of x and u, and A, B, C, and D are constant matrices (I × I, I × 1, 1 × I, and 1 × 1, respectively). It will be assumed throughout the paper that the ALF is stable, which means that the characteristic values of A are less than 1 in magnitude.
The comparison between the actual filter (NLDF) and its ideal counterpart (ALF) is made by subtracting the outputs of the two filters. The difference e(n) between the two outputs will be referred to as the output error signal. The response of the NLDF will be investigated for three different input conditions.

1) Zero Input: In this case the stability of the ALF assures that ξ(n) → 0 as n → ∞, independent of the initial conditions ξ(0). It suffices in this case to consider the response of the NLDF only.

2) Nonzero, Deterministic Input: In this case the input signal is a well-defined discrete-time function, e.g., a periodic signal. With such an input signal the output error signal e(n) will be investigated, assuming that both filters are started with identical initial conditions: ξ(0) = x(0).

3) Nonzero, Stochastic Input: In this case the input is a stochastic process. Again it will be assumed that ξ(0) = x(0), and some of the statistical properties of e(n) will be determined.
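As a minimal illustration of the comparison between NLDF and ALF, the sketch below runs the same second-order recursion twice, once with a round-off quantizer in the loop and once without, and forms the output error e(n). Several things here are assumptions made only for the example: the recursion y(n) = u(n) - a y(n-1) - b y(n-2), the coefficient values, the step size q = 2^-8, the sinusoidal input, and the use of double-precision floating point as a stand-in for the infinitely precise ALF. Both runs start from identical (zero) initial conditions, as required above, and the input is small enough that no overflow occurs.

```python
import math

def run_section(u, a, b, quantize=None):
    """Second-order recursion y(n) = u(n) - a*y(n-1) - b*y(n-2).

    With quantize=None this plays the role of the ALF; otherwise the
    quantizer is applied to every new state value (NLDF, one quantizer).
    """
    y1 = y2 = 0.0                      # identical initial conditions
    out = []
    for un in u:
        y = un - a * y1 - b * y2
        if quantize is not None:
            y = quantize(y)
        out.append(y)
        y2, y1 = y1, y
    return out

def round_off(x, q=2.0 ** -8):
    """Round-off quantizer with step size q (assumed value)."""
    return q * math.floor(x / q + 0.5)

u = [0.25 * math.sin(0.3 * n) for n in range(200)]   # deterministic input
eta = run_section(u, a=-1.6, b=0.8)                  # ALF output
y = run_section(u, a=-1.6, b=0.8, quantize=round_off)  # NLDF output
e = [yn - en for yn, en in zip(y, eta)]              # output error signal
max_abs_error = max(abs(v) for v in e)
```

The error stays small (a modest multiple of q) but does not vanish, which is the behavior the deterministic-input and stochastic-input cases set out to characterize.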
SUMMARY OF FINITE WORDLENGTH EFFECTS

A classification of all the finite word length effects known at the present time has been given in Table I. Each of these effects will be discussed in one of the subsequent sections. A review of the corresponding literature has been given, relating to different aspects of the mentioned effects. No attempt has been made, nor would it be possible within the scope of this paper, to make the list of references complete. The literature has grown so rapidly that selecting a proper subset was the only alternative. We have included all recently obtained results concerning finite word length effects that we knew to exist at the time when this paper was written. We have excluded:

1) papers that only implicitly deal with the finite word length, such as papers describing optimization procedures with regard to quantization noise,

2) papers dealing with the finite word length of the coefficients,

3) monographs, although some recently published monographs have incorporated sections on finite word length effects, and

4) papers that have provided tools for analyzing these effects, but that do not explicitly deal with digital filters.

The last category comprises, for example, a large number of papers on the stability of sampled data systems and other basic papers from control theory. As far as they have enabled an analysis of finite word length effects in digital filters, they have been referred to in the papers describing such analysis.
table cover the major part of the area, and is a solution of (6), which means that if the
that this table may be of help to people who zero state is reached after a certain time,
are interested in studying finite word length then the filter response will be identically
effects, but are overwhelmed by the vast zero from that time on. If the zero state is
amount of papers that deal with this subject. reached from every possible initial
condition, the NLDF is said to be zero-input
stable; in all other cases it is zero-input
unstable.
A limit cycle may be characterized by its
period N and N successive values of the
state vector

such that

for all initial conditions, and thus

It has
been observed, however, that in the actual
system (NLDF) the output y(n) does not
always converge to zero. All components
of x(n) can only attain a finite number of If the limit cycle has an amplitude f the
values since they are quantized and bounded order of thove erflow level p , and this
in amplitude. This makes the NLDF a finite- amplitude is not affected when the
state machine. An immediate consequence quantization step size 4 is decreased, then
of this is that if x(n) does not converge to the limit cycle mainly results from the
zero with zero input, it must become overflow nonlinearities. In that case it is
periodical after some finite time. Thus, in often referred to as an overflow oscillation.
the absence of an input signal, the digital
filter will either reach the zero state after a Results for Wave Digital Filters
finite time, or a periodic oscillation will
result, which is referred to as a limit cycle or
A well-known method for investigating
zero-input stability is the second method of
Zero-input limit cycle 1471. Lyapunov. Application of this method
requires the search for a generalized energy
Different limit cycles may result if the filter function, the Lyapunov function. For a
is started with different initial conditions, digital filter this is, in general, very
but there may be other initial conditions complicated due to the highly discontinuous
from which the zero state is reached. It will nonlinear characteristics of the quantizes.
be assumed that An important exception is formed by the
wave digital filters (WDF) . For these filters
Fettweis and Meerkotter have used what is Using arguments similar to those in ,
called the pseudo power, which is essential Meerkotter and Wegener and Verkroost and
in these filters, as a Lyapunov function. In Butterweck have derived structures for a
this way they have been able to prove that second-order digital filter section that are
the absence of zero-input limit cycles can be free of limit cycles when magnitude
guaranteed in WDF for which 1) the ALF is truncation is used for quantization, and do
pseudopassive or pseudo lossless, 2) the not have overflow oscillations for any
nonlinearities are situated at appropriate overflow characteristic.
places in the filter and satisfy A structure of a second-order section that
has been investigated extensively is the
direct form. it has been shown that as
regards zero-input limit cycles this section
is equivalent to the direct form 1 .) In the
recursive part of this filter the overflow
nonlinearity must bep laced as indicated in
Fig. 4. Two possible ways of placing the
quantizers have been indicated. Quantization
can be performed immediately after every
multiplication. In that case there are two
quantization nonlinearities, and Q3(x) = x.
It is also possible to add the results of the
two multiplications with full precision,
which means Q1 (x) = Q2 (x) = x , and
then only one quantization is needed. If,
however, the filter is implemented with
distributed arithmetic hen only one
quantization can be performed.
The results given for the one quantization
case also apply to this implementation.
It has been proven that overflow oscillations
Equation (12) is satisfied by the will not occur in this second-order
characteristic of a magnitude truncation section if the overflow nonlinearity P(x) is
quantizer, and every overflow characteristic contained in the hatched area of Fig. 5 .
satisfies (1 3). This meanst hat it is possible Both the saturation nonlinearity in Fig. 2(a)
to design a wave digital filter of arbitrary and the nonlinearity in Fig. 2(d) are
order, without limit cycles and overflow contained inside this area. Thus, although
Oscillations. the choice of overflow characteristic is more
limited than for wave digital filters if
overflow oscillations have to be avoided, it
is possible to obtain a filter that is free from
overflow oscillations.
This does not guarantee, however, absence
Results for Second-Order of other types of limit cycles.
Digital Filters It has been demonstrated in that if RO
quantizers are used, then irrespective of
whether one or two quantizes are used limit Quantizer case, Fig. 6(b) indicates the
cycles will always occur if situation when two RO quantizers are used.
The larger triangle depicts the stability
region of the linear filter. The vertically
hatched regions correspond to inequality
(14) and denote coefficient values for which
limit cycles will always occur.
In the cross-hatched regions, labeled
"stable," the absence of limit cycles can be
proved by means of frequency domain
criteria derived in . For the inner region in
Fig. 6(b),Jackson [9] has shown that limit
cycles of period 1 and 2 will be absent.
Moreover, using an effective value linear
model he has indicated that limit cycles of
other periods are unlikely to occur.
Computer simulations have confirmed the
absence of limit cycles in this region. It can
be concluded that the use of RO for
quantization will lead in most practical cases
Not only do limit cycles exist in this case, to the occurrence of limit cycles with zero
but a limit cycle will result for every input. In that case the only thing to do is to
nonzero initial condition, because with diminish the amplitude of occurring limit
values of b satisfying the zero state cannot cycles by increasing the number of bits used
be reached from any other state. For the two to represent the signals. The number of bits
RO quantizer case it is possible to determine required for this can be estimated by using
a lower bound for the amplitude of limit upper bounds for the amplitude of limit
cycles cycles.
Three different types of amplitude bounds
for limit cycles have been given.

1) Absolute Bounds: Several authors


have derived bounds on the
where integer (x) denotes the integer part of maximum value of the quantization
x. This means that there must exist at least error for very general types of
one limit cycle with an amplitude larger than digital filters . Application of
or equal to Ami, and it has been shown that
limit cycles with this or larger amplitude are
most likely to occur For I b I < 0 * 5 limit
cycles are also possible with RO quantizers,
and especially limit cycles of periods 1 and
2 can be found to occur for a and b values
inside the horizontally hatched regions in
Fig. 6.This figure summarizes the results
concerning limit cycles when RO quantizers
are used. Fig. 6(a) applies to the one RO
guarantee absence of limit cycles in the
output. However, the bound is very simple
to evaluate and, as proved by Long and
Trick for the second-order section, the
maximum value of the limit cycle will not
exceed this bound by a factor ore than 2.
3) Approximate Bound: Jackson [9] derived
an estimate of the limit cycle amplitude
based on an effective value linear model.
The value obtained with this bound for the
second order section is the same as that in (1
5) and it will be clear that this bound, which
is actually a lower bound, can be violated.
It has been shown [4] that limit cycles with
amplitude larger than A min do indeed
occur.
The results for the case where magnitude
truncation is used for quantization have been
summarized in Fig. 7. Fig. 7(a) applies to
the one MT case and Fig. 7(b) to the two
MT case.
When MT is used for quantization no
coefficient values inside the triangle exist
for which limit cycles occur for alE initial
conditions. For the case of one MT
quantizer, it is possible to prove the absence
of zero-input limit cycles in a rather large
area of the parameter plane shown cross-
hatched in Fig. 7(a). For the remaining area
inside the triangle in this figure the only
result that has been derived analytically is
that limit

these bounds for determining the internal


word length of the filter will guarantee the
absence of zero-input limit cycles in the
output. However, these bounds tend to be
rather pessimistic, resulting in an
uneconomical use of the available number of
bits.
2) Rms Bound: Sandberg and Kaiser [lo]
have derived a bound on the rms value of
the quantization error. This bound gives no
information on the maximum amplitude of a
limit cycle. Therefore, use of this bound for
determining the internal word length will not
Fig. 7(b),while the cross-hatched region in
this figure is the stability region obtained
with the criteria from . Simulations on a
digital computer have shown that no limit
cycles occur in the remaining region, and
that limit cycles of period different from 1 or
2 do not occur for any set of parameters
inside the triangle . Mathematical proof of
this interesting fact is lacking, however. The
case of floating-point arithmetic has been
analyzed by Kaneko and Lacroix. Kaneko,
while excluding the possibilities of overflow
and underflow, proved that limit cycles of
considerable amplitudes can be found with
floatingpoint
arithmetic. Lacroix has studied the limit
cycles that may result from underflow and
he found regions for the coefficients of a
second-order digital filter for which such
limit cycles can be found.
Until now we have only considered time-
invariant quantization nonlinearities. We
will conclude this section by discussing
some randomized and controlled
quantization methods that have been
proposed for suppressing limit cycles. With
a view to combining the advantages of MT
as regards limit cycles and of RO as regards
the quantization errors that occur with a
nonzero input signal (see Section VI),
Buttner and Kieburtz et al. have proposed to
switch between these two quantization
measures randomly, but in such a way that
RO is used during the main part of the time.
cycles of period 1, 2, and 4 are not possible. This does not guarantee a complete
Computer simulations have shown that limit avoidance of limit cycles, however, and the
cycles of other periods only occur for values analysis of the remaining limit cycles is
of a and b inside the two trapezoid areas complicated. Another randomized
indicated in Fig. 7(a) . Not for every set (a, quantization has been proposed by Buttner,
b) inside these areas can limit cycles be in which a small amplitude pseudorandom
found, however. A detailed description of sequence is added to the signal before
the fine structure of these regions is given in quantization. This measure differs from the
The filter with two MT quantizers has been injection of dither as already proposed by
analyzed by Kao, who derived regions Blackman in that it injects the noise just in
where limit cycles of periods 1 and 2 occur. front of the quantizer. Using this method,
These regions are horizontally hatched in the correlated nature of the error sequence is
destroyed and what remains is very similar
to RO noise.
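The finite-state-machine argument for zero-input limit cycles lends itself to a small numerical experiment. The sketch below is an illustration under stated assumptions, not the method of any cited paper: it assumes the recursion y(n) = Q(-a y(n-1) - b y(n-2)) with a single quantizer after the full-precision sum, works in integer multiples of the step size q so that states can be compared exactly, ignores overflow, and uses hypothetical function names. The coefficient a = 0 is chosen only so that the arithmetic can be followed by hand.

```python
import math

def zero_input_response(a, b, x0, quantize, max_steps=10_000):
    """Iterate the zero-input recursion until a state repeats.

    States are pairs (y(n-1), y(n-2)) in units of q. Returns
    (period, amplitude); period 0 means the zero state was reached.
    """
    state = x0
    seen = {}          # state -> index of first occurrence
    history = []
    for n in range(max_steps):
        if state == (0, 0):
            return 0, 0
        if state in seen:                      # periodic: limit cycle found
            cycle = history[seen[state]:]
            return len(cycle), max(abs(s[0]) for s in cycle)
        seen[state] = n
        history.append(state)
        y1, y2 = state
        y = quantize(-a * y1 - b * y2)
        state = (y, y1)
    raise RuntimeError("no repeated state within max_steps")

def round_off_units(v):
    """Round-off in units of q (ties rounded upward, an assumption)."""
    return math.floor(v + 0.5)

def magnitude_truncation_units(v):
    """Magnitude truncation in units of q (toward zero)."""
    return math.trunc(v)

ro_result = zero_input_response(0.0, 0.9, (4, 0), round_off_units)
mt_result = zero_input_response(0.0, 0.9, (4, 0), magnitude_truncation_units)
print(ro_result)   # (4, 4): a limit cycle of period 4 and amplitude 4q
print(mt_result)   # (0, 0): the zero state is reached
```

For this particular coefficient pair and initial state, round-off sustains an oscillation while magnitude truncation lets the state decay to zero, in line with the qualitative difference between RO and MT described above. A single example of this kind says nothing about other coefficient values or initial conditions.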

Controlled quantization (CQ) has been proposed whereby the signal is quantized to a larger or a smaller value depending on the state variables in the filter. An algorithm has been given which guarantees the absence of limit cycles of periods larger than two. A possible implementation of this algorithm is depicted in Fig. 8. The quantizer Q quantizes the signal r(n) upwards or downwards according to (16), depending on the sign of the signal s(n). Upward quantization yields q⌈r(n)/q⌉ and downward quantization yields q⌊r(n)/q⌋, where ⌈x⌉ is the smallest integer greater than or equal to x and ⌊x⌋ is the largest integer less than or equal to x. Fig. 8 shows the particular choice of the control signal s(n) used in this algorithm.

Using a controlled quantization with a different control signal, it is also possible to obtain a filter that is free from limit cycles of any period. This is obtained, for example, by a suitable choice of s(n), but using MT in (16) when |s(n)| < q.

It should be kept in mind that all these randomized and controlled quantization measures require additional hardware to generate the signals that control the quantizer.

DETERMINISTIC INPUT SIGNAL

It was indicated earlier that in the case of a nonzero input signal the effects of the nonlinear operations can be studied by considering the output error signal e(n). In this case it is rather essential to try to distinguish between the effects of the quantization and the effects of overflow, for the following reasons. Quantization is a continuously operating error source, since during every sample period T the filter calculates new signal values that must be quantized before they can replace the old values in the registers. In a properly designed filter, overflow should occur rarely or not at all, because when it does occur it produces very large errors.

Quantization Effects

When no overflows occur, then e(n) is only caused by the errors introduced by the quantizers. When e(n) is periodic, this error is sometimes referred to as a limit cycle. There is an important difference compared with the zero-input situation, however, since in that case the zero state is a solution and limit cycles are thus not necessarily present. If the input signal differs from zero, the output error e(n) cannot, in general, be expected to converge to zero as well, because that would mean that the NLDF has the same output as the ALF. The latter filter can handle the signals with infinite precision, whereas the NLDF has to perform quantizations. It is, therefore, not very useful to generalize the zero-input stability in a straightforward way to incorporate periodic input signals. According to such a definition almost no recursive digital filter would be stable.

Most results concerning nonzero input limit cycles are obtained for the situation of a constant input signal (period 1). With such an input the advantages of magnitude truncation with respect to limit cycle behavior disappear, since at the new equilibrium point the MT quantizer will behave the same as a VT quantizer. If the quantization is controlled, however, such that the region that originally lies at x = 0 is transferred to the equilibrium point, then again absence of limit cycles can be obtained. A second-order section in which such a controlled quantization has been applied is proposed by Verkroost and Butterweck. The randomized quantization in which a random signal is added to the signal before quantization will also for nonzero input result in a randomlike quantization error, but random switching between MT and RO obviously will not accomplish this result.

If the signals on the quantizers take values spread over a large range of the quantization characteristics, the energy of the quantization errors will be spread over the spectrum. Extensive studies of this situation for a fixed-point RO quantizer have shown that a very good description of this error is obtained when it is modeled as a white noise source with uniform distribution function in the interval (-q/2, q/2). In that case the quantization error is referred to as quantization noise. With this model for the quantization errors, the results become independent of the type of the input signal. For this reason the quantization noise will be discussed in Section VII.

If magnitude truncation or floating-point arithmetic is used, the quantization errors will depend to a great extent on the input signal: the sign of the error caused by an MT quantizer is always opposite to the sign of the signal, and with floating-point arithmetic even the magnitude of this error is highly dependent on the magnitude of the signal. Modeling the error signal as a white noise source is, therefore, not possible. Results for quantizers of this type when the input signal is deterministic have not been given so far.

ERRORS DURING NORMAL OPERATION

In the foregoing we have considered the effects of quantization and of overflow assuming very restricted input situations. It was shown that limit cycles and overflow phenomena could occur that may seriously affect the behaviour of the filter. It was indicated that the occurrence of limit cycles depends to a great extent on the type of quantization and on the place where the quantizers are inserted into the filter. Structures have been discussed that enable such unwanted phenomena to be completely avoided. To be specific, quantization limit cycles can be avoided in most wave digital filters by the proper placing of MT quantizers. Most digital filters composed of a cascade or parallel form of second-order sections will be free from zero-input limit cycles if the recursive part of every section is implemented with one MT quantizer. For the complete avoidance of limit cycles in such a filter, random or controlled quantization may be used.

With zero input, overflow oscillations will not occur in wave digital filters and may be avoided in the other filters by selecting a suitable overflow characteristic (e.g., saturation).
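The contrast drawn in this paper between round-off errors, which behave like uniform noise on (-q/2, q/2), and magnitude truncation errors, whose sign is tied to the signal, can be checked with a short experiment. The sketch below is illustrative only: the step size, the seeded pseudorandom test signal, and the sample count are arbitrary assumptions, and the variance q^2/12 used for comparison is simply the variance of a uniform distribution on (-q/2, q/2).

```python
import math
import random

q = 2.0 ** -6                      # assumed quantization step size
rng = random.Random(0)             # fixed seed so the run is repeatable
x = [rng.uniform(-1.0, 1.0) for _ in range(20000)]

def ro(v):
    """Round-off to the nearest multiple of q."""
    return q * math.floor(v / q + 0.5)

def mt(v):
    """Magnitude truncation toward zero."""
    return q * math.trunc(v / q)

def mean(seq):
    return sum(seq) / len(seq)

e_ro = [ro(v) - v for v in x]      # RO error: roughly uniform on (-q/2, q/2)
e_mt = [mt(v) - v for v in x]      # MT error: sign opposite to the signal

ro_mean = mean(e_ro)
ro_var = mean([(e - ro_mean) ** 2 for e in e_ro])
uniform_var = q * q / 12.0         # variance of the uniform noise model

# Average MT error measured along the sign of the signal: close to -q/2.
mt_signed_mean = mean([e * math.copysign(1.0, v) for e, v in zip(e_mt, x)])
```

In this experiment the round-off error has nearly zero mean and a variance close to q^2/12, whereas the magnitude truncation error always opposes the signal and averages about -q/2 along the signal's sign, which is why a plain white-noise model cannot describe it.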
As indicated, with a nonzero input several unwanted phenomena may occur. In wave
digital filters such phenomena can be avoided by choosing a proper overflow
characteristic. A solution for the second-order digital filter has been
discussed that utilizes error feedback. All these measures enable a filter to be
designed that gives an improved performance for the specific input condition.
These filters will be used in practice, however, to filter input signals whose
precise nature is unknown. Such input signals are, for example, speech or
seismic signals. It is, of course, important to know that the filter will not
start oscillating if the input signal is zero for some time, but it is even more
important to know the impact of the quantization and the overflows during normal
operation. When considering the various alternatives that have been mentioned to
avoid nonlinear phenomena, it is, therefore, of great importance to consider
their influence on the filter performance during normal operation as well.
The effects of quantization, and the problem of scaling associated with overflow
during the normal operation of a digital filter, have been discussed at length
by Jackson and in a review paper by Oppenheim and Weinstein. The discussion of
these effects in this paper will, therefore, be rather brief, only emphasizing
some new aspects brought about by the different quantization schemes that have
been proposed in connection with the limit cycle problem.

A mathematically tractable and, therefore, convenient description of this
“normal operation” is to use stochastic processes as input. The stochastic
properties of these processes can then be selected to fit those of the actual
input as closely as possible. A further simplification will often be to assume
the input processes to be Gaussian. If the input is a stochastic process, both
the effects of quantization and of overflow can be described in stochastic
terms. There will then be no limit cycles or overflow oscillations.

Models for the Quantization Errors

Based on the fundamental work by Bennett, Gold and Rader have proposed to model
the quantization errors of a fixed-point RO quantizer as additive white noise,
uncorrelated with the signals, and with variance q^2/12, where q is the
quantization step size. Thus with x(n) the signal to be quantized, the output
Q_RO[x(n)] of the RO quantizer may be described by

    Q_RO[x(n)] = x(n) + r(n)

where r(n) is a white noise process. Although apparent counterexamples exist,
such as limit cycles, this model gives very reliable results in almost all cases
when the filter is driven by a nonzero input signal. Value truncation, being
merely a shifted version of RO, can likewise be modeled if an additional
constant term of amplitude q/2 is added:

    Q_VT[x(n)] = x(n) - q/2 + r(n)

The situation is different for magnitude truncation, where the quantization
errors are highly correlated with the signals to be quantized. Liu and Van
Valkenburg have shown that by subtracting q/2 sign(x(n)) from the quantization
error, the remaining part may again be modeled as a white noise process
uncorrelated with the signal x(n). Thus the MT quantizer may be modeled by

    Q_MT[x(n)] = x(n) - (q/2) sign(x(n)) + r(n)

This model has been shown to give an accurate description of the quantization
errors. Application of this model in systems
with several quantizers leads to very tedious
computations. Therefore, it has been proposed by Claasen et al. to use the
quasilinearization method known from control theory to describe the MT
quantizer. With the assumption of a Gaussian process x(n), this method leads to
the following model for the quantizer output Q_MT[x(n)]:

    Q_MT[x(n)] = (1 - q / (σ_x sqrt(2π))) x(n) + r'(n)

where σ_x^2 is the variance of x(n) and r'(n) is a white noise process with
variance

    q^2 (1/3 - 1/(2π))

uncorrelated with x(n). This model is computationally more tractable than the
model containing the sign term and has been shown to be adequate in almost all
situations. The errors caused by randomly switching between RO and MT are
difficult to model, and so far no quantization noise analysis has been reported
for this case. Errors caused by random quantization as proposed by Buttner can
be modeled much like RO. Controlled quantization has been shown to be very
similar to MT as regards the quantization errors, and here too the
quasilinearization method leads to a rather simple model.
All models considered hitherto hold for a fixed-point number representation.
Quantization noise models also exist for floating-point arithmetic, but this
situation is rather complicated.

Quantization Noise Analysis

The statistical models of the quantization errors that have been mentioned in
the previous section can be used to determine some of the statistical properties
of the total noise in the output signal. The motivation for such an analysis is
to determine the register length necessary for achieving a desired level of
performance of the digital filter. The usefulness of the models for this
analysis stems in large part from the fact that the register length can only be
adjusted in units of one bit. Therefore, an accuracy of about 50 percent in the
determination of the power of the quantization noise is often adequate.
The determination of the output power of the quantization noise also enables a
comparison of different structures with one another. In a cascade of
second-order sections, for example, it allows for the determination of the
pole-zero ordering that gives the minimum quantization noise. The procedures for
the determination of the power of the output noise if RO quantization is used
are well described in the literature. From the discussion it will now be
apparent that similar procedures exist for other types of quantization. With MT
or CQ, part of the output noise will be directly proportional to the output
signal and will, therefore, be less disturbing. This part may be subtracted from
the output error e(n):

    eM(n) = e(n) - α y(n)

where y(n) is the output signal, and α can be chosen so as to minimize the power
in eM(n). Analysis of several filters with MT quantizers has been carried out by
Dehner, and he found that the power of eM(n) was typically on the order of 5-10
times larger than the power of e(n) when RO is used. When two more bits are
taken for the representation of the signals, the quantization step becomes four
times smaller and the power in eM(n) will be decreased by a factor of 16, which
makes MT comparable with RO again as regards quantization noise, while
maintaining its superior stability properties. These results and results
obtained with other implementations allow the conclusion that improvement of
limit cycle behavior must be paid for by an increase of the quantization noise.
Decreasing both requires additional hardware. Both effects must, therefore, be
considered simultaneously and an optimum solution cannot be indicated. It should
be kept in mind that, from a perception point of
view, limit cycles are more disturbing than quantization noise and thus in some
applications suppressing limit cycles may be more important than reducing
quantization noise.
The analysis of overflow with a stochastic input is extremely complicated.
Typically, what may happen is that bursts of overflows occur due to error
propagation after an incidental overflow that has been caused by the input. It
is fairly easy to determine for any signal il(n) in the ALF the expected number
of times that the overflow level is exceeded per unit of time. Based on such
figures, the overflow level p can be chosen in such a way that the number of
overflows caused by the input per unit time will be acceptable (a typical figure
could be an average of one overflow per 10^6 sample times).
Error propagation may cause overflows in the NLDF much more frequently with this
overflow level, and a larger overflow level may be needed for satisfactory
operation. The problem is very complicated and methods of computing the average
number of overflows and the required overflow level have not been reported so
far. The error feedback circuit proposed for eliminating overflow phenomena in
the second-order digital filter of Fig. 4 also provides a better overflow
behavior with a stochastic input.

CONCLUSION

A number of different effects have been discussed that occur in recursive
digital filters due to quantization and overflow. For second-order sections and
wave digital filters these effects have been studied in detail and a number of
alternative solutions have been described to eliminate or to reduce these
unwanted effects. A proper choice between these solutions will greatly depend on
the specific application for which the digital filter is intended. Many
structures have been proposed in the literature because of their advantages with
respect to quantization noise. These have not been considered here.
Investigation of other quantization and overflow effects needs to be done,
however, to be able to compare the overall behavior of these structures. Some of
the methods used for studying limit cycles and overflow phenomena are applicable
to more general filter structures too, but until now little effort has been
spent in this direction.

REFERENCES

[1] R. B. Kieburtz, “An experimental study of roundoff effects in a tenth-order
recursive digital filter,” IEEE Trans. Commun. (Concise Papers), vol. COM-21,
pp. 757-763, June 1973.

[2] -, “Rounding and truncation limit cycles in a recursive digital filter,”
IEEE Trans. Acoust., Speech, Signal Processing (Corresp.), vol. ASSP-22, p. 73,
Feb. 1974.

[3] M. Buttner, “Some experimental results concerning randomlike noise and limit
cycles in recursive digital filters,” Nachrichten Technische Zeitschrift,
pp. 402-406, Nov. 1975.

[4] S. R. Parker and S. F. Hess, “Limit-cycle oscillations in digital filters,”
IEEE Trans. Circuit Theory (Special Issue on Active and Digital Networks),
vol. CT-18, pp. 687-697, Nov. 1971.

[5] T. A. C. M. Claasen, W. F. G. Mecklenbrauker, and J. B. H. Peek, “Some
remarks on the classification of limit cycles in digital filters,” Philips Res.
Rep., vol. 28, pp. 297-305, Aug. 1973.

[6] G. A. Maria and M. M. Fahmy, “Limit cycle oscillations in a cascade of
first- and second-order digital sections,” IEEE Trans. Circuits and Systems,
vol. CAS-22, pp. 131-134, Feb. 1975.

[7] C. Kao, “An analysis of limit cycles due to sign magnitude truncation in
multiplication in recursive digital filters,” in Proc. 5th Asilomar Conf.
Circuit and System Theory, Pacific Grove, CA, 1971, pp. 349-253.

[8] T. Kaneko, “Limit-cycle oscillations in floating-point digital filters,”
IEEE Trans. Audio Electroacoust., vol. AU-21, pp. 100-106, Apr. 1973.

[9] L. B. Jackson, “An analysis of limit cycles due to multiplication rounding
in recursive digital filters,” in Proc. 7th Annu. Allerton Conf. Circuit and
System Theory, Monticello, IL, Oct. 1969, pp. 69-78.

[10] I. W. Sandberg and J. F. Kaiser, “A bound on limit cycles in fixed-point
implementations of digital filters,” IEEE Trans. Audio Electroacoust.,
vol. AU-20, pp. 110-112, June 1972.

[11] J. L. Long and T. N. Trick, “A note on absolute bounds on limit cycles due
to roundoff errors in digital filters,” IEEE Trans. Audio Electroacoust.,
vol. AU-21, pp. 27-30, Feb. 1973.

[12] -, “A note on absolute bounds on quantization errors in fixed-point
implementations of digital filters,” IEEE Trans. Circuits and Systems (Lett.),
vol. CAS-22, pp. 567-570, June 1975.