
1272 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-34, NO. 12, DECEMBER 1986

drawn with replacement from $C$. We want to show that if $A \neq B$, then the sum of the numbers in $A$ is different from the sum of the numbers in $B$. For this, let

$$A = (a_1, a_2, a_3); \quad a_1 \le a_2 \le a_3; \quad a_1, a_2, a_3 \in C$$

$$B = (b_1, b_2, b_3); \quad b_1 \le b_2 \le b_3; \quad b_1, b_2, b_3 \in C$$

$$s_a = a_1 + a_2 + a_3 \quad \text{and} \quad s_b = b_1 + b_2 + b_3.$$

When $A \neq B$, then one of the following three conditions occurs:

i) $a_3 \neq b_3$; $a_1, a_2, b_1, b_2$ any
ii) $a_3 = b_3$; $a_2 \neq b_2$; $a_1, b_1$ any
iii) $a_3 = b_3$; $a_2 = b_2$; $a_1 \neq b_1$.

Condition i) implies $s_a \neq s_b$ since, from (A.1),

$$a_3 > b_3 \Rightarrow a_3 > 3b_3 \ge s_b \Rightarrow a_3 > s_b \Rightarrow a_1 + a_2 + a_3 > s_b \Rightarrow s_a > s_b$$

and similarly, $a_3 < b_3 \Rightarrow s_a < s_b$. On the other hand, ii) implies $s_a \neq s_b$ since if $a_3 = b_3$ and $a_2 > b_2$, then

$$a_2 > 3b_2 > b_1 + b_2 \Rightarrow a_2 + a_3 > s_b \Rightarrow s_a > s_b$$

and similarly, $a_3 = b_3$ and $a_2 < b_2 \Rightarrow s_a < s_b$. Finally, iii) implies $s_a \neq s_b$ since if $a_3 = b_3$, $a_2 = b_2$, and $a_1 > b_1$, then

$$a_1 > 3b_1 \Rightarrow a_1 + a_2 + a_3 > 3b_1 + b_2 + b_3 \Rightarrow s_a > s_b$$

and similarly, $a_3 = b_3$, $a_2 = b_2$, and $a_1 < b_1 \Rightarrow s_a < s_b$, concluding the proof.

Example

Let $N_t = 3$ and $C = \{0, 1, 4\}$, satisfying condition (A.1). The 10 ordered triples that can be formed by drawing, with replacement, integers from $C$ are (0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1), (0, 0, 4), (0, 1, 4), (1, 1, 4), (0, 4, 4), (1, 4, 4), and (4, 4, 4). The sums of their three elements are, respectively, 0, 1, 2, 3, 4, 5, 6, 8, 9, and 12. It is clear in this example that the three numbers in any ordered triple are uniquely determined by the value of their sum.

REFERENCES

[1] R. J. Westcott, "Investigation of multiple FM/FDM carriers through a satellite TWT operating near saturation," Proc. IEE, vol. 114, June 1967.
[2] J. C. Fuenzalida, W. L. Cook, and O. Shimbo, "Time-domain analysis of intermodulation effects caused by non-linear amplifiers," COMSAT Tech. Rev., vol. 3, Spring 1973.
[3] -, "The intermodulation analyser (CIA4) user's manual," COMSAT Tech. Memo. CL-48-72, Sept. 1972.
[4] P. Y. K. Chang and R. J. F. Fang, "Intermodulation in memoryless nonlinear amplifiers accessed by FM and/or PSK signals," COMSAT Tech. Rev., vol. 8, Spring 1978.
[5] IMSL Mathematical Subroutine Library, IMSL Inc., Houston, TX.

Dual Sign Algorithm for Adaptive Filtering

C. P. KWONG

Abstract-A new algorithm, which is a variant of the sign algorithm, is proposed for the adaptive adjustment of an FIR digital filter with an aim of improving the original convergence characteristics, yet retaining the advantage of hardware simplicity. Based on a recently proposed theory for the sign algorithm, a practical design method is derived for the new algorithm, and it is shown by computer simulation that the new algorithm in fact performs significantly better than the original algorithm.

I. INTRODUCTION

In many adaptive filtering applications, such as echo cancellation, equalization, and noise cancelling, the stochastic iteration algorithm (SIA) is well known and widely implemented [1]-[3]. Compared to other more sophisticated methods such as Kalman filtering, the SIA requires less hardware and, hence, is less expensive. An even simpler algorithm is the sign algorithm (SA), in which only the polarity information of the error signal is used for the filter's coefficient update [1].

However, it has been shown that, in general, either the convergence of the SA is very slow if the step size of the adaptive algorithm is adjusted to give an acceptable residual error, or, if the step size is increased for fast convergence, the residual error may not be acceptable [1]. Such characteristics of the SA seriously impede its practical implementation in many applications. In this work, a variation of the SA is proposed which overcomes this weakness, yet offers a similar realization simplicity. It is shown in the following that the proposed algorithm operates as if two SA's are working in cooperation. Then, the results in [4] for the SA can be applied to the design of the present algorithm. Also, because of this relation to the SA, the proposed algorithm is named the dual sign algorithm (DSA) for convenience.

II. THE DUAL SIGN ALGORITHM

Many applications of adaptive filtering, including the above-mentioned ones, can be considered as solving an identification problem (Fig. 1). In Fig. 1, the tap-gain vector of the FIR digital filter at the $(k + 1)$st iteration, $\mathbf{c}(k + 1) = [c_0(k + 1) \; c_1(k + 1) \; \cdots \; c_{N-1}(k + 1)]$, is updated by the relation [4]

$$\mathbf{c}(k + 1) = \mathbf{c}(k) + K_{SIA}\, r(k)\, \mathbf{a}(k) \tag{1}$$

in the SIA and

$$\mathbf{c}(k + 1) = \mathbf{c}(k) + K_{SA}\, (\operatorname{sign} r(k))\, \mathbf{a}(k) \tag{2}$$

in the SA where sign denotes the limiting operation as shown. Equivalently, the error $r(k)$ is quantized to 1 bit via the sign operation.

Here we propose a simple modification of the SA (i.e., the DSA) for improving its convergence characteristics. Instead of representing $r(k)$ by a two-level signal, we quantize $r(k)$ to a four-level, i.e., 2-bit signal according to the quantization scheme as shown in Fig. 2. Then the adaptation relation becomes

$$\mathbf{c}(k + 1) = \mathbf{c}(k) + K_D\, \hat{r}(k)\, \mathbf{a}(k) \tag{3}$$

where

$$\hat{r}(k) = \begin{cases} L_1, & 0 \le r(k) < r_T \\ -L_1, & -r_T \le r(k) < 0 \\ L_2, & r(k) \ge r_T \\ -L_2, & r(k) < -r_T. \end{cases} \tag{4}$$

There is a strong intuition in extending the 1-bit quantization in the SA to a 2-bit quantization in the DSA. Motivated by the use of a time-varying step size in previous studies [2] for improving the convergence properties of an SIA, it is conjectured that if quantization is applied to the error signal, a

Paper approved by the Editor for Transmission Systems of the IEEE Communications Society. Manuscript received July 31, 1985; revised March 1, 1986.
The author is with the Department of Electronics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong.
IEEE Log Number 8611283.
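
The coefficient updates of Section II, relations (1)-(3) together with the four-level quantizer (4), can be sketched in Python as follows. This is only an illustrative sketch: the function names, the default level `l2=4.0`, and the convention that a zero error maps to the positive level are assumptions made here, not values taken from the paper. The input vector is written `a` and the tap-gain vector `c`.

```python
import numpy as np


def hard_sign(r):
    """Limiting (sign) operation of the SA; zero maps to +1 (an assumption)."""
    return 1.0 if r >= 0.0 else -1.0


def dsa_quantize(r, r_t, l1=1.0, l2=4.0):
    """Four-level (2-bit) quantization of the error r with threshold r_t.

    l1 and l2 are the lower and higher quantization levels; the default
    l2 is a hypothetical value chosen only for illustration.
    """
    if r >= r_t:
        return l2
    if r >= 0.0:
        return l1
    if r >= -r_t:
        return -l1
    return -l2


def sia_update(c, a, r, k_sia):
    """Stochastic iteration algorithm: full-precision error."""
    return c + k_sia * r * a


def sa_update(c, a, r, k_sa):
    """Sign algorithm: only the polarity of the error is used."""
    return c + k_sa * hard_sign(r) * a


def dsa_update(c, a, r, k_d, r_t, l1=1.0, l2=4.0):
    """Dual sign algorithm: polarity plus one bit of size information."""
    return c + k_d * dsa_quantize(r, r_t, l1, l2) * a
```

With `l1 = 1`, a call such as `dsa_update(c, a, r, k_d, r_t)` gives the same result as `sa_update(c, a, r, k_d)` whenever the magnitude of `r` is below `r_t`, and the same result as `sa_update(c, a, r, l2 * k_d)` when `r` is at or above `r_t`, which is the two-step-size behaviour the text describes.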

0090-6778/86/1200-1272$01.00 © 1986 IEEE

Authorized licensed use limited to: Chinese University of Hong Kong. Downloaded on September 18,2020 at 08:23:16 UTC from IEEE Xplore. Restrictions apply.
higher quantized magnitude should be used for the large errors during startup to increase the convergence rate, while a lower quantized magnitude should be used for the small errors when approaching convergence to reduce the residues. This is exactly what (4) performs.

[Fig. 1. Adaptive identification of an unknown system.]

[Fig. 2. The quantization scheme of the DSA.]

An alternative interpretation of (3) and (4) may be taken as follows. Let $L_1 = 1$; then (3) is identical to (2) when the magnitude of $r(k)$ is smaller than $r_T$, and hence the DSA is working as an SA with a step size $K_D$. On the other hand, if the magnitude of $r(k)$ is larger than or equal to $r_T$, the DSA is working as an SA with a step size $L_2$ times $K_D$. Therefore, the DSA can be considered as two SA's with two different step sizes. The transition from one SA to the other is determined by the threshold $r_T$. Then it is seen that the DSA has in a sense resembled a two-step-size SIA. However, it should be noted that in the two-step-size SIA, we represent $r(k)$ by a theoretically infinite precision number, while in the DSA, $r(k)$ is highly quantized for reducing hardware (as in the SA, a full-fledged A/D converter is replaced by a hard limiter and a sampler [1]). Moreover, while only polarity information of the error signal $r(k)$ is used in the SA, size information is incorporated in the DSA.

Notice that Duttweiler [5] has studied the so-called power-of-2 multiplier as a nonlinear operation applied to the error signal:

[Equation (5) is not legible in the source.]

where $[\,\cdot\,]$ denotes the greatest integer less than or equal to the argument. Hence, the magnitude of $r(k)$ is quantized to one of the values 1, 2, 4, 8, etc., and that is why the multiplication is named power-of-2. It is seen that both (4) and (5) execute crude quantization to the error signal. However, there are fundamental differences between the two algorithms. First, (4) is only a four-level quantization, while the number of quantization levels in (5) depends on the magnitude of the error. Second, the performance of the DSA, as will be shown, is dependent on the optimization of the threshold $r_T$; in the power-of-2 multiplication scheme, the way determining the transition from one level to the other is fixed.

Due to the very nature of the DSA as described above, it will be shown that the design of the DSA, i.e., the determination of the values of $K_D$, $L_1$, $L_2$, and $r_T$ will need extensive knowledge of the SA. Fortunately, a detailed study on the SA has been carried out in [4] and [6], and the results therein provide much insight and a practical solution to the present problem.

III. THE DESIGN OF THE DSA

In the following analysis, the basic assumptions are identical to that presented in [4]. Particularly, our analysis will be heavily based on the results in Section III of the same paper. Furthermore, we shall use a similar set of notations in order to avoid confusion.

Referring to Fig. 1, we define $\varepsilon(k) = e(k) - \hat{e}(k)$; $\sigma_\varepsilon^2(k) = E\{\varepsilon^2(k)\}$; $\sigma_u^2 = E\{u^2(k)\}$; and $R(k) = \sigma_\varepsilon(k)/\sigma_u$. Then the dominant specification in the design of either the SA or the DSA is the residue $R(\infty)$. When $R(\infty)$ is specified, the step size $K_{SA}$ in the SA can be found from the expression [4]

[Equation (6) is not legible in the source.]

where $\sigma_a^2 = E\{a^2(k)\}$. To attain the same $R(\infty)$, it is reasonable that the step size $K_D$ in the DSA should be set to the same value of $K_{SA}$ determined by (6) if $L_1$ is set to 1.

While the lower quantization level in (4) is designed to provide an acceptable residual error, the higher quantization level should be designed such that for a high $R(0)$ at the start, $R(k)$ is reduced to a sufficiently small value for the second SA (with a step size $K_D$) to take over, and the corresponding number of iterations $k$ should be as small as possible. As mentioned previously, this corresponds to the design of an SA with a larger step size $L_2$ times $K_D$. From [4], for the region $R(0) > R_d \ge 1$, the number of iterations needed to take $R(0)$ to $R_d$ is approximately

[Equation (7) is not legible in the source.]

which is inversely proportional to $L_2 K_D$. However, we cannot increase $L_2 K_D$ without bound, since for convergence, the following inequality must be satisfied [6]:

[Equation (8) is not legible in the source.]

From (8) the maximum allowable value of $L_2 K_D$ is

[Equation (9) is not legible in the source.]

For this value of $L_2 K_D$, $N_d$ is given by

$$N_d = R(0) - R_d. \tag{10}$$

In practice, $L_2 K_D$ is chosen so that $N_d$ is minimized and at the same time a safety margin is allowed in fulfilling (8). Given $L_2 K_D$, the value of $R(\infty)$ for this step size can be calculated approximately by

$$R(\infty) = \ldots \tag{11}$$

[Only the terms $L_2 K_D N \sigma_a^2$ and $\sigma_u$ of equation (11) are legible in the source.]

which can be regarded as the $R(0)$ for the second SA (with a step size $K_D$). Then the number of iterations for the convergence of the second SA can be found using the results of [4]. Furthermore, assuming that the two SA's are working independently about the threshold $r_T$, the total number of iterations for the convergence of the DSA may be taken to be the sum of the individual number of iterations of the two SA's. In view of this, a problem of "optimum sharing" between the two convergence characteristics exists, which is determined by the size of $L_2 K_D$. In other words, we can choose a larger $L_2 K_D$ for a faster convergence of the first SA; but since the resulting $R(0)$ for the second SA is then larger, it takes a longer time for the second SA to converge. Conversely, if we use a smaller $L_2 K_D$, the first SA converges more slowly, but the second SA will take a shorter time to converge. If $R(0)$ (for the DSA) can be accurately estimated, the "optimum sharing" may be found by actually calculating the total number of iterations for a few trials of the value of $L_2 K_D$. If the uncertainty of $R(0)$ is, however, high, a larger $L_2 K_D$ should be preferable, since it will ensure a fast initial convergence for a possibly large $R(0)$.

Since exact knowledge of $R(0)$ is not available in general, it is interesting to see how a deviation from the estimated $R(0)$ will affect the required number of iterations. Suppose the actual $R(0)$ is greater than the estimated $R(0)$ by $\Delta R(0)$. Then from (7), the number of iterations for the first SA (with a step size $L_2 K_D$) is increased by the value

[Equation (12) is not legible in the source.]

It is then obvious that a larger $L_2 K_D$ will help to reduce the effect of the uncertainty of $R(0)$ on convergence.

The remaining parameter to be determined in the DSA is the threshold setting $r_T$. Notice that $R(k)$ is defined as $R(k) = \sigma_\varepsilon(k)/\sigma_u$ and, hence, $R^2(k) = \sigma_\varepsilon^2(k)/\sigma_u^2$. Now at the threshold, $R(k)$ has assumed a value given by (11), and in practice we usually add a few decibels to this value and use the resultant (denoted $R_{dt}$) in the design. Hence we obtain, at the threshold,

$$\sigma_\varepsilon^2(k) = R_{dt}^2\, \sigma_u^2. \tag{13}$$

But $\sigma_\varepsilon^2(k)$ can be shown to be equal to $E\{r^2(k)\} - \sigma_u^2$ and therefore, from (13),

$$E\{r^2(k)\} = (1 + R_{dt}^2)\, \sigma_u^2 \tag{14}$$

which is the mean square value of $r(k)$ at the threshold. A rational choice of $r_T$ is thus given by

[Equation (15) is not legible in the source.]

IV. SIMULATION RESULTS

Computer simulation is used to compare the performance of the SIA, SA, and DSA when they are employed in an adaptive FIR filter for the identification of an unknown system (Fig. 1) which is now considered to be a typical communication channel. The channel impulse response is taken from [7] and is given by $\mathbf{g} = [0.33 \;\, 0.67 \;\, 1.0 \;\, 0.67 \;\, 0.33]$. The input samples $a(k)$ are statistically independent binary numbers having values 1 or $-1$ with equal probability, and the additive noise sequence is white Gaussian with zero mean and variance $\sigma_u^2$. The performance measure for each algorithm is taken to be the error defined by

$$X(k) = (\mathbf{g} - \mathbf{c}(k)) \cdot (\mathbf{g} - \mathbf{c}(k))' \tag{16}$$

where the dot denotes inner product and the prime denotes vector transpose. Moreover, the signal/noise ratio is given by $20 \log_{10}(1/\sigma_u)$ dB, which is set to 30 dB.

[Fig. 3. Convergence performance of the SIA, SA, and DSA (values of every ten iterations are used to plot the curves). Horizontal axis: number of iterations.]

Fig. 3 shows the result of the simulation tests on the convergence characteristics of the three algorithms. Identical noise sequence and input signal sequence are applied to each of the algorithms in a test, and ten such tests are performed with different sequences. The resulting $X(k)$ are averaged and then their logarithms are taken as the final result shown. For a desired $R(\infty)$ of $-18$ dB, the step sizes $K_{SIA}$ and $K_{SA}$ are calculated to be, respectively, 6.22 × 10^[…] and 1.61 × 10^[…] (the exponents are not legible in the source). In the DSA, the step size $K_D$ is set to be equal to $K_{SA}$ and $L_1$ is set to 1 as discussed above. The maximum value of $L_2 K_D$ is calculated by (9) to be 0.04. We now choose $L_2 K_D = 0.03$ and the corresponding $R(\infty)$ is 9.787 dB. Adding 2 dB to $R(\infty)$ we get $R_{dt} = 11.787$ dB, and from (15), $r_T$ is given by 0.128.

Fig. 3 shows that the SA converges very slowly towards its final residual error compared to the SIA. Although the speed of convergence can be increased by using a larger step size (= 0.03 as shown), it will result in a larger residual error. The performance of the DSA is clearly much better than that of the SA in terms of convergence. Comparing with the SIA, the DSA converges even faster during startup (as predicted) and attains the same residue within a similar number of iterations.

V. CONCLUSIONS

A simple modification of the sign algorithm has been proposed and analyzed in the context of two sign algorithms working in cooperation. A practical design method has been derived using the results of a recent study. It has been shown by computer simulation that the convergence characteristic of the new algorithm has a great improvement over the sign algorithm and is comparable to the stochastic iteration algorithm.

REFERENCES

[1] N. A. M. Verhoeckx, H. C. van den Elzen, F. A. M. Snijders, and P. J. van Gerwen, "Digital echo cancellation for baseband data transmission," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27, pp. 768-781, Dec. 1979.
[2] R. D. Gitlin, J. E. Mazo, and M. G. Taylor, "On the design of gradient algorithms for digitally implemented adaptive filters," IEEE Trans. Circuit Theory, vol. CT-20, pp. 125-136, Mar. 1973.
[3] B. Widrow et al., "Adaptive noise cancelling: Principles and applications," Proc. IEEE, vol. 63, pp. 1692-1716, Dec. 1975.
[4] N. A. M. Verhoeckx and T. A. C. M. Claasen, "Some considerations on the design of adaptive digital filters equipped with the sign algorithm," IEEE Trans. Commun., vol. COM-32, pp. 258-266, Mar. 1984.
[5] D. L. Duttweiler, "Adaptive filter performance with nonlinearities in the correlation multiplier," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 578-586, Aug. 1982.


[6] T. A. C. M. Claasen and W. F. G. Mecklenbrauker, "Comparison of the convergence of two algorithms for adaptive FIR digital filters," IEEE Trans. Circuits Syst., vol. CAS-28, pp. 510-518, June 1981.
[7] F. R. Magee and J. G. Proakis, "Adaptive maximum likelihood sequence estimation for digital signaling in the presence of intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-19, pp. 120-124, Jan. 1973.

An Approach to Programmable Signal Processor Assemblers and Simulators

TERESA H.-Y. MENG AND DAVID G. MESSERSCHMITT

Abstract-A method of programming an assembler and simulator for a programmable digital signal processor (PDSP) is proposed. This method is general and simple to execute, and makes use of the UNIX® utilities awk, sed, and grep to translate a source program, consisting of assembly instructions and C statements, to either machine code for the assembler or a C program for the simulator. The approach also provides the advantages that little programming effort is required to generate a new assembler and simulator for another processor instruction set. It also allows a PDSP to be simulated in the context of a larger system containing other PDSP's, other hardware, and an external environment. A disadvantage is that the approach is applicable only to the UNIX operating system.

I. INTRODUCTION

There are now available several single-chip programmable digital signal processors (PDSP's) [1]-[6]. The programming of a new assembler and simulator for a new PDSP instruction set is generally considered to be a major undertaking. We describe here an approach that avoids extensive programming effort, and allows an assembler and simulator to be generated in typically less than a day's effort. In addition, the approach allows the PDSP to be simulated in the context of a simulation of the system in which it is embedded, including the simulation of multiple PDSP's, associated hardware, and the external environment.

We avoid most of the programming by combining existing UNIX® operating system [7] utilities. UNIX utilities translate a source program for the target PDSP to machine code for the assembler and generate a C program for the simulator. Simulation is accomplished by compiling and executing the simulator C program that was generated. The main advantage of the approach is its universality and the ease with which new assemblers and simulators for different PDSP's can be generated from an existing assembler and simulator. Also, debugging C statements (not executed by the PDSP's) can be added to the assembly source file, and the debugging of assembly code is aided by the symbolic debugging facilities of UNIX.

The TI TMS32010 PDSP was used as an example. Starting with the TMS32010 assembler and simulator, it took only about 5 h to generate a simulator for the Fujitsu MB8764 and 10 h for the TMS32020. In Section II we describe the basic approach, and Section III gives conclusions. A basic familiarity with UNIX utilities and the C language is assumed.

II. GENERAL STRUCTURE AND METHODOLOGY

Our approach is to consider the source file as text and avoid extensive programming by using UNIX-provided text editors to modify the contents. The commands borrowed from UNIX are:

grep  A family of commands that search the input files for lines matching a pattern and copy those lines to the standard output.
sed   A noninteractive stream text editor.
awk   A pattern scanning and processing language which searches a set of files for patterns and performs specified actions upon lines or fields of lines which contain instances of those patterns.

A. Implementation of the Assembler

An assembler generates machine code of a target PDSP corresponding to a given assembly language program. Our assembler is a mnemonic translator with no facilities for macros, link loading, etc. In our case, the source program as a whole, including assembly instructions and optional added C statements (to be included in the simulation), is regarded as lines of character strings. The grep command extracts every line that is recognized as a particular assembly mnemonic (this is repeated for each mnemonic), sed translates the mnemonic to an appropriate machine code, and awk generates the hexadecimal representation to be stored in the program ROM. Indirect addressing, labels, constant definitions, expressions, and dynamic memory allocations are simple ways to represent addresses of the data RAM in an instruction, as provided by assembly directives, and are realized by some special awk commands. Imbedded C statements (for the simulation) are ignored by the assembler.

The three major steps in assembly are described in more detail in the following three subsections.

1) Separation of Assembly Instructions from C Statements: The assembler first removes imbedded C statements. Our convention is to add an identification "/**/" at the start of each line containing a C statement, and these lines are removed by the sed command "/\/\*.*\*\//d".

2) Label Translations: Branch instructions require that labels be translated into exact program locations. This can be accomplished by parsing the source program twice. In the first pass, each assembly instruction is assigned a program address (by an awk program) which is placed at the beginning of each line. This same awk program generates a sed command file, which contains statements like

    s/label/address/.

On the second pass this command file is used to replace the branch destination in the source file with the corresponding address. Subroutine calls are treated in the same way to obtain the addresses of subroutines. More complicated branching expressions such as argument passing in subroutines or destination addresses relative to the instruction location are easily implemented by an awk command.

3) Machine Code Generation: In the final step, each assembly instruction goes through a pipelined sequence of commands such as

    egrep 'instr1|instr2|..' input-file | sed -f sed.instr | awk -f awk.instr

where instr1, instr2, etc., are the assembly mnemonics that have the same instruction field format. Because the number of different instruction formats is usually quite small (seven for the TMS32010), the assembling process needs only a few lines of command sequences, as described above.

The egrep command passes only the lines in the input file

Paper approved by the Editor for Transmission Systems of the IEEE Communications Society. Manuscript received September 30, 1985; revised April 15, 1986. This work was supported by the Semiconductor Research Corporation.
The authors are with the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720.
IEEE Log Number 8611280.
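
The three assembly steps of Section II-A (removing the marked C statements, two-pass label translation, and machine code generation) can be mimicked in a few lines of Python. This is a minimal sketch and not the authors' grep/sed/awk implementation: the toy syntax (`name:` labels, one operand per instruction), the mnemonics `LOAD`, `ADDM`, `JUMP`, and the 4-bit opcode plus 12-bit operand word layout are all hypothetical and do not describe the TMS32010 or any other real processor.

```python
import re

# Hypothetical opcode table for a made-up 16-bit instruction format.
OPCODES = {"LOAD": 0x1, "ADDM": 0x2, "JUMP": 0xF}


def strip_c_statements(lines):
    """Step 1: drop lines carrying a C comment marker such as '/**/'.

    Mirrors the effect of the sed address pattern quoted in the text;
    blank lines are dropped as well to keep the toy parser simple.
    """
    return [ln for ln in lines if ln.strip() and not re.search(r"/\*.*\*/", ln)]


def translate_labels(lines):
    """Step 2: two passes over the source.

    Pass 1 assigns an address to every instruction and records labels;
    pass 2 replaces each label reference by its address and prefixes
    every line with its own address.
    """
    table = {}
    numbered = []
    for addr, ln in enumerate(lines):
        m = re.match(r"\s*(\w+):\s*(.*)", ln)
        if m:
            table[m.group(1)] = addr
            ln = m.group(2)
        numbered.append((addr, ln.strip()))
    out = []
    for addr, ln in numbered:
        for label, target in table.items():
            ln = re.sub(rf"\b{re.escape(label)}\b", str(target), ln)
        out.append(f"{addr} {ln}")
    return out


def encode(lines):
    """Step 3: turn 'address mnemonic operand' lines into hex words."""
    words = []
    for ln in lines:
        _addr, mnemonic, operand = ln.split()
        word = (OPCODES[mnemonic] << 12) | (int(operand) & 0x0FFF)
        words.append(f"{word:04X}")
    return words


def assemble(source_lines):
    """Run the three steps in sequence."""
    return encode(translate_labels(strip_c_statements(source_lines)))
```

For instance, a four-line source consisting of a marked C statement followed by `loop: LOAD 5`, `ADDM 6`, and `JUMP loop` is reduced to three instructions, and the branch operand `loop` is replaced by address 0 before encoding.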

0090-6778/86/1200-1275$01.00 © 1986 IEEE
