You are on page 1of 7
Techniques for High-Speed Implementation of Nonlinear Cancellation Sanjay Kasturia, Member. IEEE, and Jack H. Winters, Senior Member, IEEE Absract—Nonlinear cancellation (NLC) can significa sauce intersymbol interference ISD) in Tongrhaul direct tion fiberoptic systems and can thereby “lsperson-imited data rate a bby NLC Is achieved by suber previously detected symbols, {ulres previous deci snd days in the feedback ‘ata rate ofa detector operating with NLC. In this paper, we resent techniques for breaking the bottleneck caused Dy the Feedback lop. First, we show boo ta simplify the loop to avoid highspeed hitching of analog signals by using multiple dct son elements, each with different threshold evel. We then sow how (o se lookahead computation to increase the delay permissible inthe feedback loop. These techniques permit NLC {o be implemented in integrated-ireuit form at rates Timited only bythe detector sithing speeds. These techakques are aso ‘reful in other applications requiring speedup of fedback loops With decision elements tn the loop (eg. decision feedback uatzers) 1. Inrropvcrion fONLINEAR cancellation (NLC) can greatly reduce patter dependent intersymbol interference (ISD in a receiver by sublracting interference caused by previously detected symbols. With the rapid development of optical Amplifiers, ISI becomes a major factor limiting data rates in long-haul direct detection fiber-optic systems: NLC of. fers the potential of increasing the maximum data rate of these systems. This potential can only be realized if we devise techniques for implementing NLC. preferably in Integrated eieuits, a the high rates desired (gigabits /s). Previous papers [1], [2] have studied the reduction of ISI achievable with NLC. It was shown in [2} that NLC ‘ean more than double the maximum data rate of gigabits per second (Gb/s) lightwave systems by canceling post ‘cursor ISP. Because NLC is # nonlinear technique. it can handle ISL that arses from a variety of impairments, such as transmit laser chirp, chromatic dispersion, polarization ‘ispersion, and nonideal receiver frequency response. [NIC can also reduce ISI caused by bandwidth limitations The attr ae wih the Netown Syuers Resch, Deporte, "WSt caused by symbols preceding the one curemy atthe decision el tice yt caning he decom tres snlor 8) ing» ely pia egeiae 3). 18. a el of the detecor itself (when itis operated at igh data fats), thereby increasing the maximum wsef operating fate ofthe detector. However, NLC, as described it [1], {2}, is dificult wo implomett at high data rates. Fig. | shows a simple nonlinear canceler. We see thatthe can celer in Fig. | requires an analog signal tobe switche at the data rate in the feedback loop. Such a system i i fcul wo implement because switching tasiens tend 0 comipt the analog signal. In addition, propagation delays fof elements in the feedback loop limi the maximum data rate ofthe detector with NLC. Even though the compar~ tor may beable to operate faster the dats rate is Kime tothe inverse ofthe maximum propagation and process- ing delay around the feedback loop because the signal being fed back must ative atthe comparator before the rextayibol. This speed limitation, imposed bythe pre ence of feedback. canbe expressed as an iteration bound? {Gee Pah and Messerschmit []). For NLC tobe useful in hgh data rate systems, these problems must be ove In his paper, we develop implementation techniques for NLC that eliminate these problems atthe expense of increased circuit complenity. Integrated cigcuits for de Cision feedback based equalization at rates of several tizabits/s willbe easly implemented Using the tech- niques developed in this paper. Multiple detectors Q', ubteo og deen elo and CN by rege {oral be compan ite save op 1) where W is the numberof bits fed hack), each witha dif- ferent threshold level, are used to avoid the switching of analog signals. The selection of the threshold level (ie the feedback signal) occurs after threshold detection. This is accomplished by choosing the output of the detector corresponding to the correct threshold rather than by Switching analog signals. The selection algorithm is then iterated L times so that the maximum delay permissible in ‘generating the feedback signal is increased from one sym bol period to L symbol periods. We then pull some of the ‘computation outside the loop (using lookahead computa tion) to minimize propagation delay inthe feedback loop. With these techniques, NLC can be easily implemented ‘on the detector chip to inerease the maximums data rate of Gb/s communication systems andior detectors In Section Il, we briefly describe nonlinear cancella- tion, and outline some of the problems of implementing NLC at high data rates. Techniques for overcoming these problems are developed in Section Il and are ilustated ‘with the example of a simple one-tap NLC. Section IV briefly describes how to generalize the techniques to more complex NLC's and presents some more exatnples. Sec- tion V summarizes this paper. I. Tir Nowuisear Canceten Fig. 1 shows an implementation of NLC as discussed in (2). The threshold detector compares the received si: nal 10 the threshold (V,) to determine the output bit dy. Before making the comparison, the NLC generates a cor rection signal f(d) based on N previously detected bits? ‘ “This signal is fed back to the input to compensate for the ISI caused by bits transmitted price tothe bit currently at the detector. We will make the assumption thatthe pre vious bts have been detected correctly, and will fo con venience, use a, instead ofthe more correct d, to denote the detected bts. The reader should note thatthe assum tion of correct prior decisions ensures that NLC will re duce ISL. Whereas this assumption is not entieely correc. NLC does significantly improve performance in spite of ‘occasional (though rage) errors in the Feedback path. This issue of errors in the feedback path has been addressed by others [7], [8], and we will not dwell on it any further. Before proceeding further, we will modify Fig. 1 by ‘moving the feedback signal tothe threshold input of the ‘comparator as shown in Fig. 2. This is done to avoid in teraction between the received and feedback signals. Here we see thatthe feedback signal caa take one of many val ues, none of which is related tothe voltages used t0 rep resent the logic values, At low data rates, such a signal can be generated by a random access meinory followed. by a digital-to-analog converter (see [9]). However, this, 4,9) py aia oncom Je ftd = 2 wa) se scngue is etened oa decision felt eqaliaton 'DFE) 6 Becase DFE — oat ; i) technique is impractical at rates of Gb/s, Swartz has de- vised a circuit [10] for a vatiale gain latch to be used with the NLC (at Gb/s rates) as shown in Fig. 3 (for the case of V = 1). The output of the variable gain latch Q switehes between Vy and V, which are the appropriate threshold values corresponding to a previously detected zero or one. Q must have a large dynamic range, should not have any large transients, and must have low noise levels. These are severe constraints, and practical circuits may be unable o satisfy all of them at Gb/s data rates This approach also becomes more dificult co implement as more bits are used in the feedback path (larger N) be- cause then the latch must be able to switch between 2” walues. Apart from the problem of generating the analog feed- back signal, NEC, as shown in Figs. 1,2, and 3, has its speed limited to the reciprocal ofthe total propagation and processing delay around the feedback loop. Basically, the ‘decision on the mth bit has to propagate around the feed back loop and appear atthe input tothe comparator before the comparator begins processing the next symbol. Fig. 4 ‘depicts the timing requirement pictrially, The spoed lim: itation imposed by the feedback loop in NLC is severe because the propagation delay of just the threshold detec- {or itself ean be greater than one symbol petiod in high- speed detector [11]. [12]. The timing constraints imposed by the presence of feedback will also reduce the clock: jitter tolerance ofthe system. eae rpms EE /as IIL. Inruemexrarion Tecwsques A. Avoiding Analog Siitching A look at Fig. 2 reveals that at most 2° distinct thresh= old values may be fed back to the comparator. One way ‘of avoiding this threshold feedback is to use 2" distinc. comparators, with each using one of the 2% possible threshold values. Now the problem of feeding back the correct threshold (which depends on the previously de- ‘ected bits) is replaced by selecting the comparator which uses the correct threshold value. Fig. 5 shows a block diagram of such a system; note that we have assumed only one feedback tap (V = 1), and hence we need «wo com- parators. Fig. $ shows a 2:1 multiplexer being used 10 Select heween the two comparators. Yj and V, are the wo threshold values corresponding to the previous bit being a zero or a one. Fig. 6 incorporates 2 gate-level imple mentation of the 2:1 multiplexer. An integrated cireu implementation may combine the gates ofthe multiplexer With the gates implementing the latch inthe D flip-flop. ‘Although Figs. $ and 6 avoid analog feedback, they have the same time available (7) for processing and prop- gation around the feedback loop as the implementation in Fig. 3. However, the threshold detector and variable ign latch are no longer in the feedback loop. Instead. we have a multiplexer; because the propagation and process ing delay is usually less for a multiplexer than for the ‘components it has replaced, the system shown in Fig. 5 should be able to operate faster than the system in Fig. 3 ‘The next subsection shows how to increase the time per- ‘mitted for processing and propagation in the feedback Toop, B. Eliminating the Feedback Imposed Bortleneck Going back to Fig, 5, we see that the mth bit is given by Angas + Bal o Fie 6 Removing alg fests: gate evel where 4, and B, are outputs ofthe two comparators at the zh symbol period, and the overbar denotes logical com plement, Note that all the variables are Boolean, with val les 2e70 or one. If we write out the corresponding equa tion fora, ,. we have ayy = Ae yah ® Using (2) to substitute fora, in (1), We get fy = Ay Ay y + By + BAL dy = Ay + Bey) On—2 + BBy Vy» 3 filtay > + ln, @ ABs where Silt) = Aya + By Silt) = AyByy + ByB ys Equation (3) represents a, in terms ofthe 4°s, Bs, and a, whereas (1) represents din terms of dy. The mar jor significance of this difference is that an implementa- tion based on (3) will permit a total propagation and pro cessing delay of 27° in the feedback loop. wheress an ‘implementation based on (1) would permit only 7. Equa- tion (3) looks a lot more complicated than (1), and one ‘might assume that although it permits a larger delay. the increased complexity will ake up the increase in permit {ed delay, resulting in a circuit which may be as slow as ‘one implementing (1), However, ths isnot the ease be- cause f(n). and f(n) can be computed outside the feed back loop (lookahead computation) and then be provided as inputs to the Feeback loop. The feedback loop itself Fi. 9 A ede agra, ‘must just implement (4) see Eig. 7], which is quite sim: le and is identical to the feedback loop in Fig. 6. ‘The extta delay available inthe loop in Fig. 7 may be used in two ways. Fist, we ean use ito adda latch within the multiplexer to pipeline the multiplexer and increase its speed of operation. Second. it ean be used to reduce the clock rate within the loop to half the data rate. In the later approach, the reader must note that (3) represents a, in terms of «,_s, and therefore if we use this equation repeatedly, string with ay, we will compute ds. {y. To compute the missing terms, which are the dag «8 ‘We mus iterate the same equation, but this time we must do it starting with a. Hence. we must replicate the feed back loop twice. Such a system, implementing (4) si lustrated in Fig. 8. Fig. 9 shows the same system at the ate level with a few simple changes to aid implementa tion. In an integrated circuit implementation, the gates for the multiplexer may be partly inconporated in the gates ‘used to implement the D flip-flop which follows the mul- tiplexe. We have just illustrated a technique to increase the pet= tmissible processing and propagation delay’ in the feedback loop from Tto 27 without increasing the circuit comple ity ofthe feedback loop. We did this by “erating” (1) twsce. Similarly, by iterating (1) L times, we can increase the permissible delay to LT. and have Z loops operating inparallel. Parhi, Meng, and Messerschmit [3], [13] have used similar techniques. i.e, iterating a recursive rela tionship, o avoid feedback-imposed speed limitations for recursive and adaptive filters. However, their techniques ste limited (0 instances of linear recursion. The vector. zation of the joint process estimator caried out in [13] ‘hich my be ppl breaks down with the introduction of decision feedback ‘Our approach to the problem is capable of handling re- ceursion with decisions in the feedback loop, 1) Sharing Some Circuitry: In Figs. 8 and 9. the cit- ‘uit to precomputef; and fy has also been replicated. If this combinatorial circuitry is fast enough to operate atthe symbol rate T, we can share this circuitry between the feedback loops instead of replicating it. This concept is ilysteated in Fig. 10. If necessary. we can pipeline [14] ‘his combinatorial circuitry to speed it up. Pipelining (by adding flip-flops between serial processing elements) will, ‘add tothe propagation delay and will increase the overall, Tatency through the receiver but it will not result in a data rate limitation because this circuitry isnot inthe feedback path IV, Generatizarions. In the preceding section, we illustrated an L-fold in- «crease (with examples for '= 2) of the permissible delay inthe feedback loop for a NLC with one-tap feedback. For the more general case, with N feedback taps, 2" dis- tinct threshold values are possible. To avoid analog feed buck, we must replicate the comparator 2” times. If Ay, denotes the output of the ith comparator during the mth symbol period, a, is given by 0, Ain And yy an 8 To increase the permissible delay in the feedback loop to LET, we must iterate (5) L times. Each iteration involves selecting the a with the highest subscript from the right side of the equation, and then substituting for it in terms of previous bits using the relationship of (5). Afler iter ating L times, we can expand the right-hand side into a ‘sum-of products form and group the terms as shown be- Tow: 2 = font Menten + AOD 1ty 1-4 0 Btoner + 1 BoB Bactt *T-toner 6 1s where each f(m is @ function of Ayn. Aina °° 5 Ads tome) = 1, +" 2 The ft) can be precom- pated ouside the feedback loop. Basically the nth output bit is given by one ofthe f with the selection based on previous bits delayed by L symbol periods. Note that to ‘write the iterated (5) inthe form shown in (6), we fist expand the righ-hand side of the iterated equation into a sum-of-produets form. A product temm which contains ‘more than one occurrence of an a, oF is inverse is then simplified to contain only one occurence. This is always possible because the a are zero oF one. aja, is equal toa, And a, equals @, and any product terms which have 1 ate dropped because this is always zero. ek: parle oops. ‘We could alternatively rewrite (6) as the sum of a se: lection of terms from its right-hand side: 4, = fir) o ‘where (i ~ 1) is given by the Abit binary representation [rather than the logical expression as in (6)] PT ay tty tn ® ‘As in Section II, the L-fold increase in the permissible loop propagation and processing delay can be used either to pipeline the multiplexer, oF to reduce the clock rate in the feedback loop. With the second approach, the feed back loop itself must be replicated Ltimes, with each loop running at 1/Lih the symbol rate. Because the loops run at 1/L times the symbol rate and are n0 more complex than the original loop, our technique can achieve an Lfold increase inthe speed of operation ofa receiver with NLC. The precomputation ofthe f(n)s can be pipelined $0 the combinatorial circuitry is shared between the L feedback loops. Figs. 11-13 sketch the applications of our techniques to a system with two-bit feedback. Al- though we have assumed binary symbols inal ofthe pre- vious sections, asa further generalization, our techniques can be easily extended to Mary signaling \V. Sunistany ap Coxctuston In summary, by using multiple detectors with digital logic, we can eliminate the analog switch and propagation delay problems of nonlinear cancellation and decision feedback equalization. 2” detectors are required with N feedback bits. but the additional circuitry required for higher data rates is all-digital logic and thus easily ame- rable to integration, Thus, we can achieve higher speed at the expense of increased circuit complexity, Wit these techniques, NLC and DFE can be easily implemented on the detector chip to increase the maximum data rate of Gb/s communication systems and/or detectors Rerenences [HTB Bigies A. Geno, RD. Gili and TL, Lim, “Api am 121 101, Wee aad RD. Gin, “Elec signal processing tc toes in mg a erp nema EEE fant Comma. vt Sepp stash Sop oe, (31 SH Wines nd MO Sito, “agent geliaton ofp 161 i Winer Egatzation acter ihre sysems wing + tra icf Lanne eat a 801. 151 Rx Pan ad. G. Mesenchmit, Abit parle it eel {61 J.G. Broais Digit Commanicarions, IPL DL, Dole. J. Mao. . G. Menenstnit, An per ‘hun he er pay adi eck uals" TEEE Pans Ir Psy, 0h 1720. pp. 290-359 ely 8 18) R.A. Keanedy. BD. 0. Andon, snd RR, itmend, “Tight ‘wounds on the mor probabilities of decison feedback equalizer, (0) K.D. Fisher JM Cod, and C. Ne Mls, "An apie deison Feedhack equalizer fr tre chamel suflering fom sine ISL. Pre It Co Comma Boson MA, dae 1989. 9. 33.7.1- (10) KD Gi, RG. Smarz, end. H. Wines Ii) ATAT LOWHSAX Decision Cie 1.7 b/s, Pinay Data {121 Gigaic Lope 1060214. Data Soe. 113) T.Meng and DG. Messerschmi Abia high sling ae dap Rr EEE Tras Sigal Process Wl 38, pp ASS (0411 8 Japp and 5, Aw. “Etecve iain of itl STE, ete neti ok furmene a ATAY Bal Earn, Hold ok H. Waters $37-M°82-S488) wa {nd fom 1977 4 19H withthe BetoScence fray, Sie 1981, he hasbeen with ATAT Bell Laboratories, Holmdel ‘Mig kine ae cae gern Ha es 7 tapercondactorsm commusiaton systema, Currently. he 3 volved "Be Winer 2 mena of Si Xi He fected The Ohi She Un scrip EeaScene Labo» svat for ousting dseraton

You might also like