You are on page 1of 5

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO.

6, JUNE 2015 1145

Transactions Briefs
A CMOS PWM Transceiver Using Self-Referenced Edge Detection
Kiichi Niitsu, Yusuke Osawa, Naohiro Harigai, Daiki Hirabayashi, Osamu Kobayashi,
Takahiro J. Yamaguchi, and Haruo Kobayashi

Abstract— A CMOS pulsewidth modulation (PWM) transceiver circuit


that exploits the self-referenced edge detection technique is presented.
By comparing the rising edge that is self-delayed by about 0.5 T and the
modulated falling edge in one carrier clock cycle, area-efficient and high-
robustness (against timing fluctuations) edge detection enabling PWM
communication is achieved without requiring elaborate phase-locked
loops. Since the proposed self-referenced edge detection circuit has the Fig. 1. Conceptual diagram of the conventional PWM receiver in 2-bit PWM.
capability of timing error measurement while changing the length of self-
delay element, adaptive data-rate optimization and delay-line calibration
are realized. The measured results with a 65-nm CMOS prototype
demonstrate a 2-bit PWM communication, high data rate (3.2 Gb/s), and
high reliability (BER > 10−12 ) with small area occupation (540 µm2 ).
For reliability improvement, error check and correction associated with
intercycle edge detection is introduced and its effectiveness is verified by
1-bit PWM measurement.
Fig. 2. Conceptual diagram of the proposed PWM receiver in 2-bit PWM.
Index Terms— CMOS, jitter, pulsewidth modulation (PWM),
self-referenced, transceiver.
I. I NTRODUCTION
The requirement of decreased power supply voltage for CMOS
device scaling motivates us to develop time-domain circuits. To
develop time-domain circuits, the pulsewidth modulation (PWM)
scheme is a promising approach and widely used for many appli-
cations, such as wireline transceivers [1]–[3], CMOS imagers [4],
and biosensor array [5]. However, conventional PWM transceiver
design [1]–[3] requires large-area and power-hungry phase-locked
loops (PLLs) for multiphase sampling clock generation.
This brief describes the design of an area-efficient and highly
robust PWM transceiver using the newly proposed self-referenced
edge detection. The proposed PWM transceiver exploits a timing
comparison between the rising edge that is self-delayed by about
0.5T and the data-modulated falling edge in one carrier clock cycle, Fig. 3. Circuit implementation of the proposed PWM transceiver architecture
and this mechanism is introduced. Adaptive data-rate optimization in 2-bit PWM.
and error check and correction (ECC) technique are also introduced. Section II. Section III introduces adaptive data-rate optimization.
This brief is organized as follows. The proposed PWM trans- Section IV presents ECC. The test chip design and the measurement
ceiver using self-referenced edge detection is introduced in results are summarized in Sections V and VI. Discussion is described
Manuscript received October 15, 2013; revised February 26, 2014; accepted in Section VII. Section VIII concludes this brief.
April 8, 2014. Date of publication May 29, 2014; date of current version
May 20, 2015. This work was supported in part by Japan Science and II. PWM T RANSCEIVER U SING S ELF -R EFERENCED
Technology, in part by the Ministry of Economy, Trade and Industry, in part
by Starc and VLSI Design and Education Center, in part by the Grant-in-Aid
E DGE D ETECTION
for Young Scientist (B) under Grant 24760266, in part by the Grant-in-Aid Figs. 1 and 2 show a conceptual diagram of the conventional
for Scientific Research (S) under Grant 20226009 and Grant 25220906, and proposed receiver design for a 2-bit PWM. A conventional
and in part by the University of Tokyo in collaboration with Cadence Design
Systems, Inc.
PWM receiver [1] exploits a PLL for generating time-shifted sam-
K. Niitsu is with the Department of Electronic Engineering and Com- pling clocks. Our newly proposed PWM receiver using a self-
puter Science, Graduated School of Engineering, Nagoya University, Nagoya referenced edge detector employs latches as timing comparators and
464-8603, Japan (e-mail: niitsu@nuee.nagoya-u.ac.jp). a thermometer-binary decoder. The proposed self-referenced edge
Y. Osawa, N. Harigai, D. Hirabayashi, and H. Kobayashi are with the detection utilizes the self-delayed rising edge instead of the PLLs
Division of Electronics and Informatics, Faculty of Science and Technology,
Gunma University, Kiryu 376-8515, Japan (e-mail: t09306014@gunma-u. sampling clock. By comparing this self-delayed rising edge and the
ac.jp; t12801645@gunma-u.ac.jp; k_haruo@el.gunma-u.ac.jp). data-modulated falling edge in one carrier clock cycle, edge detection
O. Kobayashi is with STARC, Yokohama 222-0033, Japan. can be realized. The proposed structure removes the elaborate PLL;
T. J. Yamaguchi is with Advantest Laboratories Ltd., Sendai 989-3124, hence, high area efficiency is obtained.
Japan (e-mail: takahiro.yamaguchi@jp.advantest.com).
Color versions of one or more of the figures in this paper are available
Fig. 3 shows the schematic diagram of the proposed transceiver
online at http://ieeexplore.ieee.org. architecture. The transmitter consists of a duty controller, four delay
Digital Object Identifier 10.1109/TVLSI.2014.2321393 elements with length of multiples of modulation factor, T , selector
1063-8210 © 2014 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted,
but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
1146 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO. 6, JUNE 2015

Fig. 6. Schematic diagram of the proposed receiver with ECC capability for
Fig. 4. Conceptual diagram of the adaptive data-rate optimization and 1-bit PWM.
delay-line calibration associated with timing jitter measurement using self-
referenced edge detection.

Fig. 7. Schematic diagram of the proposed ECC circuit containing latches


and error detecting logic for 1-bit PWM.

Fig. 5. Conceptual image of the proposed ECC circuit.


edge detection. Fig. 5 shows the conceptual image of the proposed
ECC technique. The auxiliary receiver detects the intercycle time
controlled by the PWM modulator, and OR logic. The receiver difference of the adjacent falling edges, as shown in Fig. 5. Delay
consists of three delay elements with various lengths about a half of length design of the auxiliary receiver can be determined by the
the clock period, 0.5T , latches, and a thermometer-binary decoder. autocorrelation function of the jitter of the carrier clock [7], [8].
The outputs of the three latches are thermometer codes that are In typical situation, correlation decreases as the number of cycles
converted to binary code by the thermometer-binary decoder. between the preceding and succeeding edges increases. Thus, a larger
number of interreference cycles are expected to be effective for
III. A DAPTIVE DATA -R ATE O PTIMIZATION error detection. However, to be large number of interreference cycles
causes area penalty. Therefore, we have to consider the tradeoff
Fig. 4 shows a conceptual diagram of the adaptive data-rate
between ECC effectiveness and area overhead in designing the ECC
optimization and delay-line calibration. The receiver has the same
circuits.
configuration as the reference-clock-free timing jitter measurement
Fig. 6 shows the conceptual image of the ECC function in 1-bit
circuit using self-referenced clock with nT delay [6]. In [6], nT -delay
PWM that is enabled by comparing the received data from the main
generates
√ two uncorrelated edges. By comparing two uncorrelated receiver and that from the auxiliary receiver. Since the implemented
edges, 2-times timing jitter can be obtained. Thus, timing error
ECC uses only 1T -delayed edges for error correction, successive
measurement is realized by only changing the length of the delay.
errors cannot be recovered. As stated above, instead of using
Based on the obtained timing error information of the carrier clock,
1T -delay as in Fig. 6, 0.5T -delay (comparing with the latter rising
modulation factor, T , and the carrier clock frequency can be opti-
edges instead of the earlier rising edges) or a longer delay is possible.
mized. Since the timing error of the carrier clock primarily determines
Thus, the proposed technique is flexible and can be optimized under
the bit error rate, smaller jitter enables both higher modulation and
the characteristic of timing error.
a higher carrier clock frequency, which results in a higher data rate.
Fig. 7 shows the schematic diagram of the ECC circuit for 1-bit
Moreover, the delay lines for communication can be calibrated while
PWM. The ECC circuit consists of latches and an error detector.
processing the timing error measurement.
ECC utilizes the successive received data from the main receivers,
D1 and D2, and data from the auxiliary receivers, A+ and A−.
IV. E RROR C HECK AND C ORRECTION An error code is generated when a discrepancy between the received
This section introduces the ECC circuit for improving the per- data from the main and auxiliary receiver occurs, and the original
formance of the proposed PWM transceiver using self-referenced RX bit is inverted.
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO. 6, JUNE 2015 1147

Fig. 10. Measured waveform of the carrier clock, transmitted and received
data, and error code under nonsuccessive 1-bit error. ECC operation was
Fig. 8. Microphotograph of the test chips that are fabricated in 65-nm CMOS. successfully verified. Data rate is 3.2 Gb/s, and carrier clock frequency is
3.2 GHz.

Fig. 9. Measured waveform of the transmitted and received data with 2-bit Fig. 11. Measured waveform of the carrier clock, transmitted and received
PWM. A shifted PRBS was utilized, and RX data are inverted due to the test data, and error code under 2-bit successive error. Toggle pattern is utilized
chip design. Data rate is 3.2 Gb/s, and carrier clock frequency is 1.6 GHz. for verification. The proposed ECC cannot recover 2-bit successive error as
designed. Data rate is 3.2 Gb/s, and carrier clock frequency is 3.2 GHz.

V. T EST C HIP D ESIGN AND M EASUREMENT S ETUP Fig. 10 shows the measured waveform for verifying ECC operation.
To create a nonsuccessive 1-bit error situation, a carrier clock with
To verify the effectiveness of the proposed technique, a test chip large timing jitter of 1/16 probability was utilized. Large timing jitter
was fabricated using 65-nm CMOS technology, as shown in Fig. 8. was generated by changing the duty cycle. The ECC successfully
The footprints of the circuits are 20 μm × 27 μm (2-bit PWM, recovered the received data, and generated an error code with a
T = 15 ps, without ECC) and 20 μm × 39 μm (1-bit PWM, probability of 1/16.
T = 30 ps, with ECC). To confirm the multibit operation and Fig. 11 shows the measured waveform for demonstrating the
ECC feasibility separately, two types are implemented and tested. limitations of ECC operation. To create a 2-bit successive error
The common offset of the inverter-based delay line was shared situation, a carrier clock with a large timing jitter of 2/16 probability
by all delay lines for compact layout. An on-chip interconnect (two successive times every 16 cycles) was utilized. The ECC could
with a length of 3.2 μm was implemented as a communication not recover the received data as designed.
channel. The target application includes mobile application [3] and Performance summary is shown in Table I. Since the proposed
large-scale sensor array [4], [5], where communication channel is PWM transceiver does not require PLL nor delay locked loop (DLL)
short. circuits, the circuit area can be minimized. Even if considering
The input carrier clock and TX data were fed into the circuit technology scaling from 0.18-μm CMOS to 65-nm CMOS, the
by a BERTS (Agilent 81250, timing jitter is 1.6-ps RMS). Input proposed circuit is competitive from the viewpoint of area efficiency.
and output signals were captured with a sampling oscilloscope The BER measurement has been performed by carrier clock with
(Tektronix DSA71254B). lower jitter than these in other related works. However, the state-of-
the-art PLLs [9], [10] can generate subpicosecond jitter clock. Thus,
VI. M EASUREMENT R ESULT by associating with the low-jitter clock generator in TX, the proposed
Fig. 9 shows the measured waveform of the input and output signal. technique can be feasible without PLL or DLL in RX.
A carrier clock and shifted 2-bit pseudorandom bit sequence (PRBS)
VII. D ISCUSSION
were fed into the test chip. Data communication was successfully
verified with data rate of 3.2 Gb/s (=1.6 Gb/s/bit × 2 bit). The data A. Another Topology of the Proposed CMOS PWM Transceiver
rate is determined by the measurement setup. The maximum data Using Self-Referenced Edge Detection
rate of BERTS (Agilent 81250) is 3.2 Gb/s. High reliability with a This section introduces another topology of the proposed CMOS
BER < 10−12 was also verified with a BERTS. PWM transceiver using self-reference edge detection. Instead of
1148 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO. 6, JUNE 2015

TABLE I compatibility with the spread spectrum clocking technique by


P ERFORMANCE S UMMARY comparing with the conventional techniques.

C. Dynamic Range of the Proposed Transceiver


This section provides analysis on dynamic range of the proposed
CMOS PWM transceiver using self-referenced edge detection. In this
brief, we implemented 3.2-μm on-chip interconnect as a commu-
nication channel and employed 1.6-ps jitter carrier clock to check
the function of the proposed transceiver. By measuring the test
chip fabricated in 65-nm CMOS technology, the proposed technique
found to be feasible under 3.2-μm on-chip interconnect with 1.6-ps
carrier clock. However, more realistic communication channel and
carrier clock signal have to be considered for making the proposed
transceiver practical for commercial applications.
At first, dynamic range for length of communication channel is
analyzed. The dynamic range for jitter measurement is determined
by combination of T and T in Fig. 4. To measure the timing
jitter for guaranteeing BER of less than 10−12 , probability density
function in 3σ of the timing jitter has to be measured [6]. Thus,
2T (jitter measurement range is from −2T to 2T ) must be
designed to be greater than 3σ . The literature of the communication
channel model [13] indicates that the additional jitter is approximately
8 ps when considering 12-in interconnect. Thus, jitter is increased
from 1.6 to 9.6 ps by enlarging interconnect. Since BER of less
than 10−12 requires that T must be greater than approximately
1.5σ , T must be 14.4 ps. In the 2-bit PWM, maximum clock
frequency is determined by inverse of 8T . Therefore, the pro-
posed technique can maintain the data rate of 8.68 Gb/s for guar-
anteeing BER of less than 10−12 even when considering 12-in
interconnect.
Second, the dynamic range for data rate is analyzed. The proposed
technique exploits the fixed T , as shown in Fig. 3. The test
chip embedded T of 15 ps for 2-bit PWM. The maximum clock
frequency is 8.33 GHz, and the maximum data rate is 16.66 Gb/s.
In this brief, we have implemented the fixed T and T to check
the function of the proposed technique. However, if the redundancy
in T and T in Fig. 4 is implemented, the proposed technique can
Fig. 12. Conceptual image of another topology of the proposed CMOS PWM apply to various data rate and communication channel. For example,
transceiver using self-referenced edge detection. when implementing M kinds of the T , M kinds of the data rate can
be feasible.

using both rising and falling edges, another topology utilizes only
rising or falling edges. Since this topology can remove the limitation VIII. C ONCLUSION
of implementation, it expands the application of the proposed A CMOS PWM transceiver circuit using the self-referenced
technique. edge detection technique has been demonstrated for the first time.
Fig. 12 shows the conceptual image of another topology of the By comparing the self-delayed rising edge and modulated falling
proposed CMOS PWM transceiver using self-referenced edge detec- edge, edge detection was realized. This edge detection enables
tion. It is well known that the timing jitter can be expressed as the area-efficient and high-robustness for PWM communication without
accumulation of the period jitter [11]. By applying this characteristic exploiting PLLs. The data-rate optimization and ECC with inter-
to the PWM, PWM edges can be expressed by the accumulation of stage edge detection was introduced. Test chip was fabricated in
the period. For instance, when the previous bit is 0 and the period, 65-nm CMOS. The measured results have demonstrated 2-bit PWM
M−M+, is 00, the current data become 1, as shown in the bottom communication, high data rate (3.2 Gb/s), and high reliability
part of Fig. 12. (BER < 10−12 ) with small area occupation (540 μm2 ).

B. Compatibility With Spread Spectrum Clocking R EFERENCES


Since the proposed technique detects edge within the same clock [1] W.-H. Chen, G.-K. Dehang, J.-W. Chen, and S.-I. Liu, “A CMOS
cycle, it has good compatibility with the spread spectrum clocking 400-Mb/s serial link for AS-memory systems using a PWM
technique [12]. The conventional techniques with PLL or DLL scheme,” IEEE J. Solid-State Circuits, vol. 36, no. 10, pp. 1498–1505,
[1]–[3] have difficulty in recovering clock signal when the both Oct. 2001.
[2] W.-J. Choe, B.-J. Lee, J. Kim, D.-K. Jeong, and G. Kim, “A single-
rising and falling edges are modulated. On the other hand, the pair serial link for mobile displays with clock edge modulation
proposed technique can achieve edge detection because it utilizes scheme,” IEEE J. Solid-State Circuits, vol. 42, no. 9, pp. 2012–2020,
intracycle clock edges. Therefore, the proposed technique has better Sep. 2007.
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO. 6, JUNE 2015 1149

[3] C.-Y. Yang and Y. Lee, “A PWM and PAM signaling hybrid technology [9] A. Sai, Y. Kobayashi, S. Saigusa, O. Watanabe, and T. Itakura,
for serial-link transceivers,” IEEE Trans. Instrum. Meas., vol. 57, no. 5, “A digitally stabilized type-III PLL using ring VCO with 1.01 psrms
pp. 1058–1070, May 2008. integrated jitter in 65 nm CMOS,” in Proc. IEEE Int. Solid-State Circuits
[4] M. Takihi, K. Niitsu, and K. Nakazato, “Charge-conserved Conf., Feb. 2011, pp. 98–100.
analog-to-time converter for a large-scale CMOS bionsensor array,” [10] B. Shen, G. Unruh, M. Lugthart, C.-H. Lee, M. Chambers, and
in Proc. IEEE Int. Symp. Circuits and Syst., Jun. 2014. C. Parrella, “An 8.5 mW, 0.07 mm2 ADPLL in 28 nm CMOS with
[5] M.-T. Chung and C.-C. Hsieh, “A 0.5 V 4.95 μW 11.8 fps PWM CMOS sub-ps resolution TDC and < 230 fs RMS jitter,” in Proc. IEEE Symp.
imager with 82 dB dynamic range and 0.055 % fixed-pattern noise,” in VLSI Circuits, Jun. 2013, pp. 192–193.
Proc. IEEE ISSCC, Feb. 2012, pp. 114–116. [11] M. Ishida et al., “A programmable on-chip picosecond
[6] K. Niitsu, M. Sakurai, N. Harigai, T. J. Yamaguchi, and H. Kobayashi, jitter-measurement circuit without a reference-clock input,” in Proc.
“CMOS circuits to measure timing jitter using a self-referenced clock IEEE ISSCC, Feb. 2005, pp. 512–513.
and a cascaded time difference amplifier with duty-cycle compensation,” [12] D. De Caro, C. A. Romani, N. Petra, A. G. M. Strollo, and C. Parrella,
IEEE J. Solid-State Circuits, vol. 47, no. 11, pp. 2701–2710, Nov. 2012. “A 1.27 GHz, all-digital spread spectrum clock generator/synthesizer
[7] J. A. McNeill, “Jitter in ring oscillators,” IEEE J. Solid-State Circuits, in 65 nm CMOS,” IEEE J. Solid-State Circuits, vol. 45, no. 5,
vol. 32, no. 6, pp. 870–879, Jun. 1997. pp. 1048–1060, May 2010.
[8] K. Niitsu et al., “A clock jitter reduction circuit using gated phase [13] K.-J. Sham, M. R. Ahmadi, S. B. G. Talbot, and R. Harjani, “FEXT
blending between self-delayed clock edges,” in Proc. IEEE Symp. VLSI crosstalk cancellation for high-speed serial link design,” in Proc. IEEE
Circuits, Jun. 2012, pp. 142–143. Custom Integr. Circuits Conf., Sep. 2006, pp. 405–408.

You might also like