
2560 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 46, NO. 11, NOVEMBER 2011

250 Mbps–5 Gbps Wide-Range CDR With Digital Vernier Phase Shifting and Dual-Mode Control in 0.13 µm CMOS
Sang-Yoon Lee, Member, IEEE, Hyung-Rok Lee, Member, IEEE, Young-Ho Kwak,
Woo-Seok Choi, Byoung-Joo Yoo, Daeyun Shim, Chulwoo Kim, Senior Member, IEEE, and
Deog-Kyoon Jeong, Senior Member, IEEE

Abstract: A multi-port serial link with a wide-range CDR using digital vernier phase shifting and dual-mode control is presented. The proposed vernier phase shifter generates finely-spaced phase steps and provides unlimited phase rotation with a 13.34-ps phase step at 5 Gbps. By its inherently digital nature, the vernier phase shifter enables a semi-digital dual-loop CDR with precise tracking performance, and with the dual-mode control, the proposed CDR extends the operating range from 250 Mbps to 5 Gbps and achieves a BER of less than $10^{-[\ldots]}$ at 5 Gbps with a $2^{[\ldots]}-1$ PRBS. Fabricated in a 0.13-µm CMOS process, the main PLL and the single receiver dissipate 9.0 mW and 19.2 mW, respectively, at 5 Gbps from a 1.2-V supply.

Index Terms: Clock and data recovery, semi-digital dual-loop, vernier phase shifter, wide-range CDR.

Manuscript received February 18, 2011; revised May 28, 2011; accepted July 25, 2011. Date of publication September 01, 2011; date of current version October 26, 2011. This paper was approved by Guest Editor Muneo Fukaishi.
S.-Y. Lee, H.-R. Lee, W.-S. Choi, B.-J. Yoo, and D.-K. Jeong are with the Department of Electrical Engineering, Seoul National University, Gwanak-gu, Seoul, Korea (e-mail: dkjeong@snu.ac.kr).
Y.-H. Kwak and C. Kim are with the Department of Electronics and Computer Engineering, Korea University, Anam-dong, Sungbuk-gu, 136-713, Korea.
D. Shim is with Silicon Image, Sunnyvale, CA 94085 USA.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSSC.2011.2164032
0018-9200/$26.00 © 2011 IEEE

I. INTRODUCTION

SERIAL data transmission techniques are widely employed in chip-to-chip communications. As the system integration level increases with scaled-down CMOS technologies, communication among chips requires moving an ever-increasing amount of data. To meet the growing demand for huge data transmission, integrating several serial links on a single chip is required in addition to high-speed operation [1], [2]. However, since a large number of serial links must be integrated in a single chip, reducing the power consumption of a serial link should be considered. Where the power consumption of a serial link is dominated by dynamic switching, supply regulation techniques [3], [4] have been reported to minimize the overall power consumption. Besides supply regulation, since the power scales down as the operating frequency is decreased, the chip's power consumption can be substantially reduced at low speed when only a small amount of data transfer is required. To meet the requirements of both high- and low-speed operation, the wide-range CDR has become an essential part of serial link communication.

A phase-locked loop is a fascinating building block for clock and data recovery circuits in the asynchronous and the plesiochronous clocking architecture due to its properties of self-oscillation and frequency synthesis. However, since a PLL-based CDR might lock into harmonic oscillating frequencies, it requires additional techniques for initialization and locking into its correct operating frequency. Frequency initialization with the aid of a frequency tracking loop [5], [6] and of a replica VCO [7] has been proposed to maintain the correct operating frequency locked to the received data. The pull-in range of a PLL performing clock and data recovery should be reduced to prevent drifting into harmonic frequencies, and, hence, it limits the operating range of the PLL. Even though the low-pass loop characteristic of the PLL rejects the high-frequency jitter of the input data, the low loop bandwidth causes increased VCO-induced jitter accumulation over every clock cycle.

The dual-loop CDR architectures [8]–[12] are widely used for multi-port CDR applications. Dual-loop CDR architectures are composed of two cascaded loops: one is a core loop for clock generation, and the other is a peripheral loop for clock and data recovery. Conventionally, the delay-locked loop has been excluded from clock and data recovery applications because of its limited capture range, in spite of its advantages: a stable single-pole loop, no jitter accumulation, and design simplicity. Using the clock coming from the core delay-locked loop with a phase rotator and interpolators, the dual-loop architecture enables the peripheral digital delay-locked loop to gain an unlimited capture range in the clock and data recovery. However, since the quantized phase steps from the phase interpolator determine the cycle-to-cycle jitter in the peripheral loop, achieving finely-spaced phase steps is essential. Instead of enhancing the phase resolution in the interpolator, phase averaging [9] and delta-sigma dithering [10] have been proposed to reduce cycle-to-cycle jitter. To make a CDR have a fine resolution at high speed and a wide range, the narrow frequency range of the phase interpolator forces the dual-loop architectures to have multiple interpolating stages or a large number of multi-phase clocks, which increase power consumption and limit high-speed operation as well. Instead of using a phase interpolator to achieve fine phase resolution, Vernier Oversampling and Alignment (VOSA) is presented in [13]. But VOSA needs multiple DLLs and still requires additional circuits for a wide operating range.

In this paper, instead of using a phase interpolator or VOSA, the Vernier Phase Shifter (VPS) based dual-loop CDR is presented. The proposed VPS, which consists of a conventional DLL and simple control logic, provides unlimited phase rotation and finely-spaced phase steps. Taking advantage of the dual-mode control in the VPS, the operating range of the implemented CDR can be doubled without increasing the number of delay cells that limit high-speed operation. In addition, due to the inherent digital nature of the VPS, the semi-digital architecture minimizes the area of the loop filter and enhances the portability.

The rest of the paper is organized as follows. In Section II, the architecture of the proposed VPS-CDR is described. The detailed operation of the VPS and the dual-mode controls for wide-range operation is also presented. Section III describes the building blocks of the proposed CDR in depth. Experimental results are discussed in Section IV, followed by conclusions in Section V.

II. ARCHITECTURE

Fig. 1 shows the block diagram of the implemented multi-port transceiver architecture. The transmitters and receivers in each chip use the same reference clock to communicate with each other, which is known as mesochronous clocking [14]. With the clock source synchronized, each transceiver operates at exactly the same frequency even when the reference clock suffers from noise. However, the reference clock generally propagates through a clock synthesizer and clock tree, and the resulting offset delay between the clock source and each transceiver limits the timing margin to recover the data.

Fig. 1. Multi-port transceiver architecture with mesochronous clocking.

In the implemented multi-port architecture, each transceiver shares a single main PLL for clock synthesis to minimize the overall power consumption and die area, and each receiver tracks the phase offset between the clock source and the incoming data. The main PLL provides quadrature clocks TCK[0:3] to each transceiver at a quarter rate to mitigate the design complexity and to reduce power consumption. Quarter-rate clocks are desired especially when a large number of transceivers increase the output load of a clock synthesizer.

Fig. 2 illustrates the block diagram of the proposed clock and data recovery circuit. The proposed CDR consists of three major loops: two are analog loops that generate the clock, and the other is a digital loop that compensates for the offset delay between the clock and data paths for proper clock and data recovery. The analog loops are made up of a multi-phase PLL and the DLL in the Vernier Phase Shifter (VPS), and the digital loop is composed of a VPS, a multi-phase generator, samplers, a digital loop filter, and control logic. The VPS, which is controlled by a digital signal, generates clock phases whose minimum phase step is 1/60 TCK in the presented circuit. The clock phase rotates according to control signals to track an eye center. The multi-phase generator implemented with a PLL provides 8-phase clocks for the samplers to obtain edge information and to retime the data. The digital filter receives 4 UP/DN signals from the samplers and then decides whether the selected phase should be shifted or not. After deciding the direction of the phase shifting, the control logic generates MUX control signals. When the MUXs in the VPS switch their clock inputs, the control logic also generates a PD/PFD disable signal for the DLL/PLL, respectively, to prevent a clock glitch from propagating through the DLL/PLL.

Fig. 2. Block diagram of proposed clock and data recovery circuit.

A. Vernier Phase Shifter

A DLL can generate multi-phase clocks with a given reference clock frequency. However, to generate $N$ evenly-spaced phases, a Voltage-Controlled Delay Line (VCDL) needs $N$ delay cells, which will increase power consumption linearly. Employing a Phase Interpolator (PI) [8], the DLL can provide $N \cdot K$ phases by weighting two selected phases (here, $K$ is the resolution of a PI). However, non-idealities in the PI result in nonlinear phase steps, increasing output jitter in clock and data recovery circuits.

The vernier scale, which is used in measuring instruments, allows us to make measurements with higher precision than a uniformly-divided scale, by using two different fractional spaces (generally divided by two relatively prime integers) of a fixed common unit. The multi-phase generator based on the vernier delay line [15], [16] can generate high-resolution (i.e., finely-spaced) multi-phase clocks without using any phase interpolators. If $M$ evenly-spaced multi-phases and $N$ evenly-spaced multi-phases with the same clock frequency are blended, the resulting possible output phases can be expressed as

$$\Phi = \left\{ \left( \frac{a}{M} + \frac{b}{N} \right) T_{CK} \;\middle|\; a, b \in \mathbb{Z} \right\} = \left\{ \frac{aN + bM}{MN}\, T_{CK} \;\middle|\; a, b \in \mathbb{Z} \right\} \quad (1)$$

where $\mathbb{Z}$ represents the set of integers and $T_{CK}$ is the clock period. It is well known that $\{ aN + bM \mid a, b \in \mathbb{Z} \}$ and $\{ k \cdot \gcd(M, N) \mid k \in \mathbb{Z} \}$ are equivalent sets ($\gcd$ means the greatest common divisor). Thus,

Fig. 4. Possible phase sets of VPS in 1-cycle locking mode.
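
The phase set depicted in Fig. 4 can be checked numerically. The following is a minimal sketch, assuming that a TCK index `a` contributes a/4 of a cycle and an MCK index `b` contributes b/15 of a cycle; the function names `vernier_phases` and `min_step` are illustrative and do not come from the paper.

```python
from fractions import Fraction
from math import gcd


def vernier_phases(m: int, n: int) -> list:
    """Distinct phases (fractions of one clock period) reachable by adding
    an m-phase offset a/m and an n-phase offset b/n, modulo one period."""
    return sorted({(Fraction(a, m) + Fraction(b, n)) % 1
                   for a in range(m) for b in range(n)})


def min_step(m: int, n: int) -> Fraction:
    """Smallest phase step predicted by the gcd argument: gcd(m, n) / (m * n)."""
    return Fraction(gcd(m, n), m * n)


phases = vernier_phases(4, 15)          # 4 TCK phases blended with 15 MCK phases
steps = {q - p for p, q in zip(phases, phases[1:])}
print(len(phases), steps, min_step(4, 15))

# Quarter-rate clocking at 5 Gbps: one TCK period is 4 UI = 800 ps.
tck_period_ps = 4 * 1e12 / 5e9
print(round(float(min_step(4, 15)) * tck_period_ps, 2), "ps per step")
```

Under these assumptions the script reports 60 distinct phases with a single uniform spacing of 1/60 of a TCK period, which is about 13.33 ps at 5 Gbps.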

Fig. 3. Operation of vernier phase shifter during 1-cycle locking mode: (a) phase leading and (b) phase lagging.
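
The MUX stepping that Fig. 3 illustrates can be sketched in a few lines. This is a behavioral model only, assuming the output phase equals a/4 + b/15 of a TCK cycle for the selected pair (TCK index `a`, MCK index `b`); `phase` and `shift` are hypothetical helper names, not the authors' control logic.

```python
from fractions import Fraction

M, N = 4, 15  # TCK phases from the main PLL, MCK taps from the DLL


def phase(a: int, b: int) -> Fraction:
    """Output phase in TCK cycles (assumed model: DLL locked to one full cycle)."""
    return (Fraction(a, M) + Fraction(b, N)) % 1


def shift(a: int, b: int, lead: bool) -> tuple:
    """One minimum step.

    Leading: next TCK and the 4th previous MCK.
    Lagging: previous TCK and the 4th next MCK.
    """
    if lead:
        return (a + 1) % M, (b - 4) % N
    return (a - 1) % M, (b + 4) % N


# Rotate through a full cycle by leading 60 times.
state = (0, 0)
visited = []
for _ in range(60):
    visited.append(state)
    state = shift(*state, lead=True)

print(state, len(set(visited)))
```

Each leading step moves the phase by 1/4 - 4/15 = -1/60 of a cycle, and because both indices simply wrap around, the rotation has no end stop: 60 steps visit every MUX state once and return to the starting state.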
the minimum unit, or phase step, that the vernier delay line can generate is

$$\Delta\phi_{\min} = \frac{\gcd(M, N)}{MN}\, T_{CK}. \quad (2)$$

Utilizing a vernier multi-phase generator and simple control logic, the proposed VPS [17] advances or delays the phase of an input clock by a finer phase step and provides unlimited phase rotation. The minimum step of phase shifting is obtained when the next two selected multi-phase clocks, TCK[a + Δa] and MCK[b + Δb], satisfy

$$N\,\Delta a + M\,\Delta b = \pm \gcd(M, N). \quad (3)$$

To exemplify how it works, the operation of the proposed VPS is shown in Fig. 3. It is composed of only a conventional DLL and two clock MUXs for the precise phase generation and rotation. The MUX4 selects one of 4 multi-phase clocks TCK[0:3] coming from the main PLL ($M = 4$). In order to get a 1/15 UI phase step, since the main PLL provides four multi-phase clocks, 15 multi-phase clocks are supplied from the DLL output MCK[0:14] ($N = 15$). Then, since 4 and 15 are relatively prime, the VPS can synthesize 60 evenly-spaced phases. Fig. 4 shows the 60 possible phase sets of the VPS.

To advance the phase by the minimum phase step, the condition (3) should be satisfied. In our design, $M = 4$ and $N = 15$, and, therefore, $\Delta a = 1$ and $\Delta b = -4$. Thus,

$$15 \cdot 1 + 4 \cdot (-4) = -1. \quad (4)$$

In fact, $\Delta a = 1$ and $\Delta b = -4$ may seem to be just one kind of the solutions to the equation $15\,\Delta a + 4\,\Delta b = -1$, but, with the indices taken modulo 4 and 15, they are the general solution of the equation [18]: i.e., there is no other way to shift the phase by the minimum step other than that described in this paper. In order to implement this, the MUX4 selects the next TCK clock and the MUX15 selects the previous fourth MCK clock as shown in Fig. 3(a). This sequential operation shifts the VPS output clock 1/60 of a cycle ahead of its previous phase. On the other hand, to delay the VPS phase by the minimum phase step, (3) can be satisfied with the condition $\Delta a = -1$ and $\Delta b = 4$, and, thus,

$$15 \cdot (-1) + 4 \cdot 4 = 1. \quad (5)$$

In contrast to the previous case, the MUX4 selects the previous TCK clock and the MUX15 selects the next 4th MCK clock as shown in Fig. 3(b). For example, if the current MUX state is located at (TCK[a], MCK[b]), then the next position for phase leading will be (TCK[a + 1], MCK[b − 4]) and for phase lagging (TCK[a − 1], MCK[b + 4]), with the indices wrapping modulo 4 and 15. Both phases are exactly 1/60 TCK apart from the initial phase. In addition, similar to the interpolating DLL, the VPS-DLL allows unlimited phase rotation by selecting the DLL output without delaying the input clock.

B. Wide-Range Operation

In a delay-locked loop, since it is hard to extract the frequency information from a reference clock, harmonic and false locking limit the operating frequency range of the DLL even though the delay range of the VCDL is wide. Several previous works [19]–[22] have been proposed to overcome this limited locking range of a DLL. A replica delay line [19] and a digital DLL [20], [21] can extend the locking range to the entire delay cell range. However, the delay range of a VCDL limits the operating range, and a large number of delay cells is required to extend the range. The wide-range DLL suffers from a harmonic locking problem, and the large number of delay cells limits high-speed operation of the DLL. Thus, it is not easy to make a fast DLL and achieve a wide range simultaneously.

Instead of increasing the number of delay cells, the proposed VPS-DLL introduces two locking modes: a high-speed mode and a low-speed mode. In the high-speed mode, the DLL locks into one cycle of TCK just as in a typical DLL operation. Let us assume that the minimum and maximum delays of a single delay cell are $t_{d,\min}$ and $t_{d,\max}$, respectively. The VPS-DLL has the operating period range between $15\,t_{d,\min}$ and $15\,t_{d,\max}$

Fig. 5. Possible phase sets of the VPS in 0.5-cycle locking mode.
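
The grouping of phases shown in Fig. 5 can be reproduced with a short enumeration. The sketch below assumes the 15 MCK taps evenly divide whatever fraction of a TCK cycle the DLL locks to; `phase_sets` and `lock_fraction` are names introduced here for illustration only.

```python
from fractions import Fraction


def phase_sets(lock_fraction: Fraction, m: int = 4, n: int = 15) -> dict:
    """Map each TCK index to the phases (in TCK cycles) reached by its n MCK
    taps when the delay line spans `lock_fraction` of a TCK cycle."""
    tap = lock_fraction / n  # delay between adjacent MCK taps
    return {a: {(Fraction(a, m) + b * tap) % 1 for b in range(n)}
            for a in range(m)}


half = phase_sets(Fraction(1, 2))        # low-speed (0.5-cycle) mode
group_02 = sorted(half[0] | half[2])     # phases from TCK[0] and TCK[2]
group_13 = sorted(half[1] | half[3])     # phases from TCK[1] and TCK[3]
all_half = set().union(*half.values())

quarter = phase_sets(Fraction(1, 4))     # 0.25-cycle mode, same idea
all_quarter = set().union(*quarter.values())

print(len(group_02), len(group_13), len(all_half), len(all_quarter))
```

In this model, TCK[0] and TCK[2] together give 30 phases spaced 1/30 of a cycle apart, TCK[1] and TCK[3] give the other 30 offset by 1/60, and the union is the same 60-phase grid as in the 1-cycle mode. Passing a lock fraction of 1/4 gives 60 distinct phases as well.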


Fig. 6. Operation of VPS during 0.5-cycle locking mode: (a) phase leading and
(b) phase lagging.
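
The low-speed stepping in Fig. 6 can be modeled in the same behavioral style. This sketch assumes the output phase is a/4 + b/30 of a TCK cycle and realizes the overflow-based MUX4 control as a correction of two TCK phases whenever the MUX15 index wraps; that wrap rule is one possible reading of the description, not the authors' circuit, and `phase_half` and `shift_half` are hypothetical names.

```python
from fractions import Fraction

M, N = 4, 15  # TCK phases and MCK taps


def phase_half(a: int, b: int) -> Fraction:
    """Output phase in TCK cycles when the 15 taps span half a cycle (assumed)."""
    return (Fraction(a, M) + Fraction(b, 2 * N)) % 1


def shift_half(a: int, b: int, lead: bool) -> tuple:
    """One minimum step in the 0.5-cycle mode.

    MUX15 moves by 8 taps and MUX4 by one TCK phase. When the MUX15 index
    wraps (15 taps = half a cycle = 2 TCK phases), MUX4 is corrected by 2.
    """
    direction = 1 if lead else -1
    a_next = a + direction
    b_next = b - 8 * direction
    if b_next < 0:        # MUX15 underflow
        b_next += N
        a_next -= 2
    elif b_next >= N:     # MUX15 overflow
        b_next -= N
        a_next += 2
    return a_next % M, b_next


state = (0, 0)
trace = []
for _ in range(60):
    trace.append(state)
    state = shift_half(*state, lead=True)

print(trace[:6], state)
```

With this rule the MUX4 index mostly toggles between two adjacent TCK phases while the MUX15 index walks in steps of 8, and each step still changes the output phase by exactly 1/60 of a cycle.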
because the presented VPS consists of 15 delay cells to generate 15 multi-phases. Since the VCDL of this VPS still has a wide operating range, the DLL needs a harmonic lock detector to prevent itself from locking into multiple harmonic delays.

In the low-speed mode, since the TCK period still remains at 4 UI for the incoming data, if the DLL locks to a half cycle of TCK, then the span between the multi-phase clocks MCK[0:14] becomes half of that of the high-speed mode ($T_{CK}/30$). In that case, with $M = 4$ TCK phases and $N = 15$ MCK phases, the possible output phases and the minimum phase step of the VPS-DLL can be expressed in our design as

$$\Phi = \left\{ \left( \frac{a}{M} + \frac{b}{2N} \right) T_{CK} \;\middle|\; a, b \in \mathbb{Z} \right\} = \left\{ \frac{2N a + M b}{2MN}\, T_{CK} \;\middle|\; a, b \in \mathbb{Z} \right\} \quad (6)$$

and

$$\Delta\phi_{\min} = \frac{\gcd(2N, M)}{2MN}\, T_{CK}. \quad (7)$$

In our case, since $M$ is an even number, $\gcd(2N, M)$ becomes 2, and 2 is factored out in (7); thus we still obtain $MN$ (60) evenly-spaced phases in 4 UI, just like in the high-speed mode, and the minimum phase shift amount remains at 1/60 TCK. Fig. 5 shows 60 possible phase sets in the low-speed mode. With TCK[0] and TCK[2], we can get 30 evenly-spaced phases. And with TCK[1] and TCK[3], another unique 30 evenly-spaced phases are generated. Utilizing half-cycle locking, the operating delay range of the VPS can be extended to $30\,t_{d,\max}$ even with the same VCDL.

To get the minimum phase shifting in the low-speed mode, clock selection should satisfy the condition below:

$$2N\,\Delta a + M\,\Delta b = \pm \gcd(2N, M). \quad (8)$$

The way of controlling the MUX inputs for minimum phase-step shifting is similar to that of the high-speed mode. Following the similar analysis made before, $\Delta a = \pm 1$ and $\Delta b = \mp 8$. Therefore, $30\,\Delta a + 4\,\Delta b = \mp 2$. Thus,

$$\Delta\phi = \mp \frac{2}{120}\, T_{CK} = \mp \frac{1}{60}\, T_{CK}. \quad (9)$$

In other words, to shift the minimum phase step ahead or behind, the selection of the MUX4 should alternate between two TCK clocks (the selection alternated between four TCK clocks in the high-speed mode), and the MUX15 should select the previous or the next 8th MCK clock. In this mode, however, the multi-phases of each TCK[a] do not cover the whole 4 UI but only 2 UI. The other half is covered by the multi-phases of TCK[a + 2], which can be easily understood from Fig. 5. Hence, according to the location of the rising edge we want, two out of four TCK clocks are selected, and the selection of the MUX4 alternates between the two for each phase shift. The control signal can be easily generated by detecting the overflow of the MUX15 control state. The shifting operation of the VPS in this low-speed mode is graphically shown in Fig. 6.

Since we have four multi-phase clocks, we can further extend the idea to the case where the DLL locks into a quarter cycle of TCK, although this extended low-speed mode was not implemented in the prototype chip. In this case, the span between the multi-phase clocks MCK[0:14] becomes a quarter of that of the high-speed mode. Each TCK clock generates 15 evenly-spaced phases in a quarter cycle of TCK and, considering all combinations of TCK[0:3], the VPS still obtains 60 evenly-spaced phases in a TCK cycle. This locking mode extends the operating delay range to $60\,t_{d,\max}$ with the same VCDL. Fig. 7 shows 60 possible phase sets in this extended low-speed mode. To lead or lag the phase in this extended low-speed mode, MUX15 selects the previous or next MCK clock and MUX4 selects the previous or next TCK on overflow or underflow of the MUX15 control signal, as shown in Fig. 8.

To support dual-mode operation, the lock range of the harmonic lock detector should be different in each mode. Fig. 9 shows the detection range of the harmonic lock detector. To simplify the design, the VCDL is forced to have an initial condition with minimum delay by resetting the loop filter, and afterwards the harmonic lock detector compares the multiphase output clock with the reference clock. The range of the phase 'UP' and 'DN' signals is derived from the dynamic PD detection range. The ranges of the 'FUP' and 'FDN' signals should partially overlap with the range of the PD signals to guarantee design margins. Since 'FUP' and

Fig. 9. Harmonic lock detection range: (a) 1-cycle locking mode and (b) 0.5-
cycle locking mode.
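
The detection ranges of Fig. 9 can be summarized as a simple decision rule. The sketch below is purely behavioral: it takes the VCDL delay as a number, whereas the text describes a detector that compares clock edges, and it ignores the overlap with the PD range. The thresholds 0.68, 1.25 and 0.83 are those quoted in the text; the function name and return strings are illustrative.

```python
def harmonic_lock_action(delay_in_tck_cycles: float, high_speed: bool) -> str:
    """Return which mechanism steers the DLL for a given total VCDL delay.

    'force_lag'  : detector overrides the PD and lengthens the delay
    'force_lead' : detector overrides the PD and shortens the delay
    'use_pd'     : the normal phase detector is left in control
    """
    if high_speed:
        # 1-cycle locking: keep the delay inside (0.68, 1.25) TCK cycles.
        if delay_in_tck_cycles < 0.68:
            return "force_lag"
        if delay_in_tck_cycles > 1.25:
            return "force_lead"
        return "use_pd"
    # 0.5-cycle locking: only an upper bound is needed.
    if delay_in_tck_cycles > 0.83:
        return "force_lead"
    return "use_pd"


for d in (0.4, 1.0, 1.6):
    print(d, harmonic_lock_action(d, high_speed=True),
          harmonic_lock_action(d, high_speed=False))
```

The asymmetry between the two modes reflects the statement that the low-speed mode, starting from minimum delay and locking to half a cycle, needs no lower bound.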

Fig. 7. Possible phase sets of VPS in 0.25-cycle locking mode.

Fig. 10. Simplified -domain model of VPS-CDR loop.

'FDN' signals are generated by comparing the input clock of the DLL with the multiphase clocks, the values of the range boundaries are derived from the multiphase clocks. During the high-speed mode, the harmonic lock detector forces phase lagging below 0.68 cycle of TCK by comparing the falling edge of the input clock and [...] (since [...]) and forces phase leading above 1.25 cycles of TCK. Unlike a typical DLL operation, the low-speed mode does not require any false lock detection since the DLL locks into half a TCK cycle, and the harmonic lock detector forces phase leading over 0.83 cycle of TCK.

Fig. 8. Operation of VPS during 0.25-cycle mode: (a) phase leading and (b) phase lagging.

C. Clock and Data Recovery Loop

Even though the loop of the VPS-CDR should be considered a triple loop due to the multi-phase PLL [3], if we assume the loop of the PLL is a low-pass filter and its bandwidth is sufficiently higher than that of the clock and data recovery loop, the dynamics of the VPS-CDR loops are similar to those of the dual-loop architecture. Fig. 10 shows the simplified [...]-domain model of the VPS-CDR. Before discussing the loop dynamics, let us assume that each delay-locked loop has one pole, [...] for the analog loop and [...] for the digital loop, respectively, and the transfer function of the multi-phase PLL is [...]. The overall CDR loop consists of two cascaded delay-locked loops: an analog DLL for clock generation and a digital DLL for clock and data recovery. However, the two delay-locked loops should be modeled in two different manners [23]. In the VPS-DLL, the input reference clock is compared with a delayed version of itself, which is known as a type-I DLL. Accounting for the correlation between the phases of the two input clocks, the transfer function of the phase of the VPS-DLL output [...] with respect to that of the input clock [...] can be expressed as

[...]   (10)

where [...] is the period of the input clock.

This function behaves as an all-pass filter with slight high-frequency boosting. But, in the digital clock and data recovery loop, the phase of the VPS-DLL output clock [...] is compared with that of the received data, which is uncorrelated with [...]; this is known as a type-II DLL. The transfer function from [...] to the selected phase of the VPS output, [...], exhibits a low-pass characteristic and can be expressed as

[...]   (11)

Considering (10) and (11) in the plesiochronous CDR system, since the clock jitter [...] is uncorrelated with that of the data [...], and assuming the dominant pole of the PLL is much higher than that of the digital loop, the transfer function from [...] to the phase error between the sampling clock and data, [...], and

Fig. 11. Block diagram of VPS-DLL.

from [...] to [...] exhibits a high-pass characteristic and can be expressed as

[...]   (12)

[...]   (13)

These transfer functions show that a wide bandwidth of the digital loop will track the jitter of the reference clock and data better. However, loop stability should be considered when the pole of the digital loop is close to that of the PLL.

In a source-synchronous or a mesochronous system, considering the reference-clock-induced jitter of the incoming data, such correlated jitter between the reference clock and incoming data is cancelled out at the phase detector in the digital loop due to the all-pass characteristic of the analog loop, and the digital loop will compensate for the static phase offset between the data and clock paths. In addition, such a configuration allows a low-bandwidth digital loop, which mitigates the trade-off between the digital loop bandwidth and the loop stability.

III. BUILDING BLOCKS

A. Vernier Phase Shifting DLL & Multiphase PLL

Fig. 11 shows the block diagram of the vernier phase shifting DLL. In order to lock into a different cycle of TCK in each mode, two MUX4s are placed separately in front of the phase detector and the VCDL, with different control signals. To achieve accurate phase locking, a dynamic phase detector [19] is employed. During phase shifting, the output signals UP and DN of the PD should be disabled, since the switched input clock propagates through the VCDL for one cycle in the high-speed mode and half a cycle in the low-speed mode. To achieve a wide delay range and evenly-spaced multi-phase generation, the delay line employs a differential delay cell with a latched load, since phase unevenness in a chain of delay cells causes large cycle-to-cycle jitter due to the nonlinear phase steps. To keep the harmonic lock detector simple, a frequency lock detector in the main PLL resets the loop filter of the DLL, making the initial VCDL delay the shortest.

Since the PLL can track the frequency of the reference clock, even though the operating range of the VCDL in the DLL is limited due to the harmonic locking problem, the PLL can extend the operating range to the VCO range. The implemented PLL achieves the 20× operating range with the supply-regulated VCO [24]. Since the VPS generates abrupt phase jumps during phase shifting, the multi-phase PLL filters out the phase jumps to reduce the recovered clock jitter as in [9].

B. Quarter-Rate Phase Detector

The quarter-rate phase detector is composed of 8 samplers which generate binary phase information as shown in Fig. 12. A binary PD is suitable for a digital data recovery loop due to its simple implementation, high-speed operation, and inherent data recovery. Odd-order samplers capture the phase information at the edge of the incoming data, and even-order samplers recover the data at the eye center. In order to decide whether the sampling clock is early or late, the phase detector makes a decision with the XOR result of two adjacent samplers. The 4 generated UP/DN signals, which give complete phase information over 4 UI, are delivered to the digital filter.

C. Digital Loop Filter & Control Logic

Based on the sampler outputs and its previous data, the digital low-pass filter decides whether the clock and data recovery loop needs to shift the phase, and it generates the final decision signal

Fig. 14. Block diagram of control logic.

Fig. 12. Quarter-rate phase detector.
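
The early/late decision made by the detector in Fig. 12 follows the usual binary (bang-bang) scheme of XOR-ing adjacent samples. The sketch below is a generic illustration under stated assumptions: samples alternate between eye-center and edge positions in time order, a ninth sample (the first data sample of the next 4-UI window) closes the last pair, and the labels "early", "late" and "hold" stand in for the UP/DN signals, whose polarity the text does not specify.

```python
def quarter_rate_pd(samples: list) -> list:
    """Early/late decisions for one 4-UI window.

    `samples` holds 9 bits in time order: d0, e0, d1, e1, d2, e2, d3, e3, d4,
    where d* are eye-center samples and e* are edge samples.
    """
    if len(samples) != 9:
        raise ValueError("expected 9 samples: 4 data/edge pairs plus the next data bit")
    decisions = []
    for k in range(4):
        d_prev, edge, d_next = samples[2 * k: 2 * k + 3]
        late = d_prev ^ edge    # edge sample already shows the new bit
        early = edge ^ d_next   # edge sample still shows the old bit
        if early and not late:
            decisions.append("early")
        elif late and not early:
            decisions.append("late")
        else:
            decisions.append("hold")  # no transition, so no phase information
    return decisions


print(quarter_rate_pd([0, 0, 1, 1, 0, 0, 1, 1, 0]))
print(quarter_rate_pd([0, 1, 1, 0, 0, 1, 1, 0, 0]))
print(quarter_rate_pd([1, 1, 1, 1, 1, 1, 1, 1, 1]))
```

A data pattern with a transition in every UI yields four decisions per window, while a run of identical bits yields none, which is why the downstream filter has to accumulate decisions over time.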

Fig. 13. Block diagram of digital filter.
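
The structure named in Fig. 13, a first-order IIR filter followed by a quantizer, can be sketched as a leaky accumulator with a threshold. Everything numeric here is an assumption made for illustration: the leak factor, the threshold, the reset after a decision, and the mapping of early/late votes to "lag"/"lead" outputs are not taken from the paper.

```python
class DigitalLoopFilter:
    """Leaky accumulator (first-order IIR) followed by a three-level quantizer."""

    def __init__(self, leak: float = 0.875, threshold: float = 4.0):
        self.leak = leak            # hypothetical IIR pole
        self.threshold = threshold  # hypothetical quantizer threshold
        self.acc = 0.0

    def update(self, decisions: list) -> str:
        """Consume the 4 early/late/hold decisions of one window."""
        vote = sum(+1 if d == "early" else -1 if d == "late" else 0
                   for d in decisions)
        self.acc = self.leak * self.acc + vote
        if self.acc >= self.threshold:
            self.acc = 0.0
            return "lag"    # clock is early, so delay it by one step
        if self.acc <= -self.threshold:
            self.acc = 0.0
            return "lead"   # clock is late, so advance it by one step
        return "hold"


f = DigitalLoopFilter()
print(f.update(["early", "hold", "hold", "hold"]))
print(f.update(["early", "early", "early", "early"]))
print(f.update(["late", "late", "late", "late"]))
```

The quantizer output would drive the MUX control logic by one minimum phase step at a time, so the threshold and leak together set how readily the loop reacts to isolated decisions.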

for the control logic. Since the digital loop can be considered a delay-locked loop, the loop filter can be implemented with a simple accumulator. Fig. 13 shows the block diagram of the digital filter. To simplify the design and minimize die area, the digital filter consists of a first-order infinite impulse response filter and a quantizer.

Fig. 14 shows the block diagram of the control logic and the control signals going into the two MUXs and the DLL/PLL when UP/DN phase shifting is activated. The control logic consists of a PD controller, a 4-bit shift register for controlling MUX4, and a 4-bit counter for controlling MUX15. Since the clock traveling time inside the DLL is one clock cycle for the high-speed mode and a half cycle for the low-speed mode, if we do not block the PFD of the PLL during phase transitions, then the PLL will try to track the incoming clock glitch. To make sure that the CDR does not suffer from large jitter due to clock glitches, the control logic provides a UP/DN disable signal during MUX switching. When the UP/DN disable signal is applied, the UP/DN signal is blocked, making the VCO temporarily run freely while maintaining its current phase location.

Fig. 15 shows the block diagram of the MUX controller and the output table according to the control bits. A 4-bit shift register is employed for rotating the MUX4 control signal S1 as shown in Fig. 15(a). According to the UP or DN signal, the shift register shifts the bit sequence in the forward or backward direction. During the low-speed mode, the rotating signal is activated by the overflow (OF) or underflow (UF) signal of the ADDER in the MUX15 controller in Fig. 15(b). To select [...] or [...] during this mode, the shifted bit sequences are selected from the MUX15 signal.

Fig. 15. Block diagram of (a) MUX4 controller and (b) MUX15 controller.

For design simplicity, since the MUX15 switching signal step is feasible with binary number operation, a 4-bit counter and

Fig. 16. Chip microphotograph.

shifting output bit is suitable for MUX15 control as shown in Fig. 15(b). MUX15 control for the minimum phase shifting can be generalized as

[...]   (14)

For example, during the high-speed mode, the bit sequence of the 4-bit counter output bus is [...] for adding or subtracting 4 from the previous data. In addition, during the low-speed mode, the bit sequence is just changed to [...] for adding or subtracting 8 from the previous data with the same 4-bit counter. For fast logic operation, the MUX15 control signal is selected from two anticipated values computed by two 4-bit adders, similar to a carry-select adder. Each adder generates overflow and underflow signals for the MUX4 control.

IV. EXPERIMENTAL RESULTS

The proposed multi-port transceiver was fabricated in a 0.13-µm CMOS process technology. The prototype chip contains a main PLL and two transmitter and receiver pairs. Fig. 16 shows the microphotograph of the implemented chip. The transceiver, which includes a main PLL and two transceivers, occupies 0.75 × 0.97 mm² and the single receiver core occupies 0.75 × 0.27 mm².

Fig. 17 shows the jitter histogram of the recovered clock for the 5-Gbps data. The VPS-CDR was operated with a 1.2-V supply at room temperature. To evaluate the clock jitter of the CDR, the digital loop of the CDR is held fixed without tracking the incoming data, as shown in Fig. 17(a). The recovered clock shows a single peak whose root-mean-square jitter is 2.85 ps and peak-to-peak jitter is 21.09 ps. When the CDR is tracking the data, the recovered clock alternates between adjacent phases around the data center, separated by the phase step of the VPS. To measure the phase step of the VPS, the CDR tracks a sync-pattern input as shown in Fig. 17(b). The jitter histogram shows twin peaks whose span is around 13.34 ps, which corresponds to 1/15 UI of the 5-Gbps data (200 ps / 15 ≈ 13.33 ps), and the recovered clock jitter is 6.89 ps RMS and 51.67 ps peak-to-peak. Fig. 17(c) shows the jitter histogram of the 5-Gbps input with a $2^{[\ldots]}-1$ PRBS. The density of the twin peaks changes depending on which one of them is closer to the center of the input data. The measured RMS jitter is 5.83 ps and the peak-to-peak jitter is 52.22 ps. The prototype CDR circuit achieves a bit error rate (BER) of less than $10^{-[\ldots]}$ with the $2^{[\ldots]}-1$ PRBS input at 5 Gbps. Fig. 18 shows the jitter tolerance curve for 5-Gbps data. Sinusoidal jitter with various amplitudes and frequencies is added to the 5-Gbps data with the $2^{[\ldots]}-1$ PRBS pattern, and the shared reference clock also has the same jitter amplitude and frequency conditions.

Fig. 17. Jitter histogram of recovered clock for 5 Gbps data: (a) fixed mode, (b) sync pattern input, and (c) PRBS $2^{[\ldots]}-1$ input.

The main PLL including the clock driver consumes 9.0 mW and the single receiver dissipates 19.2 mW while all blocks are running at 5 Gbps from a 1.2-V supply. At 250 Mbps, the main PLL including the clock driver and the single receiver dissipate

Fig. 18. Jitter tolerance at 5 Gbps with PRBS 2 1.

TABLE I
PERFORMANCE SUMMARY

Fig. 19. Power consumption: (a) high-speed mode and (b) low-speed mode.
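
The power figures reported in the text for the two ends of the operating range can also be read as energy per bit. The short calculation below uses only the quoted values (9.0 mW and 19.2 mW at 5 Gbps, 2.72 mW and 2.11 mW at 250 Mbps); the helper name is illustrative, and the main-PLL figure is per chip rather than per receiver, so it would be amortized over all links sharing it.

```python
def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ: mW divided by Gbps equals pJ/bit."""
    return power_mw / rate_gbps


reported = {
    # (block, data rate in Gbps): power in mW, as quoted in the text
    ("main PLL", 5.0): 9.0,
    ("receiver", 5.0): 19.2,
    ("main PLL", 0.25): 2.72,
    ("receiver", 0.25): 2.11,
}

for (block, rate), power in reported.items():
    print(f"{block:8s} @ {rate:4.2f} Gbps: "
          f"{energy_per_bit_pj(power, rate):6.2f} pJ/bit")
```

Read this way, the receiver's absolute power drops by roughly a factor of nine between 5 Gbps and 250 Mbps, while its energy per bit rises, which is consistent with a power that scales with frequency on top of a fixed component.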
2.72 mW and 2.11 mW, respectively. Fig. 19 shows the power consumption of the main PLL and the single receiver according to data rate in the high-speed mode (a) and the low-speed mode (b). In each mode, the power consumption of the receiver is proportional to the operating frequency. Since the main difference between the two modes is the phase locking point in the DLL, the power consumption scales continuously with the operating frequency across the two modes. Table I summarizes the performance of the proposed wide-range clock and data recovery circuit.

V. CONCLUSION

This paper has presented a wide-range vernier phase shifting CDR circuit for the multi-port transceiver architecture. The presented vernier phase shifter generates finely-spaced phase steps and provides unlimited phase rotation using a conventional delay-locked loop. By its inherently digital nature, the VPS enables semi-digital clock and data recovery with precise tracking performance, and with the dual-mode control, the presented CDR extends the operating range from 250 Mbps to 5 Gbps with the same VCDL and offers the possibility of extending the operating range beyond that given in this paper. The prototype chip was fabricated in a 0.13-µm CMOS process and gives good data tracking performance with jittered input data. The implemented VPS-CDR circuit tracks input data with a minimum 13.34-ps phase step at 5 Gbps and achieves an operating range from 250 Mbps to 5 Gbps with a BER of less than $10^{-[\ldots]}$. The measured results show that the RMS and peak-to-peak jitter of the recovered clock are 5.83 ps and 52.22 ps, respectively, for 5-Gbps data with a $2^{[\ldots]}-1$ PRBS. The implemented main PLL and single receiver dissipate 2.7 mW and 2.1 mW, respectively, at 250 Mbps and 9.0 mW and 19.2 mW, respectively, at 5 Gbps.

REFERENCES

[1] R. Farjad-Rad, A. Nguyen, J. M. Tran, T. Greer, J. Poulton, W. J. Dally, J. H. Edmondson, R. Senthinathan, R. Rathi, M.-J. E. Lee, and H.-T. Ng, "A 33-mW 8-Gb/s CMOS clock multiplier and CDR for highly integrated I/Os," IEEE J. Solid-State Circuits, vol. 39, no. 9, pp. 1553–1561, Sep. 2004.
[2] A. L. Coban, M. H. Koroglu, and K. A. Ahmed, "A 2.5–3.125-Gb/s quad transceiver with second-order analog DLL-based CDRs," IEEE J. Solid-State Circuits, vol. 40, no. 9, pp. 1940–1947, Sep. 2005.

[3] J. Kim and M. A. Horowitz, “Adaptive supply serial links with sub-1-V operation and per-pin clock recovery,” IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1403–1413, Nov. 2002.
[4] G. Y. Wei, J. Kim, D. Liu, S. Sidiropoulos, and M. A. Horowitz, “A variable-frequency parallel I/O interface with adaptive power-supply regulation,” IEEE J. Solid-State Circuits, vol. 35, no. 11, pp. 1600–1610, Nov. 2000.
[5] K. M. W, H. S. Lee, and C. G. Sodini, “A 200-MHz CMOS phase-locked loop with dual phase detectors,” IEEE J. Solid-State Circuits, vol. 24, no. 6, pp. 1560–1568, Dec. 1989.
[6] H. Song, D. S. Kim, D. H. Oh, S. Kim, and D. K. Jeong, “A 1.0–4.0-Gb/s all-digital CDR with 1.0-ps period resolution DCO and adaptive proportional gain control,” IEEE J. Solid-State Circuits, vol. 46, no. 2, pp. 424–434, Feb. 2011.
[7] R. J. Baumert, P. C. Metz, M. E. Pedersen, R. L. Pritchett, and J. A. Young, “A monolithic 50–200 MHz CMOS clock recovery and retiming circuit,” in Proc. IEEE CICC, 1989, pp. 14.5.1–14.5.4.
[8] S. Sidiropoulos and M. A. Horowitz, “A semidigital dual delay-locked loop,” IEEE J. Solid-State Circuits, vol. 32, no. 11, pp. 1683–1692, Nov. 1997.
[9] P. Larsson, “A 2–1600-MHz CMOS clock recovery PLL with low-Vdd capability,” IEEE J. Solid-State Circuits, vol. 34, no. 12, pp. 1951–1960, Dec. 1999.
[10] P. K. Hanumolu, G. Y. Wei, and U. K. Moon, “A wide-tracking range clock and data recovery circuit,” IEEE J. Solid-State Circuits, vol. 43, no. 2, pp. 425–439, Feb. 2008.
[11] S. Kim, D. Lee, Y. S. Park, Y. Moon, and D. Shim, “A dual PFD phase rotating multi-phase PLL for 5 Gbps PCI express Gen2 multi-lane serial link receiver in 0.13um CMOS,” in IEEE Symp. VLSI Circuits Dig. Tech. Papers, 2007, pp. 234–235.
[12] T. Toifl, C. Menolfi, P. Buchmann, M. Kossel, T. Morf, R. Reutemann, M. Ruegg, M. L. Schmatz, and J. Weiss, “A 0.94-ps-RMS-jitter 0.016-mm² 2.5-GHz multiphase generator PLL with 360° digitally programmable phase shift for 10-Gb/s serial links,” IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2700–2712, Dec. 2005.
[13] K. Omote, K. Shimizu, and J. Okamura, “A vernier over sampling and alignment technique for Gb/s serial communication,” in Proc. IEEE Asian Solid-State Circuits Conf., 2005, pp. 29–32.
[14] W. J. Dally and J. W. Poulton, Digital Systems Engineering. New York: Cambridge Univ. Press, 1998.
[15] S. C. Lin and T. C. Lee, “An 833-MHz 132-phase multiphase clock generator with self-calibration circuits,” in Proc. IEEE Asian Solid-State Circuits Conf., 2008, pp. 437–440.
[16] J. Christiansen, “An integrated high resolution CMOS timing generator based on an array of delay locked loops,” IEEE J. Solid-State Circuits, vol. 31, no. 7, pp. 952–957, Jul. 1996.
[17] S. Y. Lee, H. R. Lee, Y. H. Kwak, B. J. Yoo, D. Shim, C. Kim, and D. K. Jeong, “250 Mbps–5 Gbps wide-range CDR with digital vernier phase shifting and dual mode control in 0.13 μm CMOS,” in Proc. IEEE Asian Solid-State Circuits Conf., 2010, pp. 185–188.
[18] G. E. Andrews, Number Theory. New York: Dover Publications, 1994.
[19] Y. Moon, J. Choi, K. Lee, D. K. Jeong, and M. K. Kim, “An all-analog multiphase delay-locked loop using a replica delay line for wide-range operation and low-jitter performance,” IEEE J. Solid-State Circuits, vol. 35, no. 3, pp. 377–384, Mar. 2000.
[20] T. H. Lee, K. S. Donnelly, J. T. C. Ho, J. Zerbe, M. G. Johnson, and T. Ishikawa, “A 2.5 V CMOS delay-locked loop for an 18 Mbit, 500 Megabyte/s DRAM,” IEEE J. Solid-State Circuits, vol. 29, no. 12, pp. 1491–1496, Dec. 1994.
[21] A. Efendovich, Y. Afek, C. Sella, and Z. Bikowsky, “Multifrequency zero-jitter delay-locked loop,” IEEE J. Solid-State Circuits, vol. 29, no. 1, pp. 67–70, Jan. 1994.
[22] B. W. Garlepp, K. S. Donnelly, J. Kim, P. S. Chau, J. L. Zerbe, C. Huang, C. V. Tran, C. L. Portmann, D. Stark, Y. F. Chan, T. H. Lee, and M. A. Horowitz, “A portable digital DLL for high-speed CMOS interface circuits,” IEEE J. Solid-State Circuits, vol. 34, no. 5, pp. 632–644, May 1999.
[23] M. J. Edward Lee, W. J. Dally, T. Greer, H. T. Ng, R. Farjad-Rad, J. Poulton, and R. Senthinathan, “Jitter transfer characteristics of delay-locked loops - Theories and design techniques,” IEEE J. Solid-State Circuits, vol. 38, no. 4, pp. 614–621, Apr. 2003.
[24] S. Sidiropoulos, D. Liu, J. Kim, G. Wei, and M. Horowitz, “Adaptive bandwidth DLLs and PLLs using regulated supply CMOS buffers,” in IEEE Symp. VLSI Circuits Dig. Tech. Papers, 2000, pp. 124–127.

Sang-Yoon Lee (S’00–M’11) received the B.S. degree in electrical engineering from Korea University, Seoul, Korea, in 2004, and the M.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 2006. He is currently working toward the Ph.D. degree in electrical engineering at Seoul National University.
His research interests include high-speed serial links and low-power delta-sigma modulators.

Hyung-Rok Lee (S’99–M’06) received the B.S., M.S., and Ph.D. degrees in electrical engineering from Seoul National University, Seoul, Korea, in 1998, 2000, and 2006, respectively.
From 2006 to 2011, he was with Silicon Image Inc., Sunnyvale, CA. He is currently with Seoul National University. His research interests include high-speed I/O and PLL/DLL design for high-speed communication.

Youngho Kwak received the B.S. degree in electronics engineering from Korea University, Seoul, Korea, in 2004, where he is currently working toward the Ph.D. degree in electronics and computer engineering.
His research interests are in clock and data recovery for high-speed communication, high-speed serial links, and PLL design.

Woo-Seok Choi was born in Korea. He received the B.S. and M.S. degrees in electrical engineering and computer science from Seoul National University, Seoul, Korea, in 2008 and 2010, respectively.
At present, he works as a research engineer at Seoul National University. He will be pursuing the Ph.D. degree at Oregon State University beginning September 2011. His current research interests include low-power delta-sigma ADCs and capacitive sensor interface circuits.

Byoung-Joo Yoo received the B.S. degree in electrical engineering from Korea University, Seoul, Korea, in 2005, and the M.S. degree from Seoul National University, Seoul, Korea, in 2007, where he is currently working toward the Ph.D. degree.
His research interests are in the field of RF and analog integrated circuit design for high-speed serial link applications.

Daeyun Shim was born in Seoul, Korea, in 1962. He received the B.S., M.S., and Ph.D. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1985, 1987, and 2000, respectively.
From 1987 to 1994, he worked at Samsung Electronics Corporation, Kyung-Ki-Do, Korea. His research interests were algorithm and implementation on video signal processing including compression/decompression, high-speed digital circuit design, high-speed memory architecture, and high-speed locking systems. In 2001 after getting the Ph.D., he joined Silicon Image Inc., Sunnyvale, CA, where his work is mostly focused on architecture and implementation of high-speed serial links like PCIe, SATA, HDMI/MHL, and SPMT in memory applications.

Chulwoo Kim (S’98–M’02–SM’06) received the B.S. and M.S. degrees in electronics engineering from Korea University, Seoul, Korea, in 1994 and 1996, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Illinois at Urbana-Champaign in 2001.
In 1999, he worked as a summer intern at Design Technology at Intel Corporation, Santa Clara, CA. In May 2001, he joined IBM Microelectronics Division, Austin, TX, where he was involved in Cell processor design. Since September 2002, he has been with the Department of Electronics and Computer Engineering, Korea University, where he is currently an Associate Professor. In 2008–2009, he was a Visiting Scholar at the University of California, Los Angeles. His current research interests are in the areas of wireline transceivers, memory, power management, and data converters.
Dr. Kim has received the Samsung HumanTech Thesis Contest Bronze Award (1996), the ISLPED Low-Power Design Contest Award (2001), the DAC Student Design Contest Award (2002), SRC Inventor Recognition Awards (2002), the Young Scientist Award from the Ministry of Science and Technology of Korea (2003), the Seoktop Award for excellence in teaching (2006), and the ASP-DAC Best Design Award (2008). He is currently on the editorial board of IEEE TRANSACTIONS ON VLSI SYSTEMS.

Deog-Kyoon Jeong (S’85–M’89–SM’09) received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1981 and 1984, respectively, and the Ph.D. degree in electrical engineering and computer sciences from the University of California, Berkeley, in 1989.
From 1989 to 1991, he was with Texas Instruments, Dallas, TX, as a Member of the Technical Staff and worked on the modeling and design of BiCMOS gates and the single-chip implementation of the SPARC architecture. Then he joined the faculty of the Department of Electronics Engineering and Inter-University Semiconductor Research Center, Seoul National University, where he is currently a Professor. He has published more than 60 technical papers and holds 52 U.S. patents. He is one of the co-founders of Silicon Image, which specializes in digital interface circuits for video displays such as DVI and HDMI. His main research interests include the design of high-speed I/O circuits, phase-locked loops, and network switch architectures.
Dr. Jeong was one of recipients of the ISSCC Takuo Sugano Award in 2005 for Outstanding Far-East Paper.
