
A 40 GS/s 4 bit SiGe BiCMOS Flash ADC

Xuan-Quang Du, Markus Grözing, Matthias Buck and Manfred Berroth


Institute of Electrical and Optical Communications Engineering (INT)
University of Stuttgart, Germany

Abstract: This paper presents the design and experimental test of a 40 GS/s 4 bit single-core flash ADC in a 0.13 µm SiGe BiCMOS technology. The ADC exploits a traveling-wave concept and integrates a new low-complexity Pseudo-XOR gray encoder that makes use of folded-cascode differential logic. Up to a sampling rate of 39.04 GS/s the ADC provides a measured ENOB of more than 3 bits and an SFDR of more than 24.8 dBc within the frequency band from DC to 20 GHz. At 40.32 GS/s the frequency band for a minimum effective resolution of 3 bits is 12 GHz and at 42.24 GS/s it is about 5.3 GHz.

Index Terms: Flash analog-to-digital converter (ADC), SiGe BiCMOS, Pseudo-XOR gray encoder, data scrambling, field programmable gate array (FPGA), mm-wave data conversion.

I. INTRODUCTION
The drive towards Tbit/s communication will push the sampling rates of analog-to-digital converters (ADCs) beyond 100 GS/s in the near future. Emerging 100G and 400G coherent fiber-optic receivers, for instance, already necessitate sampling rates greater than 50 GS/s [1-3]. In order to meet future speed requirements, a 4x time-interleaved 128 GS/s 4 bit ADC is being developed. This paper presents the design and experimental test of one of its sub-ADCs. As a standalone chip, the sub-ADC can be employed in 100 Gbit/s wireless infrastructures such as proposed in [4] to enable digital signal processing (DSP) of wideband analog baseband receive signals (>12 GHz bandwidth) with low modulation order. In optical communication systems, it can be utilized to enable DSP-based equalization of fiber-induced dispersion [5-7].

Section II describes the proposed ADC architecture, Section III the circuit components and Section IV the measurement results. Section V concludes the presented results with a comparison to the state of the art.

II. ARCHITECTURE
The proposed ADC architecture is briefly introduced in [8] and is shown in Fig. 1(a). The main building blocks are a unary flash ADC core and a bubble-error suppressing gray encoder. To avoid the need for a front-end track-and-hold amplifier with fast settling time, the flash core exploits a traveling-wave topology [5, 6], where the analog input and the clock signal travel synchronously from comparator to comparator. Fig. 1(b) illustrates the signal distribution concept. A linear input driver feeds the analog input signal via a transmission line (TL) to a bank of parallel comparators. A high-gain clock driver does the same for the clock signal. As the comparators are spatially separated from each other, the input and clock signals do not arrive at all comparators at the same time, but rather successively with small time delays. The idea exploited in traveling-wave ADCs is to keep the delays of the input and clock signal equal between adjacent comparators. This ensures that every comparator quantizes the same input signal value at each sampling event, as illustrated in Fig. 1(b). Even though the comparators operate asynchronously due to the clock delays, the same results can be obtained as with synchronous flash signal distribution approaches, where input and clock signal

[Figure 1 graphic. (a) ADC architecture: 50 Ω terminated differential inputs in, ref and clk; 4 bit flash ADC core with 15 outputs; gray encoder; scrambler (ENA_SCR); 2^11-1 PRBS generator (ENA_PRBS); MUX; data outputs D<0:3>; 1:64 frequency divider (FD) with output DIV. (b) Unary flash ADC core: input driver and clock driver feeding terminated transmission lines along comparators 1 to 15 (reference ladder taps VR1 to VR15, outputs T1 to T15) in the traveling-wave direction, with equal tap delays Td,in = Td,clk. In the example sampling event P1, T1 to T11 read 1 and T12 to T15 read 0.]

Fig. 1. ADC architecture (a) and unary flash ADC core (b).

978-1-5090-6383-3/17/$31.00 ©2017 IEEE 138


arrive instantaneously at all comparators (e.g., with active input and clock distribution trees [9]). However, since only two transmission lines are required (one for the input and one for the clock), the traveling-wave concept consumes significantly less area and has lower complexity.

For DSP the outputs of the ADC core are gray encoded. An integrated digital communication interface consisting of a scrambler, a 2^11-1 pseudo random bit sequence (PRBS) generator, a multiplexer (MUX) and a 1:64 frequency divider (FD) enables the storage of the ADC samples on a field programmable gate array (FPGA). The digital interface of the ADC is required for external FPGA channel synchronization and has the same operating modes as the interface in [10].

III. CIRCUIT COMPONENTS
The linearity of the ADC is solely determined by the unary flash core and the gray encoder. In the following, a detailed overview of these core building blocks is given: Subsection A describes the input and clock signal distribution of the flash core together with its comparator preamplifiers. Subsection B describes the proposed low-complexity gray encoder.

A. Flash ADC core
The traveling-wave flash ADC is implemented fully differentially in current-mode logic (CML). Fig. 2(a) depicts the flash core with its comparator placement. The input and clock signal distribution are realized by two delay-matched Y-split TLs. Their characteristic impedances are 70 Ω each and are obtained by EM and RLC-extracted simulations. Delay matching is ensured by using equivalent TL structures that are approximately equally loaded at each TL tap. For minimum size, microstrip lines are employed. The process offers five metal and two thick metal layers. To prevent coupling to active circuitry, the ground planes of the TLs are placed on the fifth and the signal lines on the second top metal layer. The distance that the analog input and clock signal have to travel is minimized by uniformly positioning the comparators in a U-shape. For 4 bit resolution 15 comparators are required. In the proposed architecture one additional dummy comparator is needed for symmetrical loading and delay matching of the bottom and top branches of the Y-split TLs.

Signal feedthrough of the ADC input to the reference tap voltages and clock kick-back can significantly degrade the ADC performance at high frequencies. To effectively address these error sources, the comparators integrate preamplifiers with signal feedthrough compensation [11], as shown in Fig. 2(b). By using differential quantization decision thresholds, each reference tap voltage is fed exactly twice into the U-shaped comparator array. Since both polarities of the analog input signal are compared against each tap voltage, feedthroughs of Vinp and Vinn over the base-emitter capacitances of the preamplifiers (e.g., of preamp 1 and 15) compensate each other to first order. The settling of transient spikes within one conversion period is ensured by dimensioning the reference tap signals (VR1, VR2, … and VR15) with sufficiently low impedances. A high preamplifier gain and bandwidth (14 dB and 63 GHz simulated) is obtained by employing a Cherry-Hooper topology with emitter follower (EF) feedback.

B. Encoder
To encode the 15 output signals (T1, T2, … and T15) of the ADC core, a two-stage gray encoder is used. Gray encoding is favored over binary encoding, as it is less susceptible to bubble errors caused by metastable comparator outputs. Fig. 3 depicts the simplified thermometer-to-gray (T2G) encoder. The encoding is solely realized by Exclusive-OR (XOR) operations.

[Figure 2 graphic. (a) Traveling-wave ADC core: Gm drivers for input and clock, emitter followers (EF), and a U-shaped array of comparators, each consisting of a preamplifier and a master-slave flip-flop (MS-FF), with outputs T1 to T15 and clock branches clk_T and clk_B. (b) Comparator preamplifiers: preamp 1 and preamp 15 with inputs Vinp and Vinn, and the reference ladder with taps VR1 to VR15.]

Fig. 2. Traveling-wave ADC core (a) and comparator preamplifiers with signal feedthrough compensation (b).
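The delay-matching principle behind the traveling-wave core described in Section II can be illustrated numerically. The sketch below is a simplified behavioral model under stated assumptions: ideal lossless lines, a uniform tap delay of 1 ps, and a 20 GHz sine input. None of the delay values come from the paper, and `sampled_values` is a hypothetical helper.

```python
import math


def sampled_values(vin, t_clk, n_comp, td_in, td_clk):
    """Input value seen by each comparator at its own sampling instant.

    Comparator k receives the clock edge at t_clk + k*td_clk and sees the
    input waveform delayed by k*td_in, so it effectively samples
    vin(t_clk + k*(td_clk - td_in)).
    """
    return [vin(t_clk + k * (td_clk - td_in)) for k in range(n_comp)]


def spread(values):
    """Largest difference between the values seen along the comparator bank."""
    return max(values) - min(values)


F_IN = 20e9  # assumed 20 GHz input tone


def vin(t):
    return math.sin(2 * math.pi * F_IN * t)


# Matched tap delays: every comparator quantizes the same input value.
matched = sampled_values(vin, 3e-12, 15, td_in=1e-12, td_clk=1e-12)
# Assumed 0.2 ps mismatch per tap: the sampled values drift along the line.
mismatched = sampled_values(vin, 3e-12, 15, td_in=1e-12, td_clk=1.2e-12)

if __name__ == "__main__":
    print("matched spread:   ", spread(matched))
    print("mismatched spread:", spread(mismatched))
```

With equal input and clock delays the per-tap offsets cancel exactly, which is the condition Td,in = Td,clk shown in Fig. 1(b). Any residual mismatch accumulates linearly along the line in this model.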

[Figure 3 graphic: two-stage XOR tree (Stage 1, Stage 2) combining the comparator outputs T1 to T15 into the gray-coded outputs G3, G2, G1 and G0.]

Truth table of the thermometer-to-gray encoder:

Dec   Quantizer outputs (T15 ... T8 ... T1)   G3 G2 G1 G0
 0    00000 00000 00000                       0  0  0  0
 1    00000 00000 00001                       0  0  0  1
 2    00000 00000 00011                       0  0  1  1
 3    00000 00000 00111                       0  0  1  0
 4    00000 00000 01111                       0  1  1  0
 5    00000 00000 11111                       0  1  1  1
 6    00000 00001 11111                       0  1  0  1
 7    00000 00011 11111                       0  1  0  0
 8    00000 00111 11111                       1  1  0  0
 9    00000 01111 11111                       1  1  0  1
10    00000 11111 11111                       1  1  1  1
11    00001 11111 11111                       1  1  1  0
12    00011 11111 11111                       1  0  1  0
13    00111 11111 11111                       1  0  1  1
14    01111 11111 11111                       1  0  0  1
15    11111 11111 11111                       1  0  0  0

Fig. 3. Thermometer-to-gray encoder with truth table.

[Figure 4 graphic: schematics of the classical three-input CML-XOR and of the PXOR (differential inputs A, B, C, load resistors R, tail currents I1, cascode bias Vcasc, supplies VCC and VEE), with output eye diagrams annotated for systematic jitter.]

Truth table of the three-input XOR and PXOR gates (Fig. 4):

State          0   1   2   3   4   5   6   7
Input A        0   0   0   0   1   1   1   1
Input B        0   0   1   1   0   0   1   1
Input C        0   1   0   1   0   1   0   1
Output XOR     0   1   1   0   1   0   0   1
Output PXOR    0   1  -1   0   1   2   0   1
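The Fig. 4 truth table can be cross-checked with a short script. The closed form `a - b + c` used below is an inference from the tabulated PXOR levels, not an expression given in the paper, and the mapping of a level n to a differential output of (2n - 1)·R·I1 is read off the figure legend (-1 = -3R·I1, 0 = -R·I1, 1 = R·I1, 2 = 3R·I1). Function names are hypothetical.

```python
from itertools import product


def xor3(a, b, c):
    """Logical three-input XOR."""
    return a ^ b ^ c


def pxor_level(a, b, c):
    """Multi-level PXOR output; closed form inferred from the Fig. 4 table."""
    return a - b + c


def pxor_voltage(a, b, c, r_times_i1=1.0):
    """Differential output in units of R*I1, following the figure legend."""
    return (2 * pxor_level(a, b, c) - 1) * r_times_i1


# States are enumerated as 4*A + 2*B + C, matching the table columns.
states = list(product((0, 1), repeat=3))
pxor_row = [pxor_level(a, b, c) for a, b, c in states]
xor_row = [xor3(a, b, c) for a, b, c in states]
agreeing_states = [s for s, (p, x) in enumerate(zip(pxor_row, xor_row)) if p == x]

if __name__ == "__main__":
    print("XOR :", xor_row)
    print("PXOR:", pxor_row)
    print("states where PXOR equals XOR:", agreeing_states)
```

The two rows differ only in states 2 and 5, where the middle input B disagrees with both A and C. Those are the input combinations the paper excludes as not unary coded.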
PXOR output levels in the truth table: −1 = −3R·I1, 0 = −R·I1, 1 = R·I1, 2 = 3R·I1.

Fig. 4. Classical three-input CML XOR vs. PXOR.

For speed enhancement and power reduction of the encoder, each XOR gate in Fig. 3 is replaced by a proposed Pseudo-Exclusive-OR (PXOR) gate. The pseudo gate uses folded-cascode differential logic [12] and is based on the following idea: Since the comparator outputs are unary coded, the XOR operations in the encoder can alternatively be fully emulated with a folding amplifier, as depicted in Fig. 4. Even though the circuit does not implement a binary logic gate, it is used as one, because for unary coded input signals (input states 0, 1, 3, 4, 6 and 7) the same output as for a logic XOR is obtained. This approach allows the Exclusive-OR operation to be realized with minimum circuitry, as just three folded-cascode differential pairs are required. If a classical CML-XOR is used instead, seven transistor pairs on three transistor stack levels are needed to achieve the same logic operation. As each input has to drive a different number of differential pairs and has to travel through a different number of transistor stack levels to get to the output, the CML-XOR exhibits "split" bit transitions at its zero crossings, which at high speed can manifest as severe systematic jitter at the output (see eye diagram in Fig. 4 for unary coded inputs). In PXOR implementations, however, this problem is far less pronounced, since each input drives the same capacitive load (one differential pair) and has the same input-to-output signal path length. The PXOR thus enables operation at higher speed and lower power dissipation. Due to the low systematic jitter, several PXORs can be cascaded directly without the need for data retiming flip-flops in between.

The PXOR-based gray encoder generates glitch errors of no more than 3 LSB magnitude if first-order bubble errors with depth −2, −1, 0, 1 or 2 occur. In case of lower or higher depths or higher-order bubble errors, the glitch magnitudes increase and the encoder limits the linearity of the ADC.

IV. EXPERIMENTAL RESULTS
Fig. 5 depicts the die photograph of the ADC. The converter is implemented in a 0.13 µm SiGe BiCMOS technology from IHP which features 300 GHz fT and 500 GHz fmax for its HBTs. The ADC including pads occupies a die area of 1.4 x 0.9 mm² and is wire-bonded on an RF PCB. Due to speed limitations of available FPGA transceivers (28.05 Gbit/s), the ADC performance is measured with a four-channel sub-sampling oscilloscope with 70 GHz analog bandwidth. To ensure a common time base for the four converter data outputs and the scope sampling clock, the 1:64 divided clock output provided by the ADC is used as the trigger signal. The analog input and clock signal of the ADC are supplied by two signal generators, which operate in a master and slave relationship, and two broadband baluns ranging from 300 kHz to 26.5 GHz and 300 kHz to 67 GHz. Delay calibration between the four ADC outputs and the sampling scope, needed because of different propagation path lengths (e.g., on the RF evaluation board), is achieved with deskew operations provided by the scope and the PRBS mode of the converter, which sends a coherent synchronous bit sequence to all four ADC outputs.

[Figure 5 graphic: die photograph, 1.4 mm x 0.9 mm, with differential pads in, ref, clk, DIV, PRBS and data outputs D0 to D3.]

Fig. 5. Wire-bonded ADC on RF PCB and die photograph.

The ADC dissipates 3.5 W while operating from a 3.5 V and a 3 V power supply. The unary ADC core consumes 1.4 W and the encoder with the FPGA interface 2.1 W, whereby 57% of the latter power is dissipated in the six retiming output drivers. Fig. 6 depicts the measured effective number of bits (ENOB) and spurious free dynamic range (SFDR) of the ADC at different sampling rates. To cover a 10 MHz step size with the sampling scope, the ADC input clock is set to integer multiples of 640 MHz. Up to 39.04 GS/s the ADC provides an ENOB of more than 3 bits and an SFDR of more than 24.8 dBc over the complete frequency band from DC to 20 GHz. The frequency band for an effective resolution better than 3 bits is 12 GHz at 40.32 GS/s and about 5.3 GHz at 42.24 GS/s. The differential nonlinearity (DNL) for a 10 MHz input signal is between −0.20 and 0.20 LSB and the integral nonlinearity (INL) between −0.32 and 0.35 LSB up to 42.24 GS/s, as shown in Fig. 7.
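The clock and power figures quoted in this section can be checked with simple arithmetic. The sketch below assumes one reading of the 640 MHz clock grid, namely that the 1:64 divided trigger clock then falls on a 10 MHz grid; the paper does not spell this out. The constant and function names are hypothetical.

```python
CLOCK_STEP_MHZ = 640  # clock grid stated in the paper
DIVIDER = 64          # on-chip 1:64 frequency divider


def divided_clock_mhz(f_clk_mhz):
    """Trigger frequency after the 1:64 divider for a clock on the 640 MHz grid."""
    if f_clk_mhz % CLOCK_STEP_MHZ != 0:
        raise ValueError("clock is not an integer multiple of 640 MHz")
    return f_clk_mhz // DIVIDER


# Sampling rates reported in the paper, in MHz (39.04, 40.32 and 42.24 GS/s).
rates_mhz = [39040, 40320, 42240]
multiples = [f // CLOCK_STEP_MHZ for f in rates_mhz]
triggers_mhz = [divided_clock_mhz(f) for f in rates_mhz]

# Power budget in watts, using the values quoted in the text.
p_core = 1.4
p_encoder_interface = 2.1
p_total = p_core + p_encoder_interface
p_output_drivers = 0.57 * p_encoder_interface
p_without_drivers = p_total - p_output_drivers

if __name__ == "__main__":
    print("multiples of 640 MHz:", multiples)
    print("divided trigger clocks (MHz):", triggers_mhz)
    print("total power (W):", round(p_total, 1))
    print("power without output drivers (W):", round(p_without_drivers, 1))
```

Under these assumptions the three sampling rates correspond to 61, 63 and 66 times 640 MHz, and the power without the six output drivers comes out at about 2.3 W, which appears consistent with the footnoted power entry for this work in Table 1.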

Fig. 6. Measured dynamic ADC performance at different sampling rates.

TABLE 1. Comparison of mm-wave SiGe ADCs.

                           This work     [9]           [13]           [14]          [3]
Architecture               Flash         Flash^1       Flash^1        Folding       Flash
Digital encoder            Yes           No            No             Yes           Yes
Time-interleaved           No            No            No             Yes, 4x       Yes, 2x
Sampling rate (GS/s)       40.32         35            40             30            50
Resolution (bits)          4             4             3              6             5
SFDR (dBc) / fin (GHz)     33/1          29/1          28/1           38/1          35/2
                           30/12         27/11         18/14          37/10         34/10
                           24/20         16/15         11/19          35/16         27/22
ENOB (bits) / fin (GHz)    3.7/1         3.7/1         2.8/0.05       5.1/1         4.1/2
                           3.0/12        3.0/11        2.0^2/14       3.9/10        3.7/10
                           2.8/20        1.6/15        1.2^2/19       3.5/16        3.4/22
Power (W)                  2.3^3/3.5     4.5^3/5       3.8^3/4.5      8.5           5.4
FOM (pJ/cs) / fin (GHz)    8.3^3/20      25.6^3/11     33.9^2,3/14    25.0/16       11.6/22
                           12.6/20
Die area (mm²)             1.3           8.0           4.0            13.6          10.2
Technology                 SiGe BiCMOS   SiGe BiCMOS   SiGe           SiGe BiCMOS   SiGe BiCMOS
                           130 nm        180 nm        120 nm         180 nm        180 nm

^1 unary ADC-DAC   ^2 ENOB estimated ≈ SFDR/9   ^3 w/o output drivers or DAC
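The FOM row of Table 1 can be reproduced from the power, ENOB and input-frequency entries using the definition given in Section V, FOM = P / (2^ENOB · min(f_sampling, 2·f_in)). The sketch below is an illustrative recomputation; the function name and the dictionary layout are hypothetical, and the input values are copied from the table.

```python
def fom_pj_per_cs(power_w, enob_bits, f_s_gsps, f_in_ghz):
    """FOM = P / (2**ENOB * min(f_s, 2*f_in)), in pJ per conversion step."""
    rate_hz = min(f_s_gsps, 2 * f_in_ghz) * 1e9
    return power_w / (2 ** enob_bits * rate_hz) * 1e12


# (power in W, ENOB in bits, sampling rate in GS/s, input frequency in GHz)
table_entries = {
    "this work, w/o output drivers": (2.3, 2.8, 40.32, 20),
    "this work, total power": (3.5, 2.8, 40.32, 20),
    "[9]": (4.5, 3.0, 35, 11),
    "[13]": (3.8, 2.0, 40, 14),
    "[14]": (8.5, 3.5, 30, 16),
    "[3]": (5.4, 3.4, 50, 22),
}

fom = {name: round(fom_pj_per_cs(*args), 1) for name, args in table_entries.items()}

if __name__ == "__main__":
    for name, value in fom.items():
        print(f"{name}: {value} pJ/cs")
```

Note that the min() term matters for [14]: at a 16 GHz input, 2·f_in exceeds the 30 GS/s sampling rate, so the sampling rate is the limiting factor there.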

Fig. 7. Measured INL and DNL performance at different sampling rates.

V. COMPARISON AND CONCLUSION
TABLE 1 compares the measured converter performance against the fastest published mm-wave SiGe ADCs from recent years. Although single-core unary ADC-DAC combinations have already been demonstrated beyond 30 GS/s [9, 13], they show considerable ENOB and SFDR degradation at high frequencies, moderate energy efficiency and large chip size, and full ADC operation including digital encoding has not yet been shown at such high sampling rates. At conversion rates of tens of GS/s, the encoder can be the bottleneck of the ADC [5]. If, for instance, a first-order bubble error of depth 3 occurs, the glitch error caused by gray encoding can already be 5 LSB. In a unary ADC-DAC combination [9, 13], however, the same bubble error only causes a 1 LSB glitch, as the unary coded comparator outputs are directly converted back to the analog domain without digital data encoding in between. To the best knowledge of the authors, the fastest reported single-core ADC with digital encoder is implemented in SiGe BiCMOS and provides a maximum conversion speed of 25 GS/s [3]. This paper presents the world's fastest single-core ADC with PXOR gray encoder, which can operate up to 42.24 GS/s. Its sampling rate surpasses prior published single-core ADCs with digital encoder by more than 50% [3, 5-7]. The implemented ADC provides more than 3 bits effective resolution up to 39.04 GS/s for input signal frequencies between DC and 20 GHz. At 40.32 GS/s the measured ENOB for a 20 GHz full-scale input signal is still 2.8 bits. This performance is achieved without any kind of calibration at an energy efficiency of 8.3 pJ/cs, which is better than the figure of merit (FOM) of the best reported SiGe ADCs with conversion rates of 25 GS/s and higher [3, 9, 13, 14]. The FOM in this work is defined as the power consumption of the ADC divided by the product of 2^ENOB and min(f_sampling, 2·f_in). The proposed data converter enables mm-wave data rates without the use of area-consuming on-chip inductors. Its high sampling rate, good linearity and small die size make it a good candidate for time-interleaved ADCs.

ACKNOWLEDGMENT
This research was supported by the German Research Foundation (DFG priority program SPP 1655, grant BE 2256/19-1) within the project "Development of Novel System and Component Architectures for Future Innovative 100 Gbit/s Communication Systems".

REFERENCES
[1] Y. Greshishchev et al., "A 40GS/s 6b ADC in 65nm CMOS", IEEE ISSCC, 2010, pp. 390-391.
[2] I. Dedic, "56GS/s ADC Enabling 100GbE", IEEE OFC, 2010, pp. 1-3.
[3] J. Lee and Y. Chen, "A 50-GS/s 5-b ADC in 0.18-um SiGe BiCMOS", IEEE MTT-S IMS, 2010, pp. 900-903.
[4] C. Carlowitz and M. Vossiek, "Concept for a novel low-complexity QAM transceiver architecture suitable for operation close to transition frequency", IEEE MTT-S IMS, 2015, pp. 1-4.
[5] H. Nosaka et al., "A 24-Gsps 3-bit Nyquist ADC using InP HBTs for electronic dispersion compensation", IEEE MTT-S IMS, 2004, pp. 101-104.
[6] P. Schvan, D. Pollex, S.-C. Wang, C. Falt and N. Ben-Hamida, "A 22GS/s 5b ADC in 0.13 µm SiGe BiCMOS", IEEE ISSCC, 2006, pp. 2340-2349.
[7] R. A. Kertis et al., "A 20 GS/s 5-Bit SiGe BiCMOS Dual-Nyquist Flash ADC With Sampling Capability up to 35 GS/s Featuring Offset Corrected Exclusive-Or Comparators", IEEE JSSC, vol. 44, no. 9, pp. 2295-2311, Sept. 2009.
[8] M. Grözing, H. Huang, X.-Q. Du and M. Berroth, "Data Converters for 100 Gbit/s Communication Links and beyond", IEEE SiRF, 2016, pp. 104-106.
[9] S. Shahramian, S. P. Voinigescu and A. C. Carusone, "A 35-GS/s, 4-Bit Flash ADC With Active Data and Clock Distribution Trees", IEEE JSSC, vol. 44, no. 6, pp. 1709-1720, June 2009.
[10] M. Buck et al., "A 6-GS/s 9.5-b Single-Core Pipelined Folding-Interpolating ADC With 7.3 ENOB and 52.7-dBc SFDR in the Second Nyquist Band in 0.25-µm SiGe-BiCMOS", IEEE T-MTT, vol. 65, no. 2, pp. 414-422, Feb. 2017.
[11] R. C. Taft, C. A. Menkus, M. R. Tursi, O. Hidri and V. Pons, "A 1.8-V 1.6-GSample/s 8-b self-calibrating folding ADC with 7.26 ENOB at Nyquist frequency", IEEE JSSC, vol. 39, no. 12, pp. 2107-2115, Dec. 2004.
[12] K. Ono, T. Matsuura, E. Imaizumi, H. Okazawa and R. Shimokawa, "Error suppressing encode logic of FCDL in a 6-b flash A/D converter", IEEE JSSC, vol. 32, no. 9, pp. 1460-1464, Sept. 1997.
[13] W. Cheng et al., "A 3 b 40 GS/s ADC-DAC in 0.12 µm SiGe", IEEE ISSCC, 2004, pp. 262-263.
[14] D. Wu et al., "A 30GS/s 6bit SiGe ADC with input bandwidth over 18GHz and full data rate interface", IEEE BCTM, 2016, pp. 90-93.

