
ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / OVERVIEW

Session 3 Overview: Amplifiers and Oscillators


ANALOG SUBCOMMITTEE

Session Chair: Jens Anders, University of Stuttgart, Germany
Session Co-Chair: Shon-Hang Wen, MediaTek, Taiwan

This session highlights advances in state-of-the-art amplifiers and oscillators for several applications. The first three papers focus on improving
DR in a Class-D amplifier, lowering the noise and input current in a chopper-stabilized opamp and generating a single tone with the lowest THD
for ADC testing. RC oscillators achieve high accuracy with a high-resolution trimming and aging compensation scheme, and crystal oscillators
with reduced startup time and low energy, a wide acceptable injection clock frequency error, and low power sensitivity to temperature are
demonstrated. A PLL-less BAW-based oscillator with digital calibration achieves high frequency stability and low jitter.

1:30 PM
3.1 A 120.9dB DR, -111.2dB THD+N Digital-Input Capacitively-Coupled Chopper Class-D Audio Amplifier
Huajun Zhang, Delft University of Technology, Delft, The Netherlands
In Paper 3.1, Delft University of Technology and Goodix Technology present a digital-input capacitively-coupled Class-D amplifier.
It achieves 120.9dB DR and -111.2dB peak THD+N, and can deliver 13W/23W at 10% THD into an 8Ω/4Ω load with 90%/86%
efficiency.

2:00 PM
3.2 A Chopper-Stabilized Amplifier with a Relaxed Fill-In Technique and 22.6pA Input Current
Thije Rooijers, Delft University of Technology, Delft, The Netherlands, now at Broadcom, Bunnik, The Netherlands
In Paper 3.2, Delft University of Technology presents a chopper-stabilized amplifier with a relaxed fill-in technique. By introducing
a duty-cycled non-chopped fill-in OTA and a ripple reduction loop, this work achieves a 25× reduction in input current, while
achieving a flat noise floor of 12nV/√Hz.

2:15 PM
3.3 Bandpass Filter and Oscillator ICs with THD < -140dBc at 10Vppd for Testing High-Resolution ADCs
Subha Sarkar, Texas Instruments, Bangalore, India, Indian Institute of Technology Madras, Chennai, India
In Paper 3.3, Texas Instruments and IIT Madras present bandpass filter and oscillator ICs with a new-benchmark THD performance
at 10Vppd swing for testing high-resolution ADCs. It is achieved by incorporating capacitor nonlinearity cancellation and opamp
output conductance nonlinearity suppression techniques into an active-RC bandpass filter and oscillator.

52 • 2023 IEEE International Solid-State Circuits Conference 978-1-6654-9016-0/23/$31.00 ©2023 IEEE


ISSCC 2023 / February 20, 2023 / 1:30 PM

2:30 PM
3.4 A 0.01mm² 10MHz RC Frequency Reference with a 1-Point On-Chip-Trimmed Inaccuracy of ±0.28% from
-45°C to 125°C in 0.18μm CMOS
Hui Jiang, Silicon Integrated, Eindhoven, The Netherlands
In Paper 3.4, Delft University of Technology, Silicon Integrated B.V. and Tsinghua University present a 0.01mm² 10MHz RC
frequency reference with high-resolution on-chip trimming of both its temperature coefficient and absolute frequency. This
work achieves a ±0.28% inaccuracy from −45°C to 125°C after 1-point trim.

3:15 PM
3.5 A 1.4μW/MHz 100MHz RC Oscillator with ±1030ppm Inaccuracy from -40°C to 85°C After Accelerated Aging
for 500 Hours at 125°C
Kyu-Sang Park, University of Illinois, Urbana, IL
In Paper 3.5, the University of Illinois at Urbana-Champaign presents a 100MHz RC oscillator in 65nm CMOS that achieves
±1030ppm frequency inaccuracy from -40°C to 85°C after accelerated aging for 500 hours at 125°C. This performance is
achieved with aging compensation by periodically locking the oscillator to a less-aged reference oscillator.

3:45 PM
3.6 A 12/13.56MHz Crystal Oscillator with Binary-Search-Assisted Two-Step Injection Achieving 5.0nJ Startup
Energy and 45.8µs Startup Time
Haihua Li, University of Macau, Macau, China
In Paper 3.6, the University of Macau and Instituto Superior Tecnico present a 12/13.56MHz crystal oscillator with 45.8μs
startup time and 5nJ startup energy. It is achieved by introducing binary-search-assisted frequency locking to a 2-step injection
method, implemented using a resettable fast-settling auxiliary DCO, edge aligner, frequency comparator, and control logic.

4:15 PM
3.7 A 16MHz XO with 17.5μs Startup Time Under 10⁴ppm-ΔF Injection Using Automatic Phase-Error Correction
Technique
Xin Wang, Nanjing University of Posts and Telecommunications, Nanjing, China
In Paper 3.7, Nanjing University of Posts and Telecommunications and Hefei University present a 16MHz crystal oscillator in
40nm CMOS, which uses an automatic phase-error correction technique to achieve a 17.5μs startup time with an injection
clock frequency error of up to 10⁴ppm. The startup energy is 9.2nJ, and the startup time variation over temperature is ±4.5%.

4:30 PM
3.8 A 0.954nW 32kHz Crystal Oscillator in 22nm CMOS with Gm-C-Based Current Injection Control
Yihan Zhang, Peking University, Beijing, China
In Paper 3.8, Peking University and the Advanced Institute of Information Technology of Peking University present a crystal
oscillator with a Gm-C-based regulated current-injection technique, achieving very low power sensitivity to temperature of
0.017nW/°C.

4:45 PM
3.9 A 0.5-to-400MHz Programmable BAW Oscillator with Fractional Output Divider Achieving 4ppm Frequency
Stability over Temperature and <95fs Jitter
Subhashish Mukherjee, Texas Instruments, Bangalore, India
In Paper 3.9, Texas Instruments and IIT Madras present a 0.5-to-400MHz programmable BAW-based oscillator employing a
new temperature/supply-insensitive Dual-Slope Fractional Output Divider architecture. It achieves ±4ppm frequency stability
over -40°C to 85°C with <95fs rms phase jitter.

DIGEST OF TECHNICAL PAPERS • 53


ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.1
3.1 A 120.9dB DR, -111.2dB THD+N Digital-Input Capacitively-Coupled Chopper Class-D Audio Amplifier

Huajun Zhang¹, Marco Berkhout², Kofi A. A. Makinwa¹, Qinwen Fan¹
¹Delft University of Technology, Delft, The Netherlands
²Goodix Technology, Nijmegen, The Netherlands

Class-D amplifiers (CDAs) are widely used in audio applications where a high power efficiency is required. As most audio sources are digital nowadays, implementing digital-input CDAs results in higher levels of integration and lower cost. However, prior open-loop digital-input CDAs suffer from high jitter sensitivity and output-stage distortion. In [1], jitter sensitivity at small signal levels is mitigated using a buck-boost converter that adaptively lowers the supply at the expense of extra external components and reduced power efficiency. Prior closed-loop digital-input CDAs employing multi-bit current-steering [2] or resistive [3] DACs are less sensitive to jitter, but their DR is limited to about 115dB. DAC non-idealities and intermodulation distortion are also challenges, and prior works only achieved a peak THD+N of about −98dB [2,3]. This paper presents a digital-input CDA that achieves high DR by combining a low-noise capacitive DAC (CDAC) with dedicated techniques to mitigate DAC mismatch, ISI, and intermodulation distortion. A prototype implemented in a 0.18μm BCD process achieves 120.9dB DR and −111.2dB peak THD+N. Furthermore, it can deliver 13W/23W at 10% THD into an 8Ω/4Ω load with a 90%/86% efficiency.

To avoid the thermal and/or 1/f noise of current-steering or resistive DACs, a CDAC can be used to drive a closed-loop CDA based on the capacitively-coupled chopper-amplifier (CCCA) topology presented in [4]. Potential intermodulation between the DAC output waveform, which contains DAC images around multiples of fS as well as shaped quantization noise, and the various chopping and PWM tones must then be carefully mitigated. Figure 3.1.1 (top) shows an architectural overview of this capacitively-coupled chopper digital-input CDA. A 24-bit digital input is up-sampled to fS=768kHz (16×48kHz), reduced to 8 bits by a 6th-order digital ΔΣ modulator (DSM1), and then converted into the analog domain by a CDAC. The latter drives a closed-loop CDA with an embedded CCCA front-end, a 14.4V 3-level PWM-based output stage, and feedback after the LC filter, enabling low noise and suppressing LC filter nonlinearity [4]. To compensate for the LC filter’s phase shift, the loop filter must implement at least one zero, which inevitably causes some overshoot in its response to DAC transitions. At large signal levels, this will saturate the output stage and thus reduce the CDA’s linear output range. To keep the overshoot small, an 8-bit DAC is used, resulting in only a 0.5dB loss in the CDA’s linear output range. Together with DSM1, the DAC achieves an SQNR of 136dB, a maximum stable amplitude (MSA) of 0.99FS, and a signal-to-jitter-noise ratio (SJNR) of 131.5dB when driven by a 768kHz clock with 100ps of white clock jitter.

Chopping mitigates the CDA’s 1/f noise and is performed at fCHOP=fS/2 to exploit the spectral nulls in the shaped quantization noise at multiples of fS and, thus, avoid noise folding. As shown in Fig. 3.1.1 (bottom), the chopping and DAC input transitions are aligned. After each transition, the CCCA will slew briefly before settling. To eliminate the resulting nonlinearity and DAC ISI, a dead-band (DB) is introduced, which starts just before the chopping and DAC code transitions. During the DB, the CCCA is briefly disconnected from the rest of the loop filter. However, the resulting sample-and-hold operation also folds down (thermal) noise around integer multiples of fS. A 20ns DB is chosen as a compromise between CCCA settling and noise folding. The loop filter output is re-modulated by a 3-level analog PWM modulator switching at fPWM=4.992MHz, which is an odd (13th) harmonic of fCHOP. This avoids intermodulation distortion between the chopping and PWM sidebands [4]. However, this also means that fPWM (=6.5fS) is not located at a multiple of fS, so some quantization noise folding will occur. Fortunately, the quantization noise around fPWM is attenuated by the sinc roll-off of the DAC spectrum and the lowpass characteristics of the CDA’s STF, so the folded noise is negligible (< −150dBFS).

To reduce the complexity of the DEM scheme needed to achieve high linearity, the 8-bit DAC is segmented. As shown in Fig. 3.1.2, the input of the MSB segment (D1) is produced by a second digital modulator (DSM2), while the LSB segment is driven by the shaped quantization error (D2) [5]. Ideally, no input-related content should be present in D2 such that the gain mismatch of the two DAC segments contributes only shaped noise. In [2,5], a 1st-order DSM is used, which could produce idle tones at small signal levels, leading to harmonic content in D2. The gain mismatch between the segments will then allow some of this content to leak into the output. In this work, a 2nd-order DSM is used to alleviate the idle tone issue. This also reduces quantization noise leakage by about 20dB compared to a 1st-order DSM2. A 2-bit overlap is introduced between the 2 segments to accommodate the extra swing caused by the shaped quantization noise of DSM2. D1 and D2 thus drive two sub-DACs with 8× and 1× weights, respectively. The (digitally) chopped DAC output DIN·φCH·VREF is applied to a CCCA, which forms the loop filter’s error amplifier. The CCCA amplifies VERR (=DIN·VREF−VOUT/8) with a gain of 8× (=32×8CU/CGAIN, since D2 does not contain the input signal), which attenuates the noise contribution from the rest of the loop filter by 18dB, while also ensuring that the CCCA does not clip due to the DAC images when a 20kHz full-scale input is applied. A total DAC capacitance (288CU) of 3.5pF is used such that the parasitic capacitance at the summing node does not degrade the feedback factor around the CCCA opamp. This implies a unit capacitance CU of 12fF, which, together with CFB and CGAIN, is implemented with MOM capacitors due to their high voltage rating.

Unit-element mismatch within the two sub-DACs is addressed with real-time (RT) DEM [6], which produces no idle tones and achieves better SNDR at the chosen OSR (=19.2) than data-weighted averaging (DWA). Furthermore, since the DAC elements are driven by a PWM-like signal, their individual mismatch contribution is reduced at small input levels as the duty cycle of their PWM inputs approaches 50%. With RTDEM, however, code changes larger than 1 may cause nonlinear ISI [6], which will happen quite often since the DAC input is chopped. In this work, this source of nonlinearity is also eliminated by the DB. The timing control scheme to align RTDEM, DB, and chopping in both LV and HV domains is shown in Fig. 3.1.3. A 49.92MHz (=65fS) master clock (MCLK) is employed to define the DB and control the timing of RTDEM, where 1 MCLK cycle is allocated to the DB and 64 cycles to RTDEM. The number of transitions in the remaining time is signal-independent and thus distortion-free. RTDEM is realized using a cyclic shift register that loads thermometer-coded input data in parallel, which is then rotated to ensure that every DAC element is used equally outside the DB. Since the CDA has a nominal full-scale output of ±14.4V, the feedback chopper is realized with LDMOS switches that must be driven through level shifters, which have ~2ns delay. To avoid high-voltage transients, timing skew between the feedback chopper and DAC is minimized using a replica level shifter [4]. As a result, the DAC code transition (φDAC) and the HV and LV chopping transitions (φCHHV and φCHLV) are aligned and fully covered by the DB.

The capacitively-coupled digital-input CDA is prototyped in a 0.18μm BCD process and occupies an area of 7.5mm² (Fig. 3.1.7). An Audio Precision APx555 audio signal analyzer provides the 24-bit digital input and captures the CDA output. For flexibility, the interpolation filter and digital DSMs are implemented in an FPGA. The RTDEM and timing logic (Fig. 3.1.3) are implemented on-chip and consume about 460μW from a 1.8V supply. Figure 3.1.4 (top) plots the output spectrum when the CDA drives an 8Ω load with a −10dBFS sinewave input at 1kHz, corresponding to 1W of output power. The measured THD+N is −108.6dB. The output spectrum at −60dBFS is shown in Fig. 3.1.4 (bottom), where an SNR of 60.9dB is achieved, indicating that this CDA has a DR of 120.9dB. Figure 3.1.5 (top) shows the measured THD+N across output power for a 1kHz input. The peak THD+N is −111.2dB and −106.6dB for 8Ω load and 4Ω load, respectively. The output power at 10% THD is 13W and 23W for 8Ω load and 4Ω load, respectively. Figure 3.1.5 (bottom) plots the THD+N vs. input frequency.

Figure 3.1.6 compares the performance of this work with other state-of-the-art digital-input CDAs. It is the only capacitively-coupled digital-input CDA. Compared to other high-voltage (>10V) CDAs, it achieves the best peak THD+N (14dB lower than [3]), the highest dynamic range (5.4dB higher than [3]), and the lowest A-weighted integrated output noise (2× lower than [3]).

Acknowledgement:
The authors would like to thank Z. Chang, L. Pakula, and R. van Puffelen from the Delft University of Technology for measurement assistance.

References:
[1] W. H. Sun et al., “A 121dB DR, 0.0017% THD+N, 8× Jitter-Effect Reduction Digital-Input Class-D Audio Amplifier with Supply-Voltage-Scaling Volume Control and Series-Connected DSM,” ISSCC, pp. 486-487, Feb. 2022.
[2] A. Matamura et al., “An 82mW ΔΣ-Based Filter-Less Class-D Headphone Amplifier with -93dB THD+N, 113dB SNR and 93% Efficiency,” ISSCC, pp. 432-433, Feb. 2021.
[3] E. Cope et al., “A 2×20W 0.0013% THD+N Class-D Audio Amplifier with Consistent Performance up to Maximum Power Level,” ISSCC, pp. 56-57, Feb. 2018.
[4] H. Zhang et al., “A 121.4dB DR, -109.8dB THD+N Capacitively-Coupled Chopper Class-D Audio Amplifier,” ISSCC, pp. 484-485, Feb. 2022.
[5] R. Adams et al., “A 113 dB SNR Oversampling DAC with Segmented Noise-Shaped Scrambling,” IEEE JSSC, vol. 33, no. 12, pp. 1871-1878, Dec. 1998.
[6] S.-H. Wen et al., “A -117dBc THD (-132dBc HD3) and 126dB DR Audio Decoder with Code-Change-Insensitive RT-DEM Algorithm and Circuit Technique for Relaxing Velocity Saturation Effect of Poly Resistors,” ISSCC, pp. 482-483, Feb. 2022.
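The clock plan and sizing figures quoted in Paper 3.1 can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on numbers stated in the text. The variable names are illustrative, and the 20kHz audio band edge used for the OSR is an assumption inferred from the 20kHz full-scale input mentioned in the paper.

```python
import math

F_AUDIO = 48_000          # input sample rate in Hz
F_S = 16 * F_AUDIO        # up-sampled DAC rate: 768 kHz
F_CHOP = F_S / 2          # chopping frequency, fCHOP = fS/2
F_PWM = 13 * F_CHOP       # PWM carrier: 13th harmonic of fCHOP
F_MCLK = 65 * F_S         # master clock, 65 x fS
AUDIO_BW = 20_000         # assumed audio band edge in Hz


def to_db(voltage_ratio: float) -> float:
    """Convert a voltage ratio to decibels."""
    return 20 * math.log10(voltage_ratio)


osr = F_S / (2 * AUDIO_BW)            # oversampling ratio
pwm_over_fs = F_PWM / F_S             # fPWM expressed in multiples of fS
dead_band_ns = 1e9 / F_MCLK           # one MCLK cycle is allocated to the DB
ccca_gain_db = to_db(8)               # attenuation of later loop-filter noise
unit_cap_fF = 3.5e-12 / 288 * 1e15    # 3.5 pF shared by 288 unit capacitors
dynamic_range_db = 60 + 60.9          # SNR of 60.9 dB at a -60 dBFS input

if __name__ == "__main__":
    print(f"fS = {F_S / 1e3:.0f} kHz, fCHOP = {F_CHOP / 1e3:.0f} kHz")
    print(f"fPWM = {F_PWM / 1e6:.3f} MHz = {pwm_over_fs} x fS")
    print(f"MCLK = {F_MCLK / 1e6:.2f} MHz, dead-band = {dead_band_ns:.2f} ns")
    print(f"OSR = {osr:.1f}, CCCA gain = {ccca_gain_db:.1f} dB")
    print(f"CU = {unit_cap_fF:.2f} fF, DR = {dynamic_range_db:.1f} dB")
```

One MCLK period at 49.92MHz is close to 20ns, which matches the dead-band length quoted in the text.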




Figure 3.1.1: Architecture of the proposed capacitively-coupled digital-input Class-D audio amplifier.

Figure 3.1.2: Simplified schematic of the 8-bit CDAC with noise-shaped segmentation and the capacitively-coupled summing node of the loop filter.

Figure 3.1.3: (top) Timing control circuitry of the CDAC, RTDEM, and chopping, and (bottom) its timing diagram (for a 9-level sub-DAC, the actual implementation uses two 33-level sub-DACs).

Figure 3.1.4: Measured output spectra (256k-point FFT, 4× averaged).

Figure 3.1.5: Measured THD+N (top) at 1kHz across output power, and (bottom) across input frequency.

Figure 3.1.6: Performance summary and comparison with state-of-the-art monolithic high-voltage (>10V) digital-input CDAs.
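The cyclic-shift-register rotation that Paper 3.1 describes for RTDEM (timing in Fig. 3.1.3) can be illustrated with a small behavioral model. This is a minimal sketch and not the on-chip logic. It assumes one rotation step per MCLK cycle, a 32-element (33-level) sub-DAC, and 64 rotation steps per sample, so that each sample spans two full rotations. The function name `rtdem_usage` is hypothetical.

```python
def rtdem_usage(code: int, n_elements: int = 32, n_cycles: int = 64) -> list[int]:
    """Count how many cycles each DAC element is switched on for one sample.

    The thermometer-coded input is loaded in parallel into a cyclic shift
    register, which is then rotated by one position per cycle.
    """
    if not 0 <= code <= n_elements:
        raise ValueError("code must lie between 0 and n_elements")
    register = [1] * code + [0] * (n_elements - code)  # parallel load
    usage = [0] * n_elements
    for _ in range(n_cycles):
        for index, bit in enumerate(register):
            usage[index] += bit
        register = register[-1:] + register[:-1]        # rotate by one position
    return usage


if __name__ == "__main__":
    # With n_cycles a multiple of n_elements, every element is on equally often.
    print(rtdem_usage(5))
    print(rtdem_usage(3, n_elements=8, n_cycles=8))
```

In this model each element sees the same on-time (a duty cycle of code/n_elements) regardless of its position, so element mismatch cannot imprint a code-dependent error on the average output. That is the property the text attributes to the rotation.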



ISSCC 2023 PAPER CONTINUATIONS

Figure 3.1.7: Die micrograph.



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.2
3.2 A Chopper-Stabilized Amplifier with a Relaxed Fill-In Technique and 22.6pA Input Current

Thije Rooijers¹,², Johan H. Huijsing¹, Kofi A. A. Makinwa¹
¹Delft University of Technology, Delft, The Netherlands
²now at Broadcom, Bunnik, The Netherlands

In chopper amplifiers, the interaction between the input signal and the chopper clock can cause intermodulation distortion (IMD). This is due to amplifier delay, which causes signal transitions generated by the input chopper to arrive at the amplifier’s output slightly later than the corresponding clock transitions of the output chopper. This causes large signal-dependent spikes in the final output, which can significantly degrade amplifier linearity, especially at input frequencies near even multiples of the chopping frequency FCH, which will cause IMD tones near DC. In [2-4], spread-spectrum clocks are used to convert such tones into noise-like signals. However, this increases the noise floor, without solving the underlying problem. Recently, it has been shown that such spikes can be eliminated by using the fill-in technique [1], in which two identical OTAs are chopped in quadrature, allowing a spike-free output to be generated by switching between their outputs in a ping-pong fashion.

Figure 3.2.1 illustrates how the delay of a chopped main OTA (Gm1) causes large spikes in its output current (Iout1) around the chopping transitions. By switching to the output of a fill-in OTA, a spike-free output current can be generated, thus mitigating chopper-induced IMD. In [1], this switching was done with a 50% duty-cycle, and so two chopped OTAs with the same low offset and 1/f noise are required. In this work, the fill-in OTA is only used briefly, greatly relaxing its offset and 1/f noise requirements and obviating the need for chopping. This approach also saves power, since the relaxed fill-in OTA can be turned “off” most of the time. Furthermore, the reduced input switching activity compared to [1], and the use of non-overlapping chopper clocks results in more than 25× less input current.

To mitigate the ripple at the chopping frequency FCH caused by their up-modulated offset, the chopped OTAs in [1] were also auto-zeroed. However, the noise-folding inherent to auto-zeroing then causes a noise “bump” around FCH. In this work, since the fill-in OTA is not chopped, only the offset of the main OTA needs to be reduced. This is achieved by a continuous-time ripple-reduction loop (RRL), which does not suffer from noise folding and thus results in a flat noise spectrum.

A simplified block diagram of the proposed chopper-stabilized amplifier is shown in Fig. 3.2.2. It consists of a main amplifier (AMAIN), whose offset (and 1/f noise) will appear between its input terminals when it is used in a negative feedback configuration, where it can be sensed and corrected by a chopped auxiliary amplifier. In this work, the offset of a two-stage main amplifier (folded-cascode 1st stage and Class AB 2nd stage) is suppressed by a three-stage auxiliary amplifier. To mitigate its own offset (Vos1), the auxiliary amplifier employs a chopped OTA (Gm1, folded-cascode), followed by an integrator (GmINT, folded-cascode, Cint1,2 = 36pF), and a correction OTA (GmCOR, telescopic). The fill-in OTA (Gm2) is a non-chopped replica of Gm1. Simulations show that its offset (< 1mV) has a negligible effect on the residual offset of the overall amplifier and causes a small (< 3μVrms) tone at 2FCH (40kHz). Similarly, its 1/f noise also has a negligible effect, resulting in an overall 1/f noise corner of only 2Hz in simulation.

The choppers are driven by a constant-VGS clock generator to ensure that their charge injection mismatch, and hence the resulting residual offset and input current, is insensitive to input voltage. The clock generator also generates the non-overlapping clocks needed to guarantee that the switches of the input chopper are never all turned “on”, thus eliminating a potential source of input current.

The RRL consists of two capacitors Cs1,2 (3.6pF) that sense the triangular ripple caused by Vos1 at the output of the integrator formed by GmINT and Cint1,2. The resulting current is then demodulated and integrated by the RRL integrator formed by GmINT1 and Cint3,4 (9pF each). Its output is applied to GmRRL, which cancels Vos1 by injecting a correction current into Gm1. Since the amplitude of the ripple is limited by the offset of GmINT1, this is auto-zeroed with the help of CAZ1 and CAZ2 (4pF each).

However, the RRL can also create chopper-induced IMD, since the signal transitions caused by its ripple-demodulating choppers are delayed by the RRL integrator before they reach the output choppers of Gm1. In this design, this extra source of IMD is suppressed in two ways. First, the contribution of the RRL to the output current of Gm1 is minimized by using a large Gm1/GmRRL ratio. However, this is limited to ~600× by the swing of GmINT1 and the expected magnitude of Vos1. In simulation, this limits the resulting IMD to -100dB. To lower this further, a sample-and-hold is used to freeze the input of GmRRL just before each chopping transition. Further lowpass filtering is achieved by using a small sampling capacitor Cs (=0.5pF) to drive a larger hold capacitor CH (=7.2pF). These measures ensure that the overall IMD is not limited by the RRL.

Since Gm2 is only used briefly, it can be turned “off” most of the time. To ensure that the process of turning it “on” and “off” does not itself cause input spikes and more distortion, three measures are taken. First, the OTA’s bias current is not completely turned off, but just significantly reduced (by 11×) to mitigate the change in the various biasing voltages. Second, a lowpass filter at the gates of the switched bias current sources of Gm2 is used to slow down the bias-current transitions. Finally, these bias sources are isolated from the main bias-current generator and other blocks by an extra layer of current mirrors. Simulations show that these measures ensure that the resulting spikes are well below the noise floor, while incurring only a small power penalty. Switching the bias current of the fill-in OTA with a 20% duty-cycle (to allow sufficient settling) then results in a 76% power saving.

The opamp is realized in a 0.18μm CMOS BCD process (Fig. 3.2.7) and has an active area of 0.57mm². It draws 620μA from a 5V supply, which drops to 530μA when the fill-in OTA is duty-cycled, a 15% power saving. The power breakdown (Fig. 3.2.5) shows that the contribution of the fill-in OTA is then only 10%. The opamp’s voltage noise density is shown in Fig. 3.2.3 without chopping, with chopping and with the fill-in OTA turned “on” and “off” and compared with the extracted voltage noise density of [1]. Without chopping, the opamp has a 1/f noise corner frequency of about 2kHz. With chopping, the use of an RRL eliminates the noise bump seen in [1], resulting in a flat noise floor with a lower (12nV/√Hz) spectral density. The measured 1/f corner is below 10Hz and is not affected when the fill-in OTA is enabled. Furthermore, no extra tones are created when it is duty-cycled, confirming the effectiveness of the various spike-mitigation measures. Some crosstalk from the external clock can be observed at 20, 60 and 80kHz.

A step response measurement shows the opamp has a slew rate of 2V/μs (up) and 1.4V/μs (down), with no extra ringing due to the RRL. Measurements on 15 samples with a 2.5V input CM voltage and FCH = 20kHz show that the opamp’s offset does not exceed 0.8μV and that its input current remains below 4pA (Fig. 3.2.3, bottom). Enabling and disabling fill-in only changes the offset slightly, confirming that not chopping the fill-in OTA does not significantly worsen the overall offset.

The input current at mid-supply is about 4pA, which, as predicted in [6], is roughly 4× larger than that of an AZ-stabilized amplifier realized in a similar process [6]. Measurements show that the input current at mid-supply increases linearly with FCH, indicating that it is mainly due to the charge injection mismatch of the input chopper switches. Figure 3.2.3 shows the input current vs input voltage characteristic of three samples: a typical sample, and two worst-case samples. All three draw more input current at low input voltages. Measurements of an un-connected pad show a similar trend, indicating that most of the input current at low input voltages is due to ESD diode leakage.

With the opamp configured as a buffer, a single 1Vrms 79kHz (~4FCH) input tone results in the output amplitude spectrum shown in Fig. 3.2.4 (top). Without the fill-in technique (left), a large -102dB IMD tone is present at 1kHz (4FCH-Fin). With fill-in enabled, this drops by 24dB, to -125.7dB (right). Measurements on 5 samples show that the achieved IMD spreads between -123dB and -134.4dB, demonstrating good robustness. With a similar 39kHz tone (~2FCH), the IMD tone is -112.8dB without fill-in and -128.5dB with fill-in, a 16dB improvement. At lower input frequencies (<5kHz), the IMD tones are below the -140dB noise floor. When two input tones are applied (79 and 80kHz, 0.5Vrms each), the resulting amplitude spectrum is shown in Fig. 3.2.5. Without chopping, the IMD at 1kHz is -112.4dB, which increases to -106.9dB with chopping and with fill-in disabled. Enabling fill-in restores the IMD to -112.4dB, demonstrating that it effectively suppresses chopper-induced IMD. Without fill-in, measurements show that the residual ripple amplitude at 2FCH is around 0.7μVrms. With fill-in, this increases to 2.5μVrms for a worst-case sample with the largest fill-in OTA offset.

Figure 3.2.6 summarizes the opamp’s performance and compares it to the state-of-the-art. It achieves similar IMD (-125.7dB @ 79kHz ~4FCH), with a much simpler architecture. Among the chopper amplifiers, it achieves the lowest input current (22.6pA max), only beaten by an AZ amplifier [5], with a trimmed input current and much higher IMD (-44dB). Compared to [1], it achieves 25× less input current and a lower flat white noise level (12nV/√Hz) at similar supply current levels.

References:
[1] T. Rooijers et al., “A Fill-In Technique for Robust IMD Suppression in Chopper Amplifiers,” IEEE JSSC, vol. 56, no. 12, pp. 3583-3592, Dec. 2021.
[2] Analog Devices Inc., “AD8551 data sheet”, 1999, <http://www.analog.com/media/en/technical-documentation/data-sheets/AD8551_8552_8554.pdf>.
[3] Analog Devices Inc., “AD8571 data sheet”, 1999, <http://www.analog.com/media/en/technical-documentation/data-sheets/AD8571_8572_8574.pdf>.
[4] V. Ivanov and M. Shaik, “A 10MHz-Bandwidth 4μs-Large-Signal-Settling 6.5nV/√Hz-noise 2μV-Offset Chopper Operational Amplifier,” ISSCC, pp. 88-89, Feb. 2016.
[5] T. Rooijers et al., “An Auto-Zero Stabilized Voltage Buffer with a Trimmed Input Current of 0.2pA,” ESSCIRC, pp. 257-260, Sept. 2019.
[6] T. Rooijers et al., “An Auto-Zero-Stabilized Voltage Buffer with a Quiet Chopping Scheme and Constant Sub-pA Input Current,” IEEE JSSC, vol. 57, no. 8, pp. 2438-2448, Aug. 2022.
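The test frequencies and improvement figures reported in Paper 3.2 follow from simple arithmetic, sketched below as a consistency check. The helper names `imd_tone` and `relative_saving` are hypothetical. The only modelling assumption is the one stated in the text: an input near an even multiple of FCH produces an IMD tone at the difference frequency.

```python
F_CH = 20_000  # chopping frequency in Hz, as used in the measurements


def imd_tone(f_in: float, f_ch: float = F_CH) -> tuple[float, int]:
    """Return (IMD tone frequency, n) for the even multiple n*f_ch nearest to f_in."""
    n = 2 * round(f_in / (2 * f_ch))
    return abs(n * f_ch - f_in), n


def relative_saving(before: float, after: float) -> float:
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    for f_in in (79_000, 39_000):
        tone, n = imd_tone(f_in)
        print(f"Fin = {f_in / 1e3:.0f} kHz -> IMD tone at {tone / 1e3:.0f} kHz ({n} x FCH - Fin)")

    # Supply current drops from 620 uA to 530 uA when the fill-in OTA is duty-cycled.
    print(f"supply-current saving: {100 * relative_saving(620, 530):.1f} %")

    # Reported IMD levels (dB) without and with fill-in.
    print(f"79 kHz improvement: {-102 - (-125.7):.1f} dB")
    print(f"39 kHz improvement: {-112.8 - (-128.5):.1f} dB")
```

Both single-tone tests place the IMD product at 1kHz, which is why the spectra in Fig. 3.2.4 are compared at that frequency.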




Figure 3.2.1: Fill-in implementation with two chopped OTAs and multiplexing switches (top left), the chopping signals and the resulting output current with spikes (bottom) and the timing diagram for two implementations of multiplexing (top right).

Figure 3.2.2: Simplified block diagram of the proposed Chopper-Stabilized Operational Amplifier.

Figure 3.2.3: Voltage noise density vs frequency (top left) and input current vs input voltage (top right). Histogram of the offset and input current at 2.5V for 15 samples (bottom).

Figure 3.2.4: Measured amplitude spectrum (10 Averages) with a single-tone test for Fin = 79kHz (Top) and Fin = 39kHz (bottom) without and with fill-in (left & right respectively).

Figure 3.2.5: Two-tone test all in a non-inverting buffer configuration for un-chopped (top left), chopped without fill-in (top right) and chopped with fill-in (bottom right). Power breakdown (bottom left).

Figure 3.2.6: Performance summary and comparison with previous works.




Figure 3.2.7: Die micrograph of the fabricated chip.



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.3
3.3 Bandpass Filter and Oscillator ICs with THD < -140dBc at 10Vppd for Testing High-Resolution ADCs

Subha Sarkar¹,², Rajat Agarwal¹, Nagendra Krishnapura²
¹Texas Instruments, Bangalore, India
²Indian Institute of Technology Madras, Chennai, India

The growing demand for high-resolution (18 to 20bit) precision ADCs has increased the need for very low THD (< -140dBc) testing hardware that can simultaneously characterize multiple ADCs cost-efficiently in a small form factor. One of the options is to use an active bandpass filter (BPF) to attenuate the harmonics of a medium-accuracy (THD ~ -80dBc) sinusoid from a DAC or a bench-top generator. Another option is to generate a -140dBc THD sinusoid using an analog oscillator consisting of a BPF in a positive feedback loop with amplitude stabilization. In either case, the filter’s distortion must be kept below -140dBc at a 10Vppd output swing, which is the typical full scale of ADCs operating with a 5V supply. Among active filters, the second-order Tow-Thomas active-RC topology with opamp-based integrators provides the lowest distortion levels [1]. Negative feedback with a high loop gain suppresses the components’ distortion in the forward path, but increasing the loop gain does not completely suppress the distortion. This is because (a) distortion contributed by passive components in the feedback loop is not suppressed by the loop gain or (b) further increase in the global loop gain is infeasible because it compromises stability. The distortion-generating components must be identified, and their distortion contribution suppressed using local negative feedback or cancelled by subtracting the distortion component. In this work, we present techniques to mitigate distortion arising from the two limitations and demonstrate a filter with THD < -140dBc for 10Vppd inputs.

Figure 3.3.1 shows a second-order Tow-Thomas active-RC filter. One-half of the fully differential picture is shown, where vop,om are the BPF outputs (center frequency ωo=1/RC)

gain-boosted output stage has two stacked transistors, its distortion is lower for single-ended peak-peak voltages up to 5.4V.

Figure 3.3.3 shows the opamp architecture and schematic. It is a two-stage opamp with indirect Miller compensation. gm1 and gm2 form the first and second stages. gm3 is for CMFB. The input voltage of the second stage has a nonlinear relationship to the output voltage. Any capacitance at the input of the second stage draws a nonlinear current that must be driven by the first stage. The resulting input voltage of the first stage causes distortion. A buffer is used between the two stages to reduce the capacitive loading on the first stage and reduce this distortion [1]. gm2 is realized as a Class-AB stage. The transistor-level schematic of the nMOS side of the output stage is shown in the figure. Source followers MPB10,11 sense the drain voltages of output transistors MON and drive a differential pair with split input transistors MP10 and MP11. MN1,N2,P12 form a replica bias circuit to set the gate voltage of MP9 such that the drain voltage of the output transistors is the minimum required to stay in the saturation region. A similar arrangement is used for the pMOS side.

Figure 3.3.4 shows the complete schematic of the second-order BPF with distortion cancellation based on the idea shown in Fig. 3.3.1. Resistors Rs are used in series with C/4 capacitors so that A1 sees a resistive load near its UGB. This improves the stability. Resistors R’ are used to have a well-defined voltage at the intermediate node of the series capacitors. Capacitors C”/k are used across the feedback resistors kR” to enhance stability. Therefore, capacitors C” are used in parallel with R” in the driving path to preserve the transfer function of the current injected to the input nodes of A1. To obtain 60dB attenuation at the second harmonic, we use a cascade of three stages, each with Q=7. A conventional BPF without distortion cancellation is used in the first stage because its distortion will be filtered by the following stages. This reduces area, power, and noise penalty. Distortion cancellation is used in the second and third stages. The filter’s center frequency can be programmed to 1kHz and 10kHz by switching the resistor R to 1.6MΩ
or 160kΩ, respectively. In the precision process used for this filter, R and C do not have
and vlp,lm are the lowpass outputs. The most significant contributor to distortion of the
significant process variations. The figure shows the measured magnitude response. An
opamp is the ID-VGS nonlinearity of the output stage. This is suppressed using a
attenuation of 60dB at the second harmonic is demonstrated for both center frequencies.
sufficiently high loop gain. Analysis reveals that the next significant contributors to
distortion are (a) Nonlinearity in the integrating capacitors, and (b) Nonlinearity in the Figure 3.3.5 shows the measured spectra at the output of the BPF having THD< -140dBc
output conductance of the opamps as the peak-to-peak output signal swing approaches driven by a 10Vppd input with THD~ -90dBc. Measurement results from 10 chips show
the supply voltage. The capacitor current is modeled as IC = C(dV/dt)(1+b1V+b2V2). b1 THD< -140dBc. These confirm that the distortion cancellation works robustly. The figure
generates a second harmonic component which is suppressed by the common mode also shows the performance summary and comparison. THD is more than 20dB lower
feedback (CMFB) loop of the opamps. It can be further suppressed by realizing each than in the other works. Because of the distortion-cancellation circuitry, the noise is
capacitor as an anti-parallel combination of half-sized capacitors. b2 generates a third higher than [1]. The signal swing is substantially higher than in the other references.
harmonic component. The nonlinear currents in each capacitor are modeled as additional
Figure 3.3.6 shows an oscillator that is built using positive feedback around the BPF with
parallel sources inl,C1 and inl,C2. Approximate third-harmonic values at each output are
distortion cancellation in Fig. 3.3.4. Amplitude stabilization is added to avoid saturation.
shown in Fig. 3.3.1 when the input vip,im = ±Vpsin(ωot). The third harmonic at the output
For amplitude control, the resistors RQ in Fig. 3.3.4 are made variable by incorporating
of A1 can be cancelled by synthesizing an appropriate nonlinear current isyn and injecting
a MOS transistor in the triode region, as shown in Fig. 3.3.6. RQ in the feedback branch
it into the input nodes of A1 as shown. isyn = inl,C1+inl2 = ((Vp)3b2/6R)cos(3ωot).
is biased with a fixed voltage, and RQ in the input branch is controlled using Vctrl. Though
only one of the branches is controlled, using the MOS transistors in both branches
Figure. 3.3.1 (bottom) shows a way of generating the nonlinear current isyn. It is a
partially cancels the nonlinearity introduced by the MOS transistor. The measured output
feedback amplifier around A3 with two parallel capacitive input branches driven by vom,op
spectra of the oscillator at 1kHz and 10kHz outputs are shown in Fig. 3.3.6. The measured
= ±Vpsin(ωot). Both branches have the same nominal capacitance acC. The first branch
THD at 10Vppd output is -133.8dBc and -111dBc, respectively at these frequencies. The
has a single capacitor and the second has a series-parallel combination of four capacitors.
figure also shows the performance summary and comparison to earlier work. Compared
The linear component of the currents in the two branches cancel. The nonlinear currents
to [1], the distortion is significantly better at 1kHz and comparable at 10kHz at a higher
are unequal in the two branches because the capacitors in the second branch have half
output swing. This oscillator can be used with one or two stages of the BPF in Fig. 3.3.4
the input voltage across them. The nonlinear current is converted to a voltage at the
to generate sinusoidal signals with THD < -140dBc. Figure 3.3.7 shows the die
opamp output. A resistor from the output of A3 to the input node of A1 in the BPF injects
micrograph.
a nonlinear current isyn = k∙(3ac(Vp)3b2/16R)cos(3ωot). The scaling factor ack = 8/9. The
smaller the value of ac, the lesser the additional capacitive load on the BPF and the lesser Acknowledgement:
the area overhead. However, a smaller ac demands a higher k, increasing the noise The authors are grateful to Sureshkumar Ramalingam, Sreeja Chakingal, Sudheer Prasad,
penalty from the cancellation circuit. This design uses ac = 1/4. Ravpreet Singh, Mark Shill, Anand Kannan, Vinay Nadig, Sangeeta Kumar, Rajashekhar
Goroju, Dileep Bhat, Satinder Rai Bansal, Rahul Shetty, Eric Fretheim from Texas
Another significant distortion source at these low THD levels is the output conductance
Instruments for their support.
nonlinearity of the opamp. Since nearly rail-to-rail output swings are desired, it is
common to use an opamp with a dominant-pole frequency response and an output stage References:
with a single transistor between the output node and either supply rail, e.g., a Miller- [1] S. Kumar et al., “Design Considerations for Low-Distortion Filter and Oscillator ICs
compensated Class-AB stage. The large output swing exercises the output conductance for Testing High-Resolution ADCs,” IEEE TCAS-I, vol. 66, no. 9, pp. 3393-3401, Sept.
nonlinearity significantly. This distortion can be reduced by increasing the preceding gain 2019.
but doing so in a dominant-pole amplifier also necessitates an increase in the unity-loop- [2] E. Sackinger and W. Guggenbuhl, “A High-Swing, High-Impedance MOS Cascode
gain frequency (UGB), compromising stability. In this work, an output stage with a Circuit,” IEEE JSSC, vol. 25, no. 1, pp. 289-298, Feb. 1990.
gain-boosted cascode transistor is used [2], as shown in Fig. 3.3.2. The lower transistor [3] A. M. Durham et al., “High-Linearity Continuous-Time Filter in 5-V VLSI CMOS,” IEEE
M2 has a very small swing across it, and its nonlinearity is not exercised significantly. JSSC, vol. 27, no. 9, pp. 1270-1276, Sept. 1992.
The upper transistor has a large swing, but its nonlinearity is suppressed by Ar, the gain [4] Un-Ku Moon and Bang-Sup Song, “Design of a Low-Distortion 22-kHz Fifth-Order
of the opamp used in the gain-boosted loop. A unity-gain differential inverting amplifier Bessel Filter,” IEEE JSSC, vol. 28, no. 12, pp. 1254-1264, Dec. 1993.
used to test the effectiveness of this technique is shown in Fig. 3.3.2. Both the input and [5] S. Wen et al., “A -105dBc THD+N (-114dBc HD2) at 2.8VPP Swing and 120dB DR
feedback branches use identical parallel RC branches. The capacitor nonlinearity does Audio Decoder with Sample-and-Hold Noise Filtering and Poly Resistor Linearization
not cause distortion in this topology. Two amplifiers are built, one with the conventional Schemes,” ISSCC, pp. 294-295, Feb. 2019.
Class-AB output stage and one with gain-boosting for both pMOS and nMOS output [6] Wen, S.H. et al., “A -117dBc THD (-132dBc HD3) and 126dB DR Audio Decoder with
transistors. The figure also shows the HD3 versus output amplitude at 10kHz and HD3 Code-Change-Insensitive RT-DEM Algorithm and Circuit Technique for Relaxing Velocity
versus output frequency for a 10Vppd output. The supply voltage is 5.6V. Though the Saturation Effect of Poly Resistors,” ISSCC, Vol. 65, pp. 482-483, Feb. 2022.
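
The capacitor-nonlinearity cancellation described for Fig. 3.3.1 can be sanity-checked numerically. The sketch below is not the authors' analysis: it assumes the quadratic capacitor model from the text with the b1 term dropped, a drive at the center frequency so that ωoC = 1/R, and illustrative values for Vp and b2 that do not come from the paper. It extracts the third-harmonic current left over when the single-capacitor branch and the series-parallel branch are subtracted, and checks that scaling by k with ack = 8/9 reproduces the Vp³b2/6R magnitude.

```python
import math

# Values taken from the text: R for the 10kHz setting, a_c = 1/4, a_c * k = 8/9.
R = 160e3
F0 = 10e3
W0 = 2 * math.pi * F0
C = 1 / (W0 * R)          # follows from omega_o = 1/RC
A_C = 0.25
K = 8 / (9 * A_C)

# Hypothetical values chosen only for illustration (not from the paper).
VP = 2.5                  # peak voltage across the branch, in volts
B2 = 1e-5                 # quadratic voltage coefficient, in 1/V^2


def branch_difference_current(theta):
    """Current of the single capacitor a_c*C minus the series-parallel branch.

    Each capacitor in the series-parallel branch sees half the input voltage,
    so its nonlinear term is four times smaller; the linear parts cancel.
    """
    v = VP * math.sin(theta)
    dv_dt = VP * W0 * math.cos(theta)
    single = A_C * C * dv_dt * (1 + B2 * v ** 2)
    series_parallel = A_C * C * dv_dt * (1 + B2 * (v / 2) ** 2)
    return single - series_parallel


def third_harmonic_amplitude(func, n=4096):
    """Magnitude of the cos(3*theta) component over one period."""
    total = sum(func(2 * math.pi * i / n) * math.cos(3 * 2 * math.pi * i / n)
                for i in range(n))
    return abs(2.0 * total / n)


h3_branch = third_harmonic_amplitude(branch_difference_current)
h3_expected = 3 * A_C * VP ** 3 * B2 / (16 * R)   # magnitude quoted in the text
h3_injected = K * h3_branch                        # after scaling by k
h3_target = VP ** 3 * B2 / (6 * R)                 # magnitude to be cancelled
```

With these assumptions, `h3_branch` agrees with the 3acVp³b2/16R expression and `h3_injected` agrees with Vp³b2/6R, which is the arithmetic behind the stated scaling factor.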

58 • 2023 IEEE International Solid-State Circuits Conference 978-1-6654-9016-0/23/$31.00 ©2023 IEEE


ISSCC 2023 / February 20, 2023 / 2:15 PM

Figure 3.3.1: Second-order BPF and capacitor nonlinearity cancellation.

Figure 3.3.2: Reducing distortion due to output conductance nonlinearity by using a regulated-cascode output stage.

Figure 3.3.3: Opamp architecture and Class-AB output-stage schematic.

Figure 3.3.4: Second-order filter with nonlinearity cancellation (top), Sixth-order BPF and measurement gain-magnitude response (bottom).

Figure 3.3.5: Measured output spectrum and performance summary of the Sixth-order BPF.

Figure 3.3.6: Oscillator schematic, measured output spectrum and performance summary.
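
The 60dB second-harmonic attenuation quoted for the three-stage cascade with Q=7 can be reproduced from the ideal second-order bandpass magnitude response. This is a rough check under an assumption the paper does not state explicitly: each stage is treated as an ideal biquad with unity gain at the center frequency, ignoring opamp non-idealities and the added cancellation network.

```python
import math


def bpf_gain(f, f0, q):
    """Magnitude of an ideal second-order bandpass with unity gain at f0."""
    x = f / f0 - f0 / f
    return 1.0 / math.sqrt(1.0 + (q * x) ** 2)


def cascade_attenuation_db(f, f0, q, stages):
    """Attenuation in dB (positive number) of identical cascaded stages."""
    return -20.0 * stages * math.log10(bpf_gain(f, f0, q))


def capacitor_for_center(f0, r):
    """Integrating capacitor implied by omega_o = 1/RC."""
    return 1.0 / (2 * math.pi * f0 * r)


# Three stages, Q = 7, evaluated at the second and third harmonics.
att_second = cascade_attenuation_db(2.0, 1.0, 7, 3)
att_third = cascade_attenuation_db(3.0, 1.0, 7, 3)

# The same C serves both settings because R is scaled by 10x with f0.
c_1khz = capacitor_for_center(1e3, 1.6e6)
c_10khz = capacitor_for_center(10e3, 160e3)
```

Under this idealization the second-harmonic attenuation comes out slightly above 60dB, and the two programmed settings imply the same capacitor value of roughly 100pF.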

DIGEST OF TECHNICAL PAPERS • 59


ISSCC 2023 PAPER CONTINUATIONS

Figure 3.3.7: Die micrograph.



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.4
3.4 A 0.01mm² 10MHz RC Frequency Reference with a 1-Point On-Chip-Trimmed Inaccuracy of ±0.28% from -45°C to 125°C in 0.18μm CMOS

Xiaomeng An*1,2, Sining Pan*1,3, Hui Jiang2, Kofi A. A. Makinwa1

1Delft University of Technology, Delft, The Netherlands
2Silicon Integrated, Eindhoven, The Netherlands
3Tsinghua University, Beijing, China
*Equally-Credited Authors (ECAs)

CMOS frequency references based on RC oscillators are usually preferred over bulky crystals in IoT applications [1-5]. However, due to the process spread and finite temperature coefficient (TC) of most on-chip resistors, RC oscillators require trimming and temperature compensation to achieve decent accuracy. Enabled by high-resolution trimming techniques such as ΔΣ [1,2] or pulse-density [3] modulation, recent designs can obtain good accuracy (<0.1%) at the expense of large chip area. However, existing compact (<0.02mm²) designs suffer from frequency errors on the order of 1% or more [4,5]. Moreover, their temperature compensation schemes usually require the use of resistors with complementary TCs, which are not available in all CMOS technologies.

This paper describes a compact RC frequency reference with on-chip circuits with which both its TC and absolute frequency f0 can be trimmed. Fabricated in a standard 0.18μm technology, the 0.01mm² 10MHz reference achieves a ±0.28% inaccuracy from −45°C to 125°C after 1-point trim, which represents the state-of-the-art for designs with a similar area. Moreover, the proposed temperature compensation scheme does not require resistors with complementary TCs, which significantly extends its application scope.

Figure 3.4.1 (top-left) shows the block diagram of the proposed RC-based frequency reference, which is based on a frequency-locked-loop (FLL) [3]. Driven by the output frequency FOUT of a voltage-controlled oscillator (VCO), an RC network outputs a frequency-dependent voltage (VC), which is compared with a reference voltage (VR) derived from a resistive divider, integrated and then used to drive the VCO. Due to the large DC loop gain, the steady-state difference between VR and VC will be near zero, resulting in an output frequency that only depends on the properties of the RC network and the resistive divider.

On-chip resistors typically have large TCs, up to 1000ppm/°C, while that of on-chip MIM caps (~30ppm/°C) is relatively negligible. As a result, the TC compensation of RC oscillators is often achieved by combining resistors with complementary TCs to realize a so-called “zero-TC” composite resistor R0, which ensures that VC is temperature independent [3]. However, this requires a high-resolution resistor-trimming network, which introduces (trim-code dependent) parasitic capacitances that increase the inaccuracy of f0. Furthermore, not all CMOS technologies have a suitable combination of resistors.

In this work, R0 is realized as a single resistor, and TC compensation is achieved by designing the TC of the reference voltage VR to match that of VC. In the chosen 0.18μm CMOS technology, both R0 and R1 are implemented as p-poly resistors (-0.02%/°C), while R2 is implemented by a trimmable combination of p-poly and n-poly (-0.15%/°C) resistors. As shown in Fig. 3.4.1 (bottom-left), by tuning the width of the two types of resistors so that they (nominally) have the same resistance per unit length, the length of R2, and thus its resistance, will be trim-code independent, allowing f0 to be trimmed in an orthogonal manner. In this work, the TC of the frequency reference can be trimmed from -40ppm/°C to 40ppm/°C in 16 steps. As shown in Fig. 3.4.1 (bottom-right), the nominal frequency f0 is trimmed with the help of a coarse-fine capacitive DAC. This results in a trimming range of ±30%, with a worst-case trimming resolution of 0.1%, or equivalently 1fF, with a practically realizable 10fF DAC LSB.

In [3], a frequency-dependent voltage VC is created by a resistive divider that consists of a fixed resistor and a switched-capacitor resistor. However, the periodic ripple in VC must be limited by a relatively large stabilizing capacitor. In this work, the need for the latter is obviated by generating VC in three phases, as shown in Fig. 3.4.2 [6]. During the reset phase, φRST, capacitor C0 is pre-charged to VDD, and during the subsequent discharging phase, φDCHG, it is discharged through resistor R0. At steady-state, the duration of this phase is equal to one period TVCO of the VCO’s output frequency, which can be expressed as $T_{VCO} = R_0 C_0 \ln(1 + R_1/R_2) \approx 0.7\,R_0 C_0$ and is, ideally, supply-independent. During the third integration phase φINT, a GM-C integrator (GM=5μS, CINT=7pF) integrates the sampled difference between VR and VC, which is then used to drive the VCO. To facilitate the generation of the 3-phase control signals, the VCO runs at 40MHz, which is 4× higher than the targeted output frequency (10MHz). This also reduces the required RC constant (R0=36kΩ and C0=1pF) by 4×, and thus the chip area. To improve the FLL’s energy efficiency, φINT is 2× longer than φRST or φDCHG. The state of the FLL is thus updated at FVCO/4. Setting R1=R2=100kΩ results in an even power split between the RC and resistive-divider branches.

To suppress its 1/f noise and improve the oscillator’s long-term stability, the GM stage is chopped. However, the up-modulated offset will then cause ripples at the control input of the VCO (VCTRL), and thus increase its output jitter. Conventionally, this problem is solved by increasing the chopping frequency or lowering the GM/CINT ratio [3]. However, the former leads to a larger residual offset and thus worse inaccuracy over PVT, while the latter results in a trade-off between capacitor area and jitter performance. In this design, the size of CINT is drastically reduced by using a compact switched-capacitor notch filter to suppress the ripple [7]. The filter consists of two capacitors CMID (=1.8pF) and CHOLD (=2.7pF) and two switches driven by the sample (φS) and hold (φH) signals. As shown in Fig. 3.4.2 (bottom), the voltage across CINT is effectively sampled once every chopping period, resulting in a ripple-free VCTRL at a chopping frequency of FVCO/8 (=1.25MHz). Compared to the two-phase filter used in [7], the use of a single-phase filter results in less ripple due to the absence of mismatched charge injection errors, at the expense of 2× more delay. However, the resulting delay is still quite small compared to that of the integrator, and so has a negligible effect on loop stability. The VCO consists of a PMOS current source that drives a 3-stage current-starved ring oscillator, while the GM stage is a chopped telescopic amplifier with an 80dB DC gain.

The prototype RC frequency reference was fabricated in a standard 0.18μm CMOS technology, as shown in Fig. 3.4.7. To save area, all the resistors and transistors are placed below the Metal-Insulator-Metal (MIM) capacitors, resulting in a compact 100μm×100μm layout. Each frequency reference draws 56.7μA (27.5μA analog and 29.2μA digital) from a 1.5V supply. About 2/3 of the digital power is used to drive the output buffer. Over a 1.5V to 1.8V range, the frequency reference has a supply sensitivity of 2700ppm.

Seven ceramic-packaged chips (112 samples) from one wafer were trimmed and then characterized in a temperature-controlled oven. Since the intra-batch TC spread turned out to be quite small (±8ppm/°C), a fixed TC trim code (corresponding to the simulated TT corner) was used for all samples. As expected, the spread in f0 is much larger (±1.9%) and was individually trimmed at room temperature (RT, ~25°C).

As shown in Fig. 3.4.3, the frequency reference achieves an inaccuracy of ±0.28% over the automotive temperature range from −45°C to 125°C, resulting in a residual TC of 31.5ppm/°C (box method). However, significant hysteresis (1500ppm worst-case) is observed as the samples are cycled from hot to cold, mainly due to the instability of the polysilicon resistors. Each sample was cycled twice, resulting in a cycle-to-cycle variation of less than ±200ppm, which is much smaller than the observed hysteresis.

Since polysilicon resistors are also known to suffer from drift [8], accelerated aging experiments were conducted by baking the measured samples at 150°C for one week. As shown in Fig. 3.4.4 (top), both the nominal frequency (0.5%) and its TC (10ppm/°C) suffer from drift. However, the former can be trimmed at room temperature with the help of an external reference [9], while the latter is 3× smaller than the original residual TC, and results in much less (0.17%) additional frequency error over lifetime. To characterize the effect of packaging stress, seven plastic-packaged chips were also measured. Compared to the ceramic-packaged chips, they required a different TC trim code to achieve similar inaccuracy: ±0.3% from −45°C to 125°C. However, they exhibited somewhat less hysteresis (1200ppm worst-case) as they were cycled from hot to cold.

Figure 3.4.4 (bottom) shows the start-up behaviour of the frequency reference. After setting VCTRL to ground, the output frequency settles within 30μs. Enabling chopping and notch filtering results in a step-wise settling transient, but does not change the settling time. The frequency reference achieves an output period jitter of 41.4psrms (Fig. 3.4.5, top) and an Allan deviation of 2.3ppm for a 0.6s-stride (Fig. 3.4.5, bottom). Figure 3.4.6 summarizes the performance of the RC-based frequency reference and compares it to the state-of-the-art. Despite the use of a relatively mature 0.18μm technology, it achieves the best on-chip trimmed inaccuracy among compact (<0.02mm²) CMOS frequency references, making it a competitive timing solution for low-cost IoT applications.

References:
[1] Ç. Gürleyük et al., “A 16 MHz CMOS RC Frequency Reference with ±90 ppm Inaccuracy From -45 °C to 85 °C,” IEEE JSSC, vol. 57, no. 8, pp. 2429-2437, Aug. 2022.
[2] W. Choi et al., “A 0.9-V 28-MHz Highly Digital CMOS Dual-RC Frequency Reference with ±200 ppm Inaccuracy from -40 °C to 85 °C,” IEEE JSSC, vol. 57, no. 8, pp. 2418-2428, Aug. 2022.
[3] A. Khashaba et al., “A 32-MHz, 34μW Temperature-Compensated RC Oscillator Using Pulse Density Modulated Resistors,” IEEE JSSC, vol. 57, no. 5, pp. 1470-1479, May 2022.
[4] J. Wang et al., “A 12.77-MHz 31 ppm/°C On-Chip RC Relaxation Oscillator with Digital Compensation Technique,” IEEE TCAS-I, vol. 63, no. 11, pp. 1816-1824, Nov. 2016.
[5] J. Lee et al., “An Ultra-Low-Noise Swing-Boosted Differential Relaxation Oscillator in 0.18-μm CMOS,” IEEE JSSC, vol. 55, no. 9, pp. 2489-2497, Sept. 2020.
[6] A. Khashaba et al., “A 0.0088mm² Resistor-Based Temperature Sensor Achieving 92fJ·K² FoM in 65nm CMOS,” ISSCC, pp. 60-61, Feb. 2020.
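
The steady-state period expression for the three-phase FLL can be checked with the component values given in the text. The sketch below assumes an ideal loop (infinite DC gain, no comparator or switch delay) so that the discharge phase ends exactly when VC equals VR; the function and variable names are illustrative and not from the paper.

```python
import math

# Component values quoted in the text.
R0 = 36e3        # ohms
C0 = 1e-12       # farads
R1 = 100e3       # ohms
R2 = 100e3       # ohms


def vco_period(r0, c0, r1, r2):
    """Discharge time for C0 to fall from VDD to the divider voltage VR.

    VDD * exp(-T / (r0 * c0)) = VDD * r2 / (r1 + r2) gives
    T = r0 * c0 * ln(1 + r1 / r2); VDD cancels, hence the supply independence.
    """
    return r0 * c0 * math.log(1 + r1 / r2)


t_vco = vco_period(R0, C0, R1, R2)
f_vco = 1.0 / t_vco          # close to the 40MHz VCO frequency
f_out = f_vco / 4.0          # close to the 10MHz output

# Ratio VC/VDD at the end of the discharge phase equals the divider ratio.
vc_over_vdd = math.exp(-t_vco / (R0 * C0))
vr_over_vdd = R2 / (R1 + R2)

# A 1fF step on a 1pF C0 changes the period, and hence f0, by about 0.1%.
trim_step = 1e-15 / C0
```

With R1 = R2 the logarithm is ln 2, which is where the 0.7·R0C0 approximation in the text comes from.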



ISSCC 2023 / February 20, 2023 / 2:30 PM

Figure 3.4.1: Block diagram and temperature compensation of an RC frequency reference (top), TC and frequency trimming (bottom).

Figure 3.4.2: System block diagram (top) and timing diagram (bottom).

Figure 3.4.3: Temperature sensitivity and hysteresis of the frequency reference (112 samples).

Figure 3.4.4: Averaged frequency of 112 samples before and after aging (top) and Transient response after VCTRL reset (bottom).

Figure 3.4.5: Measured period jitter (top) and Allan deviation (bottom).

Figure 3.4.6: Performance summary and comparison with previous RC frequency references.
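
A few of the reported numbers for this frequency reference can be cross-checked with simple arithmetic. The sketch below is only a consistency check on figures quoted in the text; interpreting the 0.17% lifetime error as the TC drift multiplied by the full temperature span is an assumption, as is the derived power-per-MHz figure, which the paper does not state.

```python
# Temperature range and TC drift after aging, as quoted in the text.
T_MIN_C = -45.0
T_MAX_C = 125.0
TC_DRIFT_PPM_PER_C = 10.0

span_c = T_MAX_C - T_MIN_C                       # full temperature span
extra_error_ppm = TC_DRIFT_PPM_PER_C * span_c    # error accumulated over the span
extra_error_percent = extra_error_ppm / 1e4      # ppm to percent

# Supply current breakdown and supply voltage, as quoted in the text.
I_ANALOG_UA = 27.5
I_DIGITAL_UA = 29.2
VDD = 1.5
F_OUT_MHZ = 10.0

total_current_ua = I_ANALOG_UA + I_DIGITAL_UA
power_uw = total_current_ua * VDD
power_per_mhz_uw = power_uw / F_OUT_MHZ          # derived here, not in the paper
```

Under that interpretation the TC drift accounts for about 0.17% over the 170°C span, and the current breakdown sums to the quoted 56.7μA, or roughly 85μW from the 1.5V supply.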



ISSCC 2023 PAPER CONTINUATIONS

Additional References:
[7] P. Park et al., “A Thermistor-Based Temperature Sensor for a Real-Time Clock
with ± 2 ppm Frequency Stability,” IEEE JSSC, vol. 50, no. 7, pp. 1571-1580, July
2015.
[8] A. Andrei et al., “Reliability Study of AlTi/TiW, Polysilicon and Ohmic Contacts for
Piezoresistive Pressure Sensors Applications,” IEEE SENSORS, pp. 1125-1128, May
2004.
[9] Microchip Technology Inc., “AN8002 - AVR055: Using a 32kHz XTAL for Run-
Time Calibration of the Internal RC,” [Online]. Available:
https://ww1.microchip.com/downloads/en/Appnotes/doc8002.pdf.
Accessed Sept. 2022.

Figure 3.4.7: Die micrograph and its power breakdown.



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.5
3.5 A 1.4μW/MHz 100MHz RC Oscillator with ±1030ppm FOUT=FOUT1= N/(R1C1ln(1/α1)). First-order temperature compensation is performed by
Inaccuracy from -40°C to 85°C After Accelerated Aging modulating the SEL signal using a ΔΣ modulator, resulting in FOUT=(1-β)FOUT0+βFOUT1
where β is the average of the SEL sequence. The optimum β (βOPT) is determined using
for 500 Hours at 125°C a two-point trimming process shown in Fig. 3.5.2. First, SEL=0, the temperature is set
to 85°C, and α0, which forces FOUT=FOUT0=FTAR is determined by using a binary search.
Kyu-Sang Park, Nilanjan Pal, Yongxin Li, Ruhao Xia, Tianyu Wang, Next, SEL=1 and α1 that forces FOUT=FOUT1=FTAR is determined. In the last step, the
Ahmed Abdelrahman, Pavan Kumar Hanumolu temperature is set to -40°C, and βOPT that causes FOUT=FTAR is determined. For instance,
if R0 and R1 have opposite-sign TCs, VERR0 (=VERR of path0) and VERR1 (=VERR of path1)
University of Illinois, Urbana, IL also have opposite signs, and βOPT forces average VERR=(1-βOPT)VERR0+βOPTVERR1=0 only
when FOUT=FTAR.
Monolithic RC oscillators are increasingly becoming the preferred clock source in many
applications, which typically have used bulky crystal or MEMS oscillators. Using novel Figure 3.5.3 shows schematics of critical building blocks. The feedback loop that locks
methods to compensate for the frequency inaccuracy caused by the temperature the main and reference oscillators’ frequencies is depicted in the top left corner of Fig.
coefficient (TC) of the resistors used in the reference RC networks, prior works have 3.5.3. Frequency error is detected by counting the number of main oscillator cycles in
achieved excellent short-term frequency stability [1,2]. While this level of performance 216 reference clock periods and subtracting from the target count denoted by DCNT. The
undoubtedly makes RC oscillators a desirable option even in applications requiring resulting error is accumulated and used to tune the main oscillator’s frequency by
medium-to-high stability clock sources, they cannot be deployed commercially until the adjusting its α0 value. The reference voltages are generated by differential ΔΣ-DACs
performance is guaranteed over their lifetime. Unfortunately, literature on the aging operating at a switching frequency of FOUT/10 (see Fig. 3.5.3). The 17-bit digital input
behavior of RC oscillators is scarce. Given this critical shortcoming, this paper quantifies (DREF0/1) is quantized to 1bit using a second-order ΔΣ modulator and converted to the
aging in RC oscillators and presents methods to overcome it. The results obtained from reference voltages, α0/1VDD and (1-α0/1)VDD using a buffer, an inverter, and third-order
prototype oscillators indicate that aging can cause more than 5000ppm long-term RC lowpass filters (LPFs). Unity gain buffers are connected to VREF during ΦRST=1 to
frequency drift. The proposed compensation techniques reduce it to less than 500ppm. prevent charge sharing between the parasitic capacitor at the integrator input and the
LPF capacitor. The integrator was implemented using a two-stage GM−C topology to
Resistor aging is the most significant source of long-term frequency drift in RC
achieve a high DC loop gain of 126dB needed to suppress VCRO’s temperature and
oscillators. P-poly resistors, typically used as reference resistors because their sheet
supply sensitivity and a low loop bandwidth of ~1kHz to filter the ΔΣ modulator’s
resistance is considerable, are highly susceptible to aging. Accelerated aging tests
quantized error. Recalling RC branches are chopped at ΦCHG, the integrator’s first-stage
conducted on standalone p-poly resistors have shown that the resistivity can change by
more than 0.5% after 1000 hours at 150°C [3]. To evaluate the impact of resistor aging output is de-chopped by the DECHOP signal, removing its offset and flicker noise and
on the oscillator’s frequency, we designed a temperature-compensated frequency-locked- improving temperature stability and Allan deviation. When ΦINT=0, the tail current is
loop (FLL)-based RC oscillator using p-poly resistors and measured its long-term stability steered equally to the folding node by the PMOS switches (MP0~3), preventing integration
by baking at 125°C. As illustrated in Fig. 3.5.1, the measured RC oscillator’s frequency of the input signal. Integration is enabled when ΦINT=1 by turning off MP1/2. The mismatch
drifts more than 5000ppm at 1000 hours, significantly more than the short-term drift, between MP0~3 appears as an additional input-referred integrator offset and is removed
proving that the resistor is the most critical source of aging in FLL-based oscillators. by the chopping/de-chopping operation.

Given the forgoing drawbacks, we present a temperature- and aging-compensated RC A prototype TACO, with the flexibility to use p-poly, n-poly, and via resistors as the
oscillator (TACO) in which the long-term drift of the main oscillator is compensated by reference resistors, was fabricated in a 65nm CMOS process and packaged in a plastic
periodically locking its frequency to that of the less-aged reference oscillator (see Fig. QFN package. The measured aging behavior is summarized in Fig. 3.5.4. The long-term
3.5.1). The main and reference oscillators are identical except that the reference oscillator frequency drift of the always-on n-poly-based TCO is significantly higher than when it is
is heavily duty-cycled to prevent it from aging. TACO’s salient features include (i) duty-cycled (0.1%), indicating that duty-cycling reduces the aging effect. Similar behavior
resistors with higher activation energy (Ea), such as the n-poly or metal type, that have was also observed for VIA-resistor-based TCOs. Given these learnings, R0 and R1 in main
a longer lifetime based on Black’s model [4], (ii) switched dual RC-branches to reduce and reference TCOs were implemented using n-poly and VIA resistors, respectively. After
the stress caused by DC-current-induced electromigration (EM) [5], and (iii) duty-cycling two-point trim at 85°C and -40°C, an accelerated aging test was performed. The results
to slow down the aging rate of the reference oscillator used to calibrate the main oscillator obtained from 11 samples are shown on the right in Fig. 3.5.4. The always-on main TCO
[6]. Thanks to these key innovations, the prototype oscillator achieves better than is compensated by locking its frequency to a 0.1% duty-cycled reference TCO at one-
±1030ppm inaccuracy after it is subjected to accelerated aging, representing more than hour intervals. After 500 hours of aging at 125°C, the frequency inaccuracy of
3.2× improvement compared to the uncompensated oscillator. uncompensated main TCOs degrades by 740ppm. With compensation, the degradation
is only 230ppm, and the inaccuracy over -40°C to 85°C including aging effects is
Figure 3.5.2 shows the details of the proposed architecture. It comprises two RC
±1030ppm. The aging test is also performed when R0 of the main TCOs is implemented
branches (R0C0/R1C1), a GM−C integrator, a voltage-controlled ring oscillator (VCRO), a
using a p-poly resistor, and the results are shown in Fig. 3.5.5. This test provides a fair
divider, differential voltage DACs (VDACs), a phase generator, and a ΔΣ modulator. The
comparison to the state-of-the-art, as a majority of the reported TCOs used p-poly
VCRO clock is divided by N (=25) and fed to the phase generator, which generates clocks
resistors [1,2]. After 500 hours of aging, the inaccuracy of uncompensated main TCOs
ΦCHG, ΦRST, ΦBUF, and ΦINT. These clock phases are used to control the switching
sequence in the RC branch such that the difference between the track-and-held voltage degrades by 3450ppm. With compensation, the degradation is reduced to 410ppm. The
VRC generated from the RC branch and VREF provided by VDAC represents the error measured output period jitter and Allan deviation are 5.1psrms and 8.1ppm, respectively
between the desired and VCRO (FOUT) frequencies. The error voltage (VERR) is integrated (see Fig. 3.5.5). Figure 3.5.6 summarizes the performance of the TACO and compares it
by the integrator and used to drive the VCRO toward the frequency lock. The ΔΣ to state-of-the-art RC oscillators. The proposed TACO achieves a power efficiency of
modulator generates mux-select signal SEL that enables path0 (R0C0 branch/VDAC0) when 1.4μW/MHz and a frequency inaccuracy comparable to the state-of-the-art even in the
SEL=0 and path1 (R1C1 branch/VDAC1) when SEL=1 to perform temperature presence of aging. The die micrograph is shown in Fig. 3.5.7, and the active area is
compensation, as described later. TACO operates in four phases, illustrated by the 0.22mm2.
waveforms depicted in Fig. 3.5.2. Starting with SEL=0 and ΦCHG=0, C0 is reset to VDD Acknowledgement:
during the first phase (ΦRST=1). In the second phase (ΦBUF=1), which lasts for TP=N/FOUT The authors thank Stefano Pietri, John Pigott and Domenico Liberti of NXP and Danielle
duration, C0 discharges to VRC0,DCHG= VDD∙ exp(-TP/R0C0). If FOUT is higher than the desired Griffith of Texas Instruments for useful discussion and critical feedback. This work was
output frequency, then VRC (=VRC0,DCHG)>VREF (=α0VDD), and vice versa. The third phase supported by Semiconductor Research Corporation (SRC) under GRC TASK 2810.036.
provides time for the redistribution of the charge stored in the resistor’s parasitic
capacitor. In the final phase (ΦINT=1), VERR is integrated for TP duration, producing VCRO’s References:
control voltage, VC. In the next cycle with ΦCHG=1, C0 is reset to VSS, allowed to charge [1] K.-S. Park et al., “A Second-Order Temperature Compensated 1μW/MHz 100MHz RC
up for TP, reaching VRC0,CHG=(1-exp(-TP/R0C0))VDD, and VERR (=VRC0,CHG-(1-α0)VDD) is Oscillator with ±140ppm Inaccuracy from -40°C to 95°C,” IEEE CICC, pp. 1-2, Apr. 2021.
integrated for TP duration. Alternating between resetting C0 to VDD and VSS reverses the [2] Ç. Gürleyük et al., “A 16MHz CMOS RC Frequency Reference with ±90ppm Inaccuracy
current direction in R0, reducing EM-induced stress and improving long-term stability from -45°C to 85°C,” IEEE JSSC, pp. 2429-2437, Aug. 2022.
compared to when the current only flows in one direction. In the steady state, the [3] S. Jose et al., “Reliability of Integrated Resistors and the Influence of WLCSP Bake,”
feedback loop forces VRC0,DCHG=α0VDD and VRC0,CHG=(1-α0)VDD, resulting in IEEE IIRW, pp. 69-72, Oct. 2016.
FOUT=FOUT0=N/(R0C0ln(1/α0)). Assuming the TC and aging of C0 and α0 are negligible, the [4] J. Gambino, “BEOL Reliability for More-Than-Moore Devices,” IEEE IPFA, pp. 1-7,
frequency TC is entirely determined by the TC and aging properties of R0. To compensate July 2018.
for R0’s TC, an R1C1 branch with a different TC is added. When SEL=1, path1 is [5] E. I. Cole et al., “OBIC Analysis of Stressed, Thermally-Isolated Polysilicon Resistors,”
selected, and the four-phase operation described earlier ensues, resulting in IEEE IRPS, pp. 234-243, Apr. 1995.
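
The single-path lock condition quoted in the text, VRC0,DCHG=α0VDD leading to FOUT0=N/(R0C0ln(1/α0)), can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names and all component values (R0, C0, α0, N, VDD) are hypothetical and chosen only for illustration.

```python
import math

def taco_lock_frequency(r0, c0, alpha0, n):
    """Steady-state FOUT0 = N / (R0*C0*ln(1/alpha0)), as quoted in the text."""
    return n / (r0 * c0 * math.log(1.0 / alpha0))

def discharge_voltage(vdd, t_p, r0, c0):
    """C0 is reset to VDD, then discharges through R0 for t_p seconds."""
    return vdd * math.exp(-t_p / (r0 * c0))

def charge_voltage(vdd, t_p, r0, c0):
    """C0 is reset to VSS (0 V), then charges toward VDD through R0 for t_p seconds."""
    return vdd * (1.0 - math.exp(-t_p / (r0 * c0)))

# Hypothetical values, for illustration only (not taken from the paper).
R0, C0, ALPHA0, N, VDD = 200e3, 1e-12, 0.5, 4, 1.0

f_out = taco_lock_frequency(R0, C0, ALPHA0, N)
t_p = N / f_out  # TP = N / FOUT

v_dchg = discharge_voltage(VDD, t_p, R0, C0)   # equals alpha0 * VDD at lock
v_chg = charge_voltage(VDD, t_p, R0, C0)       # equals (1 - alpha0) * VDD at lock

# If FOUT is too high, TP is too short and the discharge voltage stays above VREF.
v_fast = discharge_voltage(VDD, N / (1.1 * f_out), R0, C0)
```

At the lock frequency both the discharge and the charge half-cycles give zero error voltage, which is why alternating the reset polarity does not shift the lock point in this idealized model.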

62 • 2023 IEEE International Solid-State Circuits Conference 978-1-6654-9016-0/23/$31.00 ©2023 IEEE


ISSCC 2023 / February 20, 2023 / 3:15 PM

Figure 3.5.1: RC oscillator aging with a p-poly resistor, proposed aging calibration scheme, and FLL-based TCO architecture.

Figure 3.5.2: Detailed TCO architecture and two-point trimming scheme.

Figure 3.5.3: Details of key building blocks: On-chip aging calibration logic, GM-C integrator, and VDAC.

Figure 3.5.4: Measured frequency drift of a TCO using n-poly resistors when always-on and with 0.1% duty-cycle (left); frequency inaccuracy of main TCOs using n-poly and VIA resistors before and after aging (right).

Figure 3.5.5: Measured frequency inaccuracy of main TCOs using p-poly and VIA resistors before and after aging (left) and output clock performance (right).

Figure 3.5.6: Performance summary and comparison to state-of-the-art.



ISSCC 2023 PAPER CONTINUATIONS

Additional References:
[6] C. Kendrick et al., “Polysilicon Resistor Stability Under Voltage Stress for Safe-
Operating Area Characterization,” IEEE IRPS, pp. P-RT.4-1 to P-RT.4-5, Mar. 2018.
[7] Y. Ji et al., “A Second-Order Temperature-Compensated On-Chip R-RC Oscillator
Achieving 7.93ppm/°C and 3.3pJ/Hz in -40°C to 125°C Temperature Range,” ISSCC,
pp. 64-65, Feb. 2022.
[8] H. Jiang et al., “A 0.14mm2 16MHz CMOS RC Frequency Reference with a 1-Point
Trimmed Inaccuracy of ±400ppm from -45°C to 85°C,” ISSCC, pp. 436-437, Feb. 2021.
[9] A. Khashaba et al., “A 34µW 32MHz RC Oscillator with ±530ppm Inaccuracy from
-40°C to 85°C and 80ppm/V Supply Sensitivity Enabled by Pulse-Density Modulated
Resistors,” ISSCC, pp. 66-67, Feb. 2020.

Figure 3.5.7: Die micrograph.



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.6
3.6 A 12/13.56MHz Crystal Oscillator with Binary-Search-Assisted Two-Step Injection Achieving 5.0nJ Startup Energy and 45.8µs Startup Time

Haihua Li¹, Ka-Meng Lei¹, Pui-In Mak¹, Rui P. Martins¹,²

¹University of Macau, Macau, China
²Instituto Superior Tecnico/University of Lisboa, Lisbon, Portugal

Startup time (ts) and energy (ES) of crystal oscillators (XO) determine the efficiency of ultra-low-power duty-cycled IoT radios. MHz-range XOs take a few milliseconds to reach the steady state without any startup aid [1-6], limiting the latency and average power of the radios. Injecting energy into the crystal can raise its motional current (iM) for ts and ES reduction, but only if the injection source’s frequency (finj) is close to the crystal’s series resonant frequency (fs). The two-step injection technique, which calibrates the injection source using the XO’s signal after the 1st injection, can safeguard finj for the 2nd injection [2,3]. Unfortunately, the phase-locked loop takes >1,000 cycles to calibrate in a closed-loop fashion, opposing the reduction of ts and ES.

Figure 3.6.1 depicts the proposed 12/13.56MHz fast-startup XO incorporating a binary-search-assisted two-step injection. To obtain fast frequency locking within ±500ppm after the 1st injection, we apply a binary-search algorithm to tune the frequency of the auxiliary digitally controlled oscillator (DCO). As such, the frequency locking only takes 48 cycles, which is >6.8× fewer than [2,3]. Together with the stepwise injection scheme, ts shortens to 45.8μs and ES is only 5.0nJ. The XO dissipates 28.4μW at 0.7V in the steady state and has an FoM of 240.6dBc/Hz.

The proposed fast-startup XO (Fig. 3.6.2) primarily consists of an XO core, an auxiliary 5-stage DCO, an edge aligner, a frequency comparator, a control logic for binary-search calibration, and a stepwise signal generator (SSG). The XO startup consists of 4 sequences: (i) 1st injection; (ii) calibration; (iii) 2nd injection, and (iv) steady state. In (i), the DCO injects a signal (VINJ), with finj that is within 10,000ppm of fs, to initiate a small oscillation across the crystal (VXO≈60mVpp). The inaccuracy of finj without calibration amid VT-variations restricts the discrepancy between finj and fs (Δf). Hence, we set the duration of 1st injection (T1) as 4μs; it induces a maximum iM variation of 40% (if Δf=10,000ppm). Nonetheless, the buffer can still capture and quantize this signal to output VBUF.

In (ii), the system calibrates the DCO frequency (fDCO) with VBUF for the 2nd injection. Without edge alignment, it requires a large 12-bit time-to-digital converter (TDC) with <37ps resolution to cover the entire period and produce a final Δf <500ppm. Therefore, we propose a DCO resetter and an edge aligner to align the edges of DCO and XO before frequency comparison. The resetter refreshes the DCO every 3 rising edges once entering into the frequency-comparison mode to synchronize the DCO output (VDCO) with VBUF. Further, the edge aligner compares and aligns VDCO and VBUF using a bang-bang phase detector (BBPD) and a coarse TDC (resolution: 0.2ns) to account for the delay due to the DCO startup time, logic gates, and interconnects (Fig. 3.6.3, top). Digitally controlled delay lines (DCDLs) read this result via the registers, and output the aligned VDCO,D and VBUF,D with a PVT-robust phase difference <1ns. Thus, it can relax the fine TDC of the frequency comparator to 7 bits with the same resolution.

The frequency comparator, consisting of a BBPD, a fine TDC (resolution: 23ps), and a digital comparator, then starts the binary-search algorithm. In the 1st bit, the control logic sets the 8-bit control word Dc to mid-scale. The BBPD and fine TDC then compare and quantify the period discrepancies of VDCO,D and VBUF,D by capturing the differences in their 2nd and 3rd rising edges. Thus, each bit of comparison lasts for 3 cycles. Depending on their polarities, the control logic increases or decreases Dc to alter fDCO. In the next bit, the frequency comparator compares VBUF,D to the updated VDCO,D, and the process iterates until it determines the whole Dc. Sequence (ii) totals 48 cycles, or 3.5μs (TCAL) at a 13.56MHz clock. Theoretically, a 6-bit Dc with tuning range ±1% would be sufficient to yield a final Δf<500ppm. However, the jitters from the DCO, XO, TDC, etc. (total jitter=25.1ps) could induce incorrect comparison and distort the Δf (Fig. 3.6.3, bottom). Hence, we chose the 8-bit Dc to improve the average odds of final Δf >500ppm to 4.9%.

In sequence (iii), the calibrated DCO again injects VINJ into the crystal to further iM. Besides Δf, calibrated to <±500ppm in (ii), the phase of VINJ relative to iM in this step is also influential; an out-of-phase VINJ counteracts iM and prolongs ts. As iM leads VBUF by 90°, the system uses VBUF to reset the DCO, and obtains VINJ at the output of U4 in the 5-stage DCO to acquire a signal 72° ahead of VBUF (Fig. 3.6.4, top left). As such, the phase difference between iM and VINJ is 18°, and its impact on the growth of iM is negligible (overhead <0.1μs). The permissible injection period with a 10% error on iM in (iii) is 37μs.

The DCO consists of 5 stages, with four inverters and one NAND gate to reset or disable the DCO. For the conventional DCO, its first few periods after reset differ from the steady state due to the complete (dis-)charge of the internal nodes (Fig. 3.6.4, bottom left). Herein, we introduce a replica NAND gate (U6) and coupling capacitor CC to aid in its stabilization. When the DCO leaves the reset state, it induces a transition on Va, which couples to Vb through CC to accelerate its transition. After this transition, U6 remains stationary and does not affect the DCO. From simulation, fDCO settles within 500ppm of the steady state in the 2nd cycle, and the result is robust over ±15% variation of CC. The RC-combination acts as the delay cell and defines fDCO (Fig. 3.6.4, top-right). The resistor R (19kΩ) is a composite resistor to suppress the intrinsic temperature coefficient (simulated: <±1% from −40°C to 85°C), whereas the capacitive part includes C0 and Carray. The one-off programmable C0 sets the nominal fDCO. The Carray controlled by DC is an 8-bit binary-weighted array for binary-search calibration. With an 8-bit DCO and tuning range of ±1%, it requires a unit capacitance of Carray of 0.6fF. Herein, we materialize the unit capacitor as a sandwich capacitor (Fig. 3.6.4, bottom right). The top plate of the capacitor connected to R encloses the bottom plate, connected to the switch controlled by Di. This configuration minimizes the adjacent parasitic capacitance in Carray, thereby increasing the tuning range. In sequence (iv), the system turns off the DCO to save power and prevent interference. Overall, the simulated DCO power is 54μW, with a jitter of 20.5ps.

During injection, the system disconnects the capacitive load (CL) required in the steady state from the XO to avert excess energy loss. Also, we adopt the 4-step stepwise injection technique to further reduce the dynamic power due to the parasitic capacitances from the I/O pad and PCB traces [4]. The SSG module converts VINJ into 6-phase stepwise control sequences (Ø1-6). From simulation, it saves 2.6× energy during excitation.

The fast-startup XO prototyped in 65nm CMOS occupies 0.134mm² (Fig. 3.6.7). We employ three 13.56MHz crystals (size: 3.2×2.5mm², LM: 61.3mH, RM: 44Ω, CM: 2.2fF, CS: 1.3pF) for testing. The ts and ES without startup aid are 8ms (swing >90% of steady state) and 225nJ, respectively (VDD=0.7V). Then, we apply the two-step injection sequence to expedite the startup. After calibrations, most of the finj are bound within ±500ppm of fs in 49 runs (Fig. 3.6.5, top). Two runs show |Δf|>500ppm that is consistent with our prediction. However, fXO still settles within 20ppm of the steady state after excitation. The ts and ES are 45.8μs and 5.0nJ, respectively, corresponding to 175× and 45× reductions respectively (Fig. 3.6.5, bottom). Estimated by simulation, 35.7% of ES transduces to the energy of the crystal (EM = LM·iM²/2). The ts varies <±2.3% and <±2.2% (worst case) over VDD (0.67 to 0.73V) and temperature (−40°C to 85°C) variations, respectively. It consumes 28.4μW in steady state, with a phase noise of −142.5dBc/Hz at 1kHz offset (FoM: 240.6dBc/Hz), indicating that our XO upholds its performance even with a short ts. The frequency deviation is ≤8 and ≤13ppm, respectively, over such VDD and temperature ranges. We also obtained consistent performance from the 12MHz XO (ts: 43.6μs, ES: 4.2nJ, both at 27°C and 0.7V).

Figure 3.6.6 benchmarks this XO with the state-of-the-art. Compared with [2,3], this work achieves >7.0× lower ES and >1.6× fewer startup cycles, thanks to the faster frequency-locking process (>6.8× fewer cycles). Although the ES is 51% higher than [4], we achieve a 2× steady-state amplitude, which relaxes the design of the ensuing buffer, and a 2.5× better EM/ES that indicates our scheme is more energy-efficient.

Acknowledgement:
The work is funded by The Macau Science and Technology Development Fund - SKL-AMSV(UM)-2020-2022, 2023-2025, and the University of Macau (File no.: MYRG2022-00034-IME). Corresponding author: Ka-Meng Lei.

References:
[1] D. Griffith et al., “A 24MHz Crystal Oscillator with Robust Fast Start-Up Using Dithered Injection,” ISSCC, pp. 104-105, Feb. 2016.
[2] K. M. Megawer et al., “A Fast Startup CMOS Crystal Oscillator Using Two-Step Injection,” IEEE JSSC, vol. 54, no. 12, pp. 3257-3268, Sept. 2019.
[3] J. Jung et al., “A Single-Crystal-Oscillator-Based Clock-Management IC with 18× Start-Up Time Reduction and 0.68ppm/°C Duty-Cycled Machine-Learning-Based RCO Calibration,” ISSCC, pp. 58-59, Feb. 2022.
[4] J. B. Lechevallier et al., “Energy Efficient Startup of Crystal Oscillators Using Stepwise Charging,” IEEE JSSC, vol. 56, no. 8, pp. 2427-2437, Mar. 2021.
[5] K.-M. Lei et al., “A Regulation-Free Sub-0.5 V 16/24MHz Crystal Oscillator for Energy-Harvesting BLE Radios with 14.2nJ Startup Energy and 31.8μW Steady-State Power,” ISSCC, pp. 52-53, Feb. 2018.
[6] B. Verhoef et al., “32MHz Crystal Oscillator with Fast Start-up Using Synchronized Signal Injection,” ISSCC, pp. 304-305, Feb. 2019.
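
The binary-search calibration of the DCO control word in sequence (ii) can be illustrated with a short, idealized sketch. This is not the authors' circuit: the linear DCO model `dco_freq`, the assumption that a larger code gives a higher frequency, and the jitter-free comparator are all hypothetical simplifications introduced here for illustration.

```python
def dco_freq(code, f_nom=13.56e6, tuning=0.01, bits=8):
    """Hypothetical linear DCO model: code 0 gives f_nom*(1 - tuning),
    full scale gives f_nom*(1 + tuning)."""
    full_scale = (1 << bits) - 1
    return f_nom * (1.0 - tuning + 2.0 * tuning * code / full_scale)

def binary_search_calibrate(f_xo, bits=8):
    """Resolve the control word MSB first, one comparison per bit.

    Uses an ideal (jitter-free) frequency comparison and returns the largest
    code whose DCO frequency does not exceed f_xo.
    """
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        if dco_freq(trial, bits=bits) <= f_xo:
            code = trial  # DCO still not faster than the XO: keep this bit
    return code

def residual_ppm(f_xo, code, bits=8):
    """Frequency error of the calibrated DCO relative to the XO, in ppm."""
    return (dco_freq(code, bits=bits) - f_xo) / f_xo * 1e6

# Example: an XO sitting 0.4% above the nominal DCO frequency.
f_xo = 13.56e6 * 1.004
code8 = binary_search_calibrate(f_xo, bits=8)
err8 = residual_ppm(f_xo, code8, bits=8)
code6 = binary_search_calibrate(f_xo, bits=6)
err6 = residual_ppm(f_xo, code6, bits=6)
```

In this noiseless model both the 6-bit and the 8-bit search end within 500ppm, consistent with the remark that 6 bits would suffice in theory. The sketch does not model the jitter-induced comparison errors that motivate the 8-bit choice in the text.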



ISSCC 2023 / February 20, 2023 / 3:45 PM

Figure 3.6.1: The proposed fast-startup XO innovates a binary-search-assisted two-step injection. After the 1st injection raises iM in (i), the system calibrates the DCO using the XO’s signal in (ii), and injects the signal with tuned and accurate fINJ again in (iii) to expedite the process to steady state (iv).

Figure 3.6.2: Block diagram of the proposed fast-startup XO and its timing waveform. A state machine counting the clock pulses from the DCO without external control (except ENXO, the enabling of XO) regulates the entire operation, rendering the XO autonomous.

Figure 3.6.3: Detailed block diagram of the edge aligner and frequency comparator with their operation sequences (top). Jitters from each block, simulated probability distribution of 6-/8-bit DC with jitter (100k runs), and the average probability of final Δf>500ppm for different number of bits (bottom).

Figure 3.6.4: Schematic of the DCO with the delay cell (top) and the layout of the sandwich capacitor (bottom right). The replica NAND gate U6 and CC facilitate the settling of the DCO frequency after each reset (bottom left).

Figure 3.6.5: Measured transient frequency on VPO (top left), the associated error between fDCO during 2nd injection and fXO of 49 runs (top right), the startup current profile (bottom left), and ts over the temperature (bottom right).

Figure 3.6.6: Performance summary and comparison to the prior art.
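
Several of the numbers quoted for paper 3.6 can be cross-checked with simple arithmetic. The sketch below is only a consistency check under stated assumptions: it reads the <37ps TDC resolution as 500ppm of one 13.56MHz period, and it reads the 72° tap lead as the 360°/5 tap spacing of a 5-stage ring. Neither reading is stated explicitly in the text, and the variable names are hypothetical.

```python
F_XO = 13.56e6                      # crystal frequency in Hz

period_s = 1.0 / F_XO               # one XO period, about 73.7 ns
period_error_s = 500e-6 * period_s  # a 500 ppm period error, in seconds

t_cal_s = 48 / F_XO                 # 48 calibration cycles at 13.56 MHz

tap_lead_deg = 360.0 / 5            # assumed tap spacing of a 5-stage ring
phase_gap_deg = 90.0 - tap_lead_deg # iM leads VBUF by 90 degrees

ts_reduction = 8e-3 / 45.8e-6       # startup time: 8 ms without aid vs 45.8 us
es_reduction = 225e-9 / 5.0e-9      # startup energy: 225 nJ vs 5.0 nJ

print(f"500 ppm of one period: {period_error_s * 1e12:.1f} ps")
print(f"calibration time:      {t_cal_s * 1e6:.2f} us")
print(f"iM-to-VINJ phase gap:  {phase_gap_deg:.0f} deg")
print(f"ts reduction:          {ts_reduction:.0f}x")
print(f"ES reduction:          {es_reduction:.0f}x")
```

The results agree with the quoted <37ps resolution, 3.5μs TCAL, 18° phase difference, and the 175× and 45× reductions.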



ISSCC 2023 PAPER CONTINUATIONS

Figure 3.6.7: Die micrograph (left) and simulated breakdown of ES of the 13.56MHz
XO (right), with 35.7% of energy delivered to the crystal core, illustrating the energy
efficiency of the proposed fast-startup XO.



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.7
3.7 A 16MHz XO with 17.5μs Startup Time Under 10⁴ppm-ΔF Injection Using Automatic Phase-Error Correction Technique

Zhikuang Cai¹, Xin Wang¹, Zixuan Wang¹, Yunjin Yin¹, Wenjing Zhang¹, Tailong Xu², Yufeng Guo¹

¹Nanjing University of Posts and Telecommunications, Nanjing, China
²Hefei University, Hefei, China

The start-up time (TS) of MHz crystal oscillators (XOs) has significantly influenced the power consumption of duty-cycled Internet-of-Things (IoT) systems. The injection techniques [1-5] have gained popularity for effectively reducing the start-up time and start-up energy (ES) of XOs. Nonetheless, high-efficiency injection is guaranteed only when the frequency mismatch (ΔF) between the injection frequency (FINJ) and the XO frequency (FXO) is less than 5000ppm for conventional injection. For best results, ΔF should be within 2500ppm [1]. As shown in Fig. 3.7.1, for each ΔF, there exists a corresponding maximum motional current (iM), e.g., 65μA for 5000ppm. This is the limitation to the energy injection and TS reduction, especially for large ΔF. Since small ΔF is difficult to achieve across PVT with on-chip oscillators, it is necessary to develop XO circuits with high tolerance for large ΔF. In [1], dithering injection can tolerate ΔF of 2×10⁴ppm but is implemented inefficiently with TS >10⁴ cycles. Synchronized signal injection [2] realigns the injection signal with the crystal resonance every fixed time, but every single chip needs to calibrate the cycles of each burst for different ΔF. 2-step injection [3] employs a phase-locked loop (PLL) to match FINJ with FXO, but it has to inject with ΔF ≤5000ppm first. In addition, both [2] and [3] have to suspend the injection during the start-up, which adds to TS overhead. Impedance-guided chirp injection [4] calibrates FINJ when chirping inefficiently, which restricts TS reduction. Precisely timed injection [5] reduces TS by terminating injection precisely at TINJ,OPT which is significantly influenced by ΔF. The inevitable ΔF results in the phase error Δφ (Δφ=Δt∙2π∙FINJ, where Δt is the time difference between the voltage peak of crystal resonance and the falling edge of the injection signal). Δφ will accumulate to a point where injection starts to counteract the crystal resonance without any correction. This work presents an automatic phase-error correction (APEC) technique that can correct Δφ automatically, achieving TS of about 18μs with ΔF-tolerance up to 10⁴ppm.

As shown in the top of Fig. 3.7.1, the envelope of iM increases rapidly and linearly when Δφ≤π/4 while the slope of the iM envelope observably decreases when Δφ≥π/4. The crystal has two ports, XOIN for injection and XOOUT for detection in this work. XOOUT is a superposition of the crystal resonance (sine wave) and the coupling of XOIN through the crystal (square wave). Δφ accumulation results in the peak position shift with respect to the falling edge of the injecting signal (Fig. 3.7.1). Phase[7:0] are provided by a ring oscillator (RO) as injecting signals, where the phase difference between the adjacent signals is π/4. APEC acquires the information about Δφ by detecting the crystal resonant peak positions in real time. Once Δφ=π/4, the injecting signal will be switched to the adjacent signal, correcting Δφ automatically and thus ensuring the linear growth of iM.

The block diagram of APEC is shown in Fig. 3.7.2. The RO oscillates around 1.024GHz, a frequency 64× higher than FXO (16MHz). It acts as the reference clock of the peak detector (PKD). After passing through a divide-by-8 prescaler and a Johnson Counter, it generates Phase[7:0] (~16MHz) which are sent to a multiplexer (MUX). Phase[i] is enabled for injection by the corresponding signal EN[i] from the digital controller (DC). The PKD detects the peak positions of XOOUT in real time so that the peak positions can be represented with pulses (PKDOUT). BUFFOUT is generated by PKDOUT which passes through a customized buffer to broaden pulse width, ensuring reliable data capture. A dynamic comparator (DCMP) with a high refresh rate is used in the PKD to ensure the accurate tracking of peak positions in this work. The inputs of DCMP are XOOUT of the current cycle (A) and the peak voltage value of the last cycle (B). During the several initial cycles, there is no need to switch the injection signal, which provides sufficient time for B to track A. Because the peak voltage value with the input offset is stored on the capacitor CPKD, the DCMP has good tolerance for the input offset.

The detailed view of the APEC timing sequence is shown in Fig. 3.7.3. For FINJ <FXO, the peak of the crystal resonance is that of XOOUT. FINJ accumulates phase slower than FXO, and the peak position gradually moves from the falling edge of XOIN (Phase[i]) towards the falling edge of PDIN (Phase[i-1]). BUFFOUT is 1 at the falling edge of PDIN only for Δφ ≥π/4. Hence, the DC samples BUFFOUT at the falling edge of PDIN. Once BUFFOUT is 1, the DC will turn EN[i] from 1 to 0 and EN[i-1] from 0 to 1, selecting Phase[i-1] for injection instead of Phase[i] and correcting Δφ. On the other hand, for FINJ >FXO, FINJ accumulates phase faster than FXO, and the peak position of the crystal resonance gradually moves from the falling edge of XOIN (Phase[i]) towards the falling edge of PKDEN (Phase[i+1]). Since the peak of the crystal resonance isn’t that of XOOUT and is difficult to detect in this case, another criterion is taken based on the fact that the peak of XOOUT stops increasing once Δφ accumulates to π/4 (Fig. 3.7.1), which means no pulse is generated and PKDOUT stays low. The DC samples BUFFOUT at the falling edge of PKDEN. Once BUFFOUT is 0, the DC will turn EN[i] from 1 to 0 and EN[i+1] from 0 to 1, selecting Phase[i+1] for injection instead of Phase[i] and correcting Δφ.

In this work, conventional double-ended injection will be chosen for its higher efficiency when ΔF ≤2500ppm, while the proposed APEC technique will be used when ΔF>2500ppm due to continuable effective injection. Ti is defined as the injection duration between the ith switch and the (i+1)th switch. T0 is larger than Ti (i ≥1) due to the different initial condition and the RO start-up duration, while Ti (i ≥1) keeps constant. Hence, the PKD is turned off for energy saving after the 2nd switch as T1 has been stored in the integrated memory unit and will be assigned to Ti (i≥2). For further TS reduction, the conventional double-ended injection is used after the last switch with Δφ small enough in a short duration. After sufficient growth of iM, a Gm amplifier is enabled to maintain the steady-state oscillation.

The proposed XO was implemented in 40nm CMOS technology. The core area is about 0.05mm² (Fig. 3.7.7). The start-up behavior tested under 1V supply is shown in Fig. 3.7.4. This work achieves ΔF-tolerance up to 10⁴ppm and TS of 17.5μs (280 cycles). Compared with the conventional injection method, this work achieves 249× TS reduction when ΔF=10⁴ppm. As shown in Fig. 3.7.5, TS varies by 4.5% across -20°C to 85°C and by only 1.27% over a ΔF range of 3000ppm to 10⁴ppm, which corresponds with the theoretical analysis (Fig. 3.7.1) and exhibits the robustness of the APEC technique. The performance is summarized and compared to prior arts in Fig. 3.7.6. The APEC technique succeeds in injecting energy and correcting Δφ simultaneously, breaking the limitation to the energy injection and TS reduction for large ΔF. The tightness of implementing on-chip oscillators and trimming FINJ is loosened by injecting efficiently at large ΔF, thus achieving a remarkably enhanced yield.

Acknowledgement:
This work was supported by the National Key Research and Development Program of China under Grant 2018YFB2202005.

References:
[1] D. Griffith et al., “A 24MHz Crystal Oscillator with Robust Fast Start-Up Using Dithered Injection,” ISSCC, pp. 104-105, Feb. 2016.
[2] B. Verhoef et al., “A 32MHz Crystal Oscillator with Fast Start-up Using Synchronized Signal Injection,” ISSCC, pp. 304-305, Feb. 2019.
[3] K. M. Megawer et al., “A 54MHz Crystal Oscillator with 30x Start-Up Time Reduction Using 2-Step Injection in 65nm CMOS,” ISSCC, pp. 302-303, Feb. 2019.
[4] H. Luo et al., “A Fast Startup Crystal Oscillator Using Impedance Guided Chirp Injection in 22 nm FinFET CMOS,” IEEE JSSC, vol. 57, no. 3, pp. 688-697, Mar. 2022.
[5] H. Esmaeelzadeh et al., “A Quick Startup Technique for High-Q Oscillators Using Precisely Timed Energy Injection,” IEEE JSSC, vol. 53, no. 3, pp. 692-702, Mar. 2018.
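
The phase bookkeeping behind APEC can be sketched as a toy model: the phase error grows by a fixed slip every injection cycle and, once it reaches the spacing between adjacent phases, the injection moves to the neighbouring phase. This is a hypothetical illustration, not the authors' implementation. The per-cycle slip of ΔF turns, the function name `apec_track`, and the use of exact fractions are assumptions made here, and the peak detector, T0 overhead, and final conventional injection are not modelled.

```python
from fractions import Fraction

def apec_track(delta_f_ppm, n_cycles, n_phases=8):
    """Idealized APEC bookkeeping, with phase measured in turns (1 turn = 2*pi).

    delta_f_ppm > 0 models FINJ > FXO (switch to Phase[i+1]);
    delta_f_ppm < 0 models FINJ < FXO (switch to Phase[i-1]).
    Returns (final phase index, number of switches, worst residual error in turns).
    """
    step = Fraction(1, n_phases)              # pi/4 between adjacent phases for 8 phases
    slip = Fraction(delta_f_ppm) / 1_000_000  # assumed phase slip per injection cycle
    error = Fraction(0)
    index = 0
    switches = 0
    worst = Fraction(0)
    for _ in range(n_cycles):
        error += slip
        if abs(error) >= step:
            direction = 1 if error > 0 else -1
            index = (index + direction) % n_phases  # select the adjacent phase
            error -= direction * step               # the switch removes one step of error
            switches += 1
        worst = max(worst, abs(error))
    return index, switches, worst

# 10^4 ppm mismatch over the 280 cycles quoted for the 17.5 us startup.
idx_fast, n_fast, worst_fast = apec_track(10_000, 280)
idx_slow, n_slow, worst_slow = apec_track(-10_000, 280)
```

In this model a 10⁴ppm mismatch reaches π/4 roughly every 12.5 cycles, so the residual phase error never reaches π/4 after a correction, which is the condition the text associates with linear growth of iM.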



ISSCC 2023 / February 20, 2023 / 4:15 PM

Figure 3.7.1: Motional current envelopes of the 16MHz crystal for different ΔF (top left). Simplified block diagram of the proposed XO (top right). Transient waveform of XOOUT for different Δφ using APEC technique (bottom).

Figure 3.7.2: Block diagram of the proposed XO and details of Johnson Counter and Peak detector.
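
The multi-phase generation named in Fig. 3.7.2 can be illustrated with a generic Johnson (twisted-ring) counter. The sketch below assumes a 4-stage counter, which is an inference rather than a detail given in the text: the text only states a 1.024GHz RO, a divide-by-8 prescaler, and eight ~16MHz phases spaced by π/4. The function name is hypothetical.

```python
def johnson_counter_states(n_stages=4):
    """Return the 2*n_stages states of an n-stage Johnson (twisted-ring) counter."""
    state = [0] * n_stages
    states = []
    for _ in range(2 * n_stages):
        states.append(tuple(state))
        # Shift by one stage and feed the inverted last bit back to the first stage.
        state = [1 - state[-1]] + state[:-1]
    return states

RO_HZ = 1.024e9   # ring-oscillator frequency quoted in the text
PRESCALER = 8     # divide-by-8 prescaler quoted in the text

states = johnson_counter_states(4)                 # assumed 4-stage counter
phase_clock_hz = RO_HZ / PRESCALER / len(states)   # frequency of each phase output
phase_step_deg = 360 / len(states)                 # spacing between adjacent phases

for s in states:
    print("".join(str(bit) for bit in s))
```

With these assumptions the counter cycles through eight distinct states, giving a 16MHz output with 45° (π/4) between adjacent phases, which matches the overall 64× ratio between the RO and FXO.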

Figure 3.7.3: Timing sequence. FINJ< FXO (top). FINJ> FXO (bottom).

Figure 3.7.4: Measured transient and frequency behavior at ΔF=10⁴ppm (CL=6pF). Using APEC technique (top). Using conventional injection (bottom).

Figure 3.7.5: Startup measurement results of the conventional injection and this work. Start-up time over temperature at ΔF=10⁴ppm (top). Start-up time sensitivity to ΔF (bottom).

Figure 3.7.6: Performance summary and comparison with prior arts [1-5].



ISSCC 2023 PAPER CONTINUATIONS

Figure 3.7.7: Environment setup and Die micrograph (top). Energy consumption
(bottom).



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.8
3.8 A 0.954nW 32kHz Crystal Oscillator in 22nm CMOS with provide a bias current for Gm-T and all other gm cells. Since these gm cells operate
Gm-C-Based Current Injection Control in the deep sub-threshold region, their transconductance efficiencies are well fixed at
gm/ID = q/nkBT, and the PTAT current source generating IBIAS = ln(M)·fOSCCR·nkBT/q = I0·T
compensates for gm’s dependency on temperature.
Yihan Zhang1, You You1, Wenjie Ren1, Xinhang Xu1, Linxiao Shen1,
Jiayoon Ru1, Ru Huang1, Le Ye1,2 For oscillation amplitude sensing, we use Gm-DM, an OTA-based gm cell with an engineered
large-signal I-V transfer curve, connected with a large load capacitor to form a CI. The
1
Peking University, Beijing, China construction of this CI is shown in Fig. 3.8.3. First, inside Gm-DM, an asymmetric I-V
2
Advanced Institute of Information Technology of Peking University, Hangzhou, China transfer curve is created by multiplying only one side of the OTA’s current from the
differential input stage by a factor of k = 1.5. Second, outside Gm-DM, an offset voltage
Ultra-low-power (ULP) crystal oscillators (XOs) [1-5] are essential in wirelessly linked VOS is introduced to the differential input, and capacitors CP and CN are connected to the
IoT nodes: for time-keeping purposes, they need to consume low power since they stay differential output for current integration. The following thought experiment shows how
always-on and remain accurate to reduce synchronization guard time [6]. In recent years, this CI now becomes sensitive to the input differential signal’s amplitude: when the
pulse-injection-based XOs (PIXOs) have gained attention for their nW-level low power amplitude is zero, the integral of the output current IOUT = IOP - ION > 0 within each period
with good frequency stability. Typically implemented with a classic Pierce oscillator for due to a positive VOS; when the amplitude is infinite, the periodic integral of current now
startup, these designs aim to take over the established parallel oscillation and replenish becomes IOUT = (2NIB – 3NIB)·T/2 = –NIBT/2 < 0. This gives rise to a right amplitude that
only the energy loss in the crystal by injecting short current pulses at the oscillating makes the ∫T IOUT dt = 0 for each VOS. SPICE simulation using periodic-steady-state (PSS)
nodes’ voltage peaks and valleys. They promise lower power consumption since they analysis indicates a roughly linear relationship between VOS and VAMP with a slope of 4.02.
don’t rely on a small-signal negative resistance that requires a high static bias current, Hence, we set the voltage that determines the oscillation amplitude, VAMPSET, to be 4VOS.
seen as the crossbar current in the inverter. Yet to design these PIXOs, the designers A Gm-CM is constructed for CM-sensing using a traditional OTA with duplicated current
must face two challenges, as shown in Fig. 3.8.1. The first is the timing control for the outputs summed into the same CP and CN to compensate for the CM drift. The regulation
injected current pulses. Setting the rising zero-crossing point of the oscillating waveform loop stability is ensured conservatively by sizing CP and CN large enough to serve as the
as the 0° phase reference (or 0 in time), this scheme requires a positive current push at dominant pole.
90° (or T/4) and a negative current pull at 270° (or 3T/4). Misalignments in timing create an orthogonal current component disrupting the phase, translating noise in the circuit to jitter while reducing the energy efficiency to maintain the same oscillation amplitude. The second challenge is oscillation amplitude control. For ultra-low-power operation, the oscillation amplitude needs to be low but stable since a linear increase in the oscillation amplitude leads to quadratic power consumption to compensate for the loss in the crystal.

In this work, we show that both of the above challenges can be elegantly addressed by gm-C cells: a Gm-T in the clock slicer can generate accurate delay for T/4 injection timing based on its reliable 90° phase shift from its small-signal single-pole nature; a Gm-DM and a Gm-CM are used as current integrators (CIs) with engineered large-signal I-V transfer curve to achieve amplitude and common mode (CM) detection and regulation. These three gm-C-like cells form the core of this ULP XO to enable accurate timing of intensity-regulated pulse injection. With most of the circuit designs using current-biased analog modules, the power consumption of this design scales well with temperature, leading to the average power consumption of 1.90nW at 80°C averaged over 5 tested chips. This PIXO also shows a 6ppb Allan deviation floor under room temperature, indicating excellent long-term frequency stability.

Fabricated in 22nm CMOS technology, the IC occupies 0.029mm2 (Fig. 3.8.7). Figure 3.8.4 shows the measurement results of the gm-C-based PIXOs driving ECS-2X6X crystals, including power and frequency deviation under varying temperatures between -20°C and 80°C and changing supply voltages between 0.4V and 0.8V. Averaged over 5 tested chips, the design consumes a total power of 0.954nW (worst case: 0.994nW) under a single power supply of 0.46V at 25°C regulated by a temperature chamber. Measured fOSC drift across temperature is 139ppm average (worst case: 145ppm), and this number is dominated by the intrinsic temperature dependency of the crystal: the measured results closely follow the parabolic curve with TM = 20°C (datasheet specifies TM = 25±5°C). With current-biased analog modules making up most of the design, the power consumption scales well with temperature, resulting in an average of 1.90nW at 80°C (worst case: 2.19nW). Measured line sensitivity is 22ppm/V (worst case: 30.5ppm/V) between 0.4V and 0.8V. The top part of Fig. 3.8.5 shows power and frequency deviation under different VAMPSET, confirming the effectiveness of the CI-based regulators. Frequency stability is estimated under 0.46V supply at 25°C. The measured Allan deviation floor is 6ppb.
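
The two design challenges named for PIXOs, injection timing and amplitude control, can be illustrated with a first-order model. The sketch assumes an ideal narrow current pulse injected into a sinusoidal node voltage, so the energy delivered scales with the node voltage at the injection instant. This model and the function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def in_phase_fraction(timing_error_deg):
    """Fraction of an ideal narrow pulse that replenishes oscillation energy
    when it lands timing_error_deg away from the voltage peak (90 deg or 270 deg)."""
    return math.cos(math.radians(timing_error_deg))

def orthogonal_fraction(timing_error_deg):
    """Fraction of the pulse acting in quadrature, which disturbs the phase
    instead of sustaining the amplitude."""
    return math.sin(math.radians(timing_error_deg))

def relative_crystal_loss(amplitude_ratio):
    """Crystal loss relative to nominal when the oscillation amplitude is scaled
    by amplitude_ratio (loss grows with the square of the amplitude)."""
    return amplitude_ratio ** 2

for err_deg in (0, 10, 30, 60):
    print(f"{err_deg:2d} deg error: in-phase {in_phase_fraction(err_deg):.3f}, "
          f"orthogonal {orthogonal_fraction(err_deg):.3f}")

print("loss at 1.5x amplitude:", relative_crystal_loss(1.5))
```

Under this model a small timing error costs little energy efficiency but injects a proportionally growing quadrature component, while any amplitude overshoot is paid for quadratically in power, which is why the text treats timing and amplitude regulation as separate requirements.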
Figure 3.8.6 compares this work with previously published ULP XOs. Oscillation
Figure 3.8.1 shows the top-level schematic of the gm-C-based PIXO. The startup circuit amplitude regulation is difficult among prior arts: [2] and [3] rely on a dedicated
is a traditional inverter-based Pierce oscillator generating the initial oscillation necessary secondary supply for the injectors only, complicating the system level design of the
for the PTAT current source. This current is copied and distributed to all gm cells. Once budget-sensitive IoT devices; [1] uses an open-loop approach by simply duty-cycling
the startup circuit gets disconnected, the clock slicer takes the established 180° out-of- current injections for amplitude control, which is transparent to PVT variations. Among
phase oscillation waveform to generate a pair of differential clock signals with a the state-of-the-art ULP XOs in the table, this PIXO is the only one to realize closed-loop
phase-shift slightly less than 90°. Both clocks are fed into a pulse generator, which amplitude regulation. In addition, with the three current-biased gm-C-like cells operating
further directs current injections at 90° and 270°. This differential clock signal is also as the core of the XO, our proposed design enables accurate timing to achieve a high
the output 32kHz clock of the PIXO. On the other side of the circuit, an amplitude frequency stability and state-of-the-art power numbers under a single supply, with the
regulator monitors the oscillation waveform, detects its amplitude and CM value, and power consumption being the least sensitive to temperature variations compared with
controls the injection intensity. Since the same regulated current (through MP and MN) prior arts. Its already sub-nW-level power is the lowest among PIXOs injecting at 32kHz
is injected to either side of the crystal, waveforms on the two terminals become and can be further reduced with sub-harmonic injection techniques used in [1] and [2].
symmetric after the startup transient fades away. All the circuits in this design attempt
to present equal loads to the two terminals of the crystal to preserve this fine symmetry. Acknowledgement:
This work was supported by National Natural Science Foundation of China (No.
Figure 3.8.2 shows the schematic of the clock slicer, the pulse generator, the PTAT 92164301, No. 62104008, and No. 62225401), and the 111 project (No. B18001). The
current source, and the CM bias VCMSET generator. For 90° injection timing, this work corresponding author is Le Ye (yele@pku.edu.cn).
explores single-stage gm-C cells’ small-signal single-pole nature: as long as the intrinsic
gain, gm/gds, of the transistor is high enough, the gm-C cell exhibits a reliable 90° phase References:
shift at the unit-gain frequency (UGF) set by ωUG≈gm/CL. By choosing gm through setting [1] K. -M. Kim et al., “A Sub-nW Single-Supply 32-kHz Sub-Harmonic Pulse Injection
the bias current, this UGF can be pushed beyond ωOSC (fOSC≈32kHz) to amplify the Crystal Oscillator,” IEEE JSSC, vol. 56, no. 6, pp. 1849-1858, 2021.
oscillation signal in the gm-C stage. This amplified swing allows the next stage inverter [2] L. Xu et al., “A 0.51nW 32kHz Crystal Oscillator Achieving 2ppb Allan Deviation Floor
to flip faster and reduces its crossbar power. To generate a timing margin for the pulse Using High-Energy-to-Noise-Ratio Pulse Injection,” ISSCC, pp. 62-63, Feb. 2020.
generator, one can push the dominant pole at ωP≈gds/CL closer to ωOSC, but such an [3] Y. Zeng et al., “A 1.7nW PLL-Assisted Current Injected 32KHz Crystal Oscillator for
approach makes the design increasingly prone to intrinsic gain variation across PVT. IoT,” IEEE Symp. VLSI Circuits, pp. C68-C69, 2017.
Instead, we choose to add two cross-coupled feed-forward capacitors, CF < CL, to create [4] K.-J. Hsiao, “A 1.89nW/0.15V Self-Charged XO for Real-Time Clock Generation,”
a left-half-plane zero at ωZ≈gm/CF. This zero reduces phase-shift at ωOSC but also slightly ISSCC, pp. 298-299, 2014.
down-shifts the UGF. The capacitor CF is drawn directly in the layout as a single layer [5] H. Esmaeelzadeh and S. Pamarti, “A 0.55nW/0.5V 32kHz Crystal Oscillator Based on
MoM with an extracted value of 1.26fF, creating an 83° phase shift at the output of a DC-Only Sustaining Amplifier for IoT,” ISSCC, pp. 300-301, 2019.
Gm-T, or around 0.6μs timing margin for the pulse generator. The oscillation signals are [6] D. Griffith, “Synchronization Clocks for Ultra-Low Power Wireless Networks,” in Ultra-
ac-coupled to the input of Gm-T, with pseudo-resistor RP pushing the highpass corner Low-Power Short-Range Radios. Integrated Circuits and Systems, P. Mercier and A. P.
out of the frequency range of interest. This design uses a PTAT current source [5] to Chandrakasan, Eds., Switzerland, Springer International, pp. 209-231, 2015.
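
The clock-slicer timing figures quoted for Gm-T (an 83° phase shift and around 0.6μs of margin)
can be cross-checked with a short calculation. The sketch below is only an illustration, not the
authors' model: it assumes a nominal 32.768kHz crystal, and it assumes a simple one-pole, one-zero
transfer function H(s) = (gm + s·CF)/(gds + s·(CL + CF)) for the gm-C cell. The gm, gds and CL
values are hypothetical and were picked only to land near the reported phase; CF = 1.26fF is the
extracted value from the text.

```python
import math

F_OSC_HZ = 32768.0  # assumption: nominal 32.768kHz crystal ("32kHz" in the text)


def timing_margin_s(phase_shift_deg, f_osc_hz=F_OSC_HZ):
    """Time between an edge at `phase_shift_deg` and the ideal 90-degree (T/4) point."""
    return (90.0 - phase_shift_deg) / 360.0 / f_osc_hz


def gm_c_phase_deg(f_hz, gm, gds, c_l, c_f):
    """Phase shift (degrees) of the assumed model H(s) = (gm + s*CF) / (gds + s*(CL + CF)).

    Well above the pole frequency the pole contributes almost 90 degrees;
    the left-half-plane zero at gm/CF takes a few degrees back.
    """
    w = 2.0 * math.pi * f_hz
    pole_lag = math.degrees(math.atan2(w * (c_l + c_f), gds))
    zero_lead = math.degrees(math.atan2(w * c_f, gm))
    return pole_lag - zero_lead


margin = timing_margin_s(83.0)
print(f"timing margin for an 83 degree shift: {margin * 1e6:.2f} us")

# Hypothetical device values (not from the paper), intrinsic gain gm/gds = 100.
phase = gm_c_phase_deg(F_OSC_HZ, gm=2.1e-9, gds=2.1e-11, c_l=5e-15, c_f=1.26e-15)
print(f"modelled phase shift: {phase:.1f} degrees")
```

Under these assumptions the 7° shortfall from 90° corresponds to roughly 0.59μs at 32.768kHz,
which is consistent with the "around 0.6μs" quoted above.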



ISSCC 2023 / February 20, 2023 / 4:30 PM

Figure 3.8.1: Challenges in pulse-injection XOs and the top-level schematic of this work.

Figure 3.8.2: Schematic of the Gm-C-based clock-slicer (with simulated frequency
response) (top), the PTAT current source, and the VCMSET generator (bottom).

Figure 3.8.3: Schematic and simulation results of the amplitude regulator, which
includes two gm-C-cell-based current integrators.

Figure 3.8.4: Measured power and frequency deviation across different temperatures
(with VDD=0.46V) (left) and supply voltage VDD (with temperature = 25°C) (right).

Figure 3.8.5: Measured power and frequency deviation across different VAMPSET
(top), and measured Allan deviation (bottom).

Figure 3.8.6: Comparison with the state-of-the-art low power XOs.



ISSCC 2023 PAPER CONTINUATIONS

Figure 3.8.7: Die micrograph, annotated.



ISSCC 2023 / SESSION 3 / AMPLIFIERS AND OSCILLATORS / 3.9
3.9 A 0.5-to-400MHz Programmable BAW Oscillator with Fractional Output Divider Achieving
4ppm Frequency Stability over Temperature and <95fs Jitter

Subhashish Mukherjee1, Yogesh Darwhekar1, Jayawardan Janardhanan1,
Peeyoosh Mirajkar1, Raghavendra Reddy1, Harish Ramesh1, Bichoy Bahr2,
Jagdish Chand1, Uday Meda1, Baher Haroun2, Shankar Karantha1, Ernest Yen3,
Keegan Martin2, Daniel Gan4, Amin Sijelmassi2, Sankaran Aniruddhan5

1Texas Instruments, Bangalore, India, 2Texas Instruments, Dallas, TX,
3Texas Instruments, Santa Clara, CA, 4Texas Instruments, Melaka, Malaysia,
5Indian Institute of Technology Madras, Chennai, India

Bulk-Acoustic-Wave (BAW)-technology-based oscillators have recently been introduced
that combine a BAW resonator die with a CMOS circuit die containing oscillator and
temperature compensation circuits [1]. In this work, we leverage the BAW technology
to create a user programmable oscillator with attractive characteristics such as
long-term reliability, miniaturization and low cost. Unlike low-frequency crystal/MEMS-based
programmable resonators that require a fractional PLL to generate higher output
frequencies [4-6], BAW resonators, due to a high resonant frequency (~2.5GHz), allow
programmability by making use of a Fractional Output Divider (FOD), as developed in
this work.

The oscillator, housed in a 2.5mm×2.0mm QFN package, is developed by stacking a BAW
die on top of a CMOS circuit die (Fig. 3.9.1). The BAW resonator utilizes dual-Bragg
acoustic mirrors under and above the resonant body, thus categorized as a Dual-Bragg
Acoustic Resonator (DBAR). This construction effectively reduces the frequency
sensitivity to contamination and humidity, allowing a cost-effective non-hermetic
package. The BAW die also contains a Molybdenum (Moly) Resistor used as a
temperature sensor element. The DBAR and Moly resistors are in turn encapsulated by
a low-cost Cap made out of Silicon to protect it from package stress. The circuit die is
divided into 3 functional sections (Fig. 3.9.1): (a) a signal path consisting of a BAW
Oscillator followed by an FOD driving an LVDS/LVPECL/HCSL/LVCMOS multi-mode
driver (b) a Frequency Control section consisting of Temperature Sensor, ADC, and
Digital control, and (c) support circuitry including power management, serial interface
and Non-Volatile Memory.

The BAW oscillator core (Fig. 3.9.2) consists of a CMOS cross-coupled pair (M1-M4)
where transistors M1, M2 are actively degenerated by M5 and M6. Capacitance Cs is
added resulting in loop gain with highpass characteristics, ensuring feedback stability
and avoiding the risk of relaxation oscillation at lower frequencies. In contrast to [1], Cs
is implemented as a single-ended capacitance to ground. This allows controlling the
phase of the second harmonic for Impulse-Sensitivity-Function (ISF) shaping and
reduced flicker-noise upconversion. Transistors M3 and M4 are also capacitively coupled
through C1 and C2 for the same purpose. A programmable tuning capacitance Ctune is
connected across the BAW oscillator for overall spur optimization of the final output after
frequency division by the FOD. A peak detector is included to ensure reliable oscillation
amplitude.

The Fractional Output Divider (FOD) dominates overall jitter of the oscillator. The target
of this block is to take a 2.5GHz BAW oscillator input with 400ps period (range) and
synthesize a clock edge with 100fs resolution (12-bit resolution). Aggressive digital
calibration is employed to achieve the overall accuracy of the FOD. The FOD consists of
2 modules: a Fractional Divider and a Digital-to-Time converter (DTC). The Fractional
Divider consists of a programmable Multi-Modulus Divider (MMD) whose division ratio
is controlled by a 1st-order MASH providing 30-bit (~1ppb) fractional resolution. To cancel
the deterministic jitter from MMD due to its coarse quantization step (400psec), a
Digital-to-Time Converter (DTC) is used to synthesize the fine output edges by phase
interpolation.

To achieve a wide DTC range of 400ps, employing a variable-slope Phase Interpolator
can lead to large error due to slope-dependent delay of the comparator and sensitivity to
temperature/voltage variations, which require continuous background gain/INL
calibration and spur cancellation [2]. To resolve this issue, a constant slope charging
can be employed as reported in [3], where an initial voltage is set up using a Voltage
DAC followed by constant-slope charging to trip the comparator. However, Voltage DAC
settling and absolute accuracy requirements can limit the speed and jitter of the DTC. In
this work, a new Current-DAC-based Dual-Slope DTC (DSDTC) is implemented (Fig.
3.9.3), which makes the entire architecture ratiometric, circumventing the voltage
accuracy and settling issues. The architecture is robust across variations of temperature,
comparator delay, comparator threshold and DAC bias current, thereby eliminating the
need for power/area-hungry continuous background calibration. The DSDTC works in 2
phases. During Phase 1, Cap C1 is charged with current (1-α)∙Io where α is controlled
by a DAC. Phase 1 lasts for 1 BAW clock. This sets up a pedestal-voltage in C1
proportional to the DAC current. Phase 2: Cap C1 is charged with full-scale current Io.
The charging continues until the comparator trips, generating the code-dependent
fractional delay edge. The comparator always takes a decision at the same VREF and same
Signal Slope, thus eliminating code-dependent delay in the comparator. By creating the
pedestal-voltage and time-delay using the same current, gain (full time scale)-dependence
on VREF, comparator offset, Io and C1 are eliminated. Since the FOD produces
only a narrow output pulse, a divide-by-two circuit is needed to restore 50% duty-cycle
at the output. To achieve output frequencies up to 400MHz, two FOD channels are utilized
for separate rising- and falling-edge generation such that the output frequency is doubled
(Fig. 3.9.1).

To achieve 12-bit resolution for the DSDTC, one-time digital correction of gain and INL
error is employed during test. The entire algorithm is implemented on-chip to reduce
test time and complexity. The gain error is estimated by comparing DSDTC output for 0
and full DAC code with oscillator edges using a strong-arm latch to an accuracy of <50fs.
The digital engine changes the MASH denominator thereby adjusting the DSDTC range
to correct any gain error. For INL error measurement, the DSDTC is converted into a
Relaxation Oscillator (RO) such that the RO frequency is a linear function of the DAC
code (Fig. 3.9.3). By measuring the RO frequency versus the DAC code, DSDTC linearity
is estimated. A separate INL correction DAC is then employed to correct INL within 2LSB.

Temperature compensation of the oscillator is achieved by utilizing a Look-Up Table
(LUT), which is filled during test by measuring frequency across select temperatures,
and is then used for continuous frequency correction. Fast-changing ambient/die
temperature can pose 2 challenges for frequency control: large frequency transients and
output phase jumps. The BAW die and circuit die need to be in thermal equilibrium to
get the correct temperature measurement, which may not be the case, especially during
power-up transients. To take care of this, 2 separate temperature sensors have been
used: a Molybdenum sensor in the BAW die and a Polysilicon sensor in the circuit die.
The readouts of these two sensors are blended to provide the overall frequency-sensing
code (Fig. 3.9.4). The weights of these sensor inputs are selected based on
simulation/characterization, achieving a tight frequency control. Abrupt frequency correction during
large temperature transients can give rise to the issue of unwanted output phase-jumps.
To avoid this, the fast-acting sensor ADC readout is first passed through an IIR filter
(Fig. 3.9.3) before feeding to the FOD for frequency tracking and correction. The
frequency correction is then brought down to ~1ppb per step by making use of the MASH
resolution, completely smoothing out the output phase change.

The circuit die has been implemented in a 65nm CMOS process supporting up to 400MHz
output working from a 1.65V-to-3.6V supply. Current consumption of the DSDTC is
measured to be about 14mA and 20mA for output frequencies <200MHz and >200MHz
respectively. Figure 3.9.4 shows measured output frequency stability of +/-4ppm for a
1°C/minute temperature ramp from -40°C to +85°C. Figure 3.9.5 shows measured rms
phase jitter for 156.25MHz and 400MHz output achieving a jitter of 86fs and 92.5fs
respectively (over 12kHz to 20MHz), out of which FOD contributes 63fs. The effectiveness
of INL correction is also shown in Fig. 3.9.5. Performance comparison with prior work
is shown in Fig. 3.9.6.

In conclusion, a programmable oscillator has been presented that achieves a tight
frequency control of +/-4ppm over -40°C to 85°C as well as smooth frequency
transitions even with fast thermal transients, using a dual temperature-sensor
architecture. A dual-slope FOD has been developed that is robust across temperature
and voltage variations and achieves <95fs integrated jitter with a 400ps input clock. A
generic INL calibration scheme which includes dynamic behavior of the circuits is
presented.

Acknowledgement:
The authors sincerely like to thank Apoorva Bhatia, Arpan Thakkar, Rakhav R, Ashish J,
Prajwal S, Arshad K, Guru S, Swaminathan Sankaran, Amin Eshraghi, C P Ong, Y F Chek,
Ricky Jackson, Prathap G, Xiaofan Qiu and Michael Perrott for their valuable support in
developing this device.

References:
[1] D. Griffith et al., “An Integrated BAW Oscillator with <±30ppm Frequency Stability
Over Temperature, Package Stress, and Aging Suitable for High-Volume Production,”
ISSCC, pp. 58-59, Feb. 2020.
[2] Chun-Yu Lin et al., “A 0.008mm2 1.5mW 0.625-to-200MHz Fractional Output Divider
with 120fsrms Jitter Based on Replica-DTC-Free Background Calibration,” ISSCC, pp.
412-413, Feb. 2021.
[3] J. Z. Ru et al., “A High-Linearity Digital-to-Time Converter Technique: Constant-Slope
Charging,” IEEE JSSC, pp. 1412-1423, June 2015.
[4] Renesas Electronics, “XF Plastic Package Family of Low Phase Noise Quartz-based
PLL Oscillators,” XF Datasheet, initial release April 3, 2019,
https://www.renesas.com/us/en/document/dst/xf-family-datasheet
[5] Skyworks Solutions, “Ultra Series™ Crystal Oscillator. Ultra Low Jitter Any-Frequency
XO (80 fs), 0.2 to 1500 MHz,” Si545 Data Sheet, Rev 1.0 July 2018,
https://www.skyworksinc.com/-/media/SkyWorks/SL/documents/public/data-sheets/si545-datasheet.pdf
[6] SiTime, “1 MHz to 220 MHz Ultra-low Jitter Differential Oscillator,” SiT9366
Datasheet, Rev 1.0 Sept. 6, 2017,
https://www.sitime.com/products/lvpecl-lvds-hcsl-oscillators/sit9366
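
The ratiometric claim for the two-phase DSDTC can be illustrated with an idealized charge-balance
calculation. This is a sketch under stated assumptions, not the authors' circuit model: C1 is
treated as an ideal capacitor, the comparator as an ideal threshold at VREF plus an offset, and
all component values below are hypothetical. With a pedestal of (1-α)·Io·T/C1 after Phase 1, the
Phase 2 time is VREF·C1/Io - (1-α)·T, so the code-dependent part reduces to α·T and depends only
on the clock period. The same block also checks the quoted resolutions: 400ps over 12 bits, and a
30-bit fractional word.

```python
import math

T_BAW_S = 400e-12  # BAW clock period quoted in the text
DTC_BITS = 12      # DSDTC resolution quoted in the text
MASH_BITS = 30     # MASH fractional resolution quoted in the text


def phase2_time_s(alpha, i_o, c1, v_ref, v_os=0.0, t_clk=T_BAW_S):
    """Phase-2 charging time of an idealized dual-slope DTC.

    Phase 1: C1 integrates (1 - alpha) * Io for one clock, leaving a pedestal.
    Phase 2: C1 integrates Io until it crosses the comparator threshold.
    """
    v_pedestal = (1.0 - alpha) * i_o * t_clk / c1
    v_trip = v_ref + v_os  # comparator offset modelled as a threshold shift
    if v_pedestal > v_trip:
        raise ValueError("pedestal already above the comparator threshold")
    return (v_trip - v_pedestal) * c1 / i_o


def code_delay_s(alpha, **circuit):
    """Delay relative to code 0; algebraically this equals alpha * t_clk."""
    return phase2_time_s(alpha, **circuit) - phase2_time_s(0.0, **circuit)


# Hypothetical component values: a nominal corner and a drifted corner.
nominal = dict(i_o=100e-6, c1=100e-15, v_ref=0.60)
shifted = dict(i_o=120e-6, c1=90e-15, v_ref=0.63, v_os=0.01)

lsb_s = T_BAW_S / 2**DTC_BITS
mash_step = 2.0**-MASH_BITS
for name, circuit in (("nominal", nominal), ("shifted", shifted)):
    print(name, "half-scale delay (s):", code_delay_s(0.5, **circuit))
print("DTC LSB (s):", lsb_s, "MASH step (fraction):", mash_step)
```

In this model both corners give the same half-scale delay of half a clock period, while the
fixed part of the delay (the code-0 trip time) does change between corners. The LSB works out to
about 98fs and the MASH step to about 0.93ppb, in line with the rounded 100fs and ~1ppb figures.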



ISSCC 2023 / February 20, 2023 / 4:45 PM

Figure 3.9.1: Functional block diagram and Oscillator stacked dies.

Figure 3.9.2: BAW oscillator core (left), Peak detector (right).

Figure 3.9.3: Fractional Output Divider (FOD) architecture with Relaxation Oscillator
loop for INL correction.

Figure 3.9.4: Frequency control scheme and measured frequency change for 1°C/min
temperature ramp.

Figure 3.9.5: Measured PN/Jitter for 156.25MHz (top left) and 400MHz outputs (top
right), and measured INL pre- and post-correction (bottom).

Figure 3.9.6: Comparison table.
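
The division scheme behind the FOD of Fig. 3.9.3 (an MMD steered by a 1st-order MASH, with a DTC
removing the resulting deterministic jitter) can be sketched behaviorally. The code below is a
generic textbook-style model, not the implemented hardware: a first-order MASH is modelled as a
plain accumulator, the DTC as an ideal delay, and the divide ratio 7 + 3/8 is a hypothetical
example. The names fod_edges and dtc_code are illustrative only.

```python
from fractions import Fraction

T_IN_PS = Fraction(400)  # input clock period from the text (2.5GHz BAW oscillator)
DTC_BITS = 12            # DTC resolution from the text


def fod_edges(n_int, frac, n_edges, t_in=T_IN_PS):
    """Return (mmd_edges, dtc_edges) in ps for a target divide ratio n_int + frac.

    A first-order accumulator adds `frac` (0 <= frac < 1) every output cycle.
    On overflow the divider counts one extra input cycle (divide by n_int + 1).
    The residue left in the accumulator is the fraction of an input period by
    which the divider edge is early, so the DTC delays the edge by that amount.
    """
    acc = Fraction(0)
    t_mmd = Fraction(0)
    mmd_edges, dtc_edges = [], []
    for _ in range(n_edges):
        acc += frac
        carry = acc >= 1
        if carry:
            acc -= 1
        t_mmd += (n_int + int(carry)) * t_in
        mmd_edges.append(t_mmd)
        dtc_edges.append(t_mmd + acc * t_in)
    return mmd_edges, dtc_edges


def dtc_code(residue, bits=DTC_BITS):
    """Quantize an accumulator residue in [0, 1) to a DTC input code."""
    return int(residue * 2**bits)


mmd, dtc = fod_edges(n_int=7, frac=Fraction(3, 8), n_edges=16)  # hypothetical ratio
mmd_periods = sorted({b - a for a, b in zip(mmd, mmd[1:])})
dtc_periods = sorted({b - a for a, b in zip(dtc, dtc[1:])})
print("divider-only periods (ps):", [float(p) for p in mmd_periods])
print("DTC-corrected periods (ps):", [float(p) for p in dtc_periods])
```

Exact rational arithmetic is used so that the uniform spacing of the corrected edges can be
checked without floating-point tolerance. In this model the divider alone toggles between two
periods one input cycle apart, which is the 400ps quantization step mentioned in the text, while
the DTC-corrected edges are evenly spaced at (n_int + frac) input periods.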



ISSCC 2023 PAPER CONTINUATIONS

Figure 3.9.7: Circuit Die micrograph.

