
A Circuit for All Seasons

Behzad Razavi

The Cross-Coupled Pair—Part II

Following a general overview of the cross-coupled pair (XCP) in the last issue, we begin to study specific circuit examples incorporating this topology. We deal with digital applications in this issue.

The performance of digital circuits can be improved if an XCP is tied between complementary (differential) signals. Specifically, we can add a clocked XCP or replace the PMOS devices in complementary logic with an XCP. The circuits described here exemplify the utility of these techniques.

Digital Object Identifier 10.1109/MSSC.2014.2352532
Date of publication: 12 November 2014

Sense Amplifiers

We examine sense amplifiers not necessarily because we wish to design memories but, rather, because the techniques studied here prove useful in many other applications as well. A common situation in digital (or analog) design is that a small initial imbalance, $V_{XY0}$, appearing between two differential nodes must be amplified, as fast as possible, to (preferably) rail-to-rail complementary signals. The circuit can be designed such that the two nodes are driven by a high impedance and travel toward the rails with a “natural” time constant [Figure 1(a)]. Assuming that X and Y are released with an initial imbalance of $V_{XY0}$ (e.g., by means of a switch) and that $V_{in1} - V_{in2}$ is large enough to ensure $g_{m1,2} R_L (V_{in1} - V_{in2}) \approx V_{DD}$, we can ask, how much time does the circuit take to provide a certain gain, $G$ [1]? Defining the time-dependent gain as $G = V_{XY}(t_1)/V_{XY0}$, we have

$$\frac{t_1}{\tau_1} = -\ln\left(1 - \frac{G}{A_1}\right), \tag{1}$$

where $\tau_1 = R_L C_L$ and $A_1 = V_{XY}(\infty)/V_{XY0} \approx g_{m1,2} R_L$ (the “dc” gain). On the

[Figure 1 graphic: circuit schematics and output waveforms for panels (a) and (b); panel (c) plots t1/τ1 versus G, with one curve for the stage in (a) and one for the stage in (b).]

Figure 1: (a) A simple circuit starting with an initial difference between X and Y, (b) regenerative amplification provided by the XCP, and
(c) required normalized time for obtaining a gain of G.

Authorized licensed use limited to: Synopsys. Downloaded on December 18, 2023 at 06:57:48 UTC from IEEE Xplore. Restrictions apply.
IEEE SOLID-STATE CIRCUITS MAGAZINE, FALL 2014
other hand, we can attach a clocked XCP to the differential nodes [Figure 1(b)] and disable $M_1$ and $M_2$ when the XCP turns on. Now, $V_{XY}$ is amplified regeneratively, evolving as $V_{XY} = V_{XY0} \exp(t/\tau_{reg})$, where $\tau_{reg} = R_L C_L/(g_{m3,4} R_L - 1)$. The time necessary to obtain a certain gain, $G$, is given by [1]

$$\frac{t_1}{\tau_1} = \frac{1}{A_2 - 1} \ln G, \tag{2}$$

where $A_2 = g_{m3,4} R_L$. Figure 1(c) plots $t_1/\tau_1$ for both cases, assuming $A_1 \approx A_2 \approx 5$. We thus observe the significant advantage of regenerative amplification in sense amplifier design.

Figure 2 shows a DRAM sense amplifier dating back to 1976 [2]. In the precharge (or “equalization”) mode, the NMOS devices $M_3$ and $M_4$, respectively, pull $V_X$ and $V_Y$ to approximately $V_{DD} - V_{TH}$ and $S_1$ shorts X and Y to remove residual offsets from the previous cycle (and the threshold mismatch between $M_3$ and $M_4$). In the evaluation mode, these three devices turn off, X and Y sense the single-ended memory cell level and a reference voltage, and the XCP is clocked to amplify the difference rapidly. (NMOS pull-up devices are chosen here perhaps to match the cell’s common-mode level.) It is important to note that the clocking of sense amplifiers must ensure three distinct time intervals: precharge (equalization), bit line sense, and amplification.

As observed in the waveforms of Figure 2, the output voltages of the above sense amplifier cannot return to $V_{DD}$ in the amplification mode, suffering from a degradation in their high levels. Figure 3 depicts another topology [3] employing both NMOS and PMOS cross-coupled pairs and allowing rail-to-rail swings. In this case, X and Y are precharged to $V_{DD}$ and then connected to the memory cell. Other variants of these topologies are described in numerous papers, e.g., [5]–[7].

The problem of MOS device mismatches began to manifest itself in sense amplifiers as higher speeds and lower supplies were sought. Figure 4 shows an early example of offset cancellation within a DRAM sense amplifier [8]. In the precharge mode, $CK_2$ is low and the gates of $M_1$ and $M_2$ are pulled to $V_{DD}$, allowing $V_A$ and $V_B$ to assume values equal to $V_{DD} - V_{TH1}$ and $V_{DD} - V_{TH2}$, respectively. The XCP threshold mismatch is thus stored on $C_1$ and $C_2$. In the evaluation mode, first $CK_2$ falls to zero, initiating positive feedback around $M_1$ and $M_2$, and then $CK_3$ goes high, accelerating the amplification [8].

Latches

The XCP’s fast amplification property also proves useful in latch design. If speed is the primary concern, current-mode logic (CML) latches with moderate voltage swings (200–300 mV single ended) can be used. Originally realized using bipolar transistors, CML latches have been inherited by CMOS technology as well. Figure 5 shows an example where $M_1$ and $M_2$ form a preamplifier and $M_3$ and $M_4$ an XCP. (The term “latch” sometimes refers to the entire circuit or just the XCP plus the load resistors.) When CK is high, the input is amplified and impressed at X and Y. When CK goes low, the XCP turns on, regenerating $V_X - V_Y$ to a final value of $I_{SS} R_D$. This condition is met if $g_{m3,4} R_D > 1$. Since the output swing, $I_{SS} R_D$, need not be as large as $V_{DD}$, the circuit operates faster than topologies producing rail-to-rail swings, albeit at the cost of static power.

Several variants of the CML latch have been reported. To reduce the voltage headroom consumption, the tail current source can be removed [as in Figure 1(b)] while $M_5$ and $M_6$ are biased in a current-mirror arrangement [9]. In this case, the total current flowing through the clocked devices is not constant, leading to “class-AB” operation and improving the speed. The CML latch can also incorporate inductive peaking [10, 11] so as to achieve a higher speed [Figure 5(b)]. To ensure minimal overshoot and intersymbol
Figure 2: The sense amplifier reported in [2].

Figure 3: A sense amplifier with rail-to-rail outputs [3].

Figure 4: A sense amplifier with threshold mismatch cancellation [8].

interference, the damping factor of the RLC load, $\zeta = (R_D/2)\sqrt{C_L/L_1}$ ($C_L$ is the capacitance at X or Y), should be greater than approximately 0.7. The static power and the large area consumed by this topology are justified only if ultimate speed (tens of gigahertz in 28-nm technology) is necessary.

Figure 5: (a) A CML latch and (b) its modified version with inductive peaking and no tail current source.

For operation at lower speeds, the latch shown in Figure 6(a) is an attractive choice. Originating from the “cascode voltage switch logic” (CVSL) [12] and its successor [13], the latch senses and generates rail-to-rail swings, consuming no static power. If, for example, X is high and CK rises while D is also high, then the circuit reduces to that shown in Figure 6(b), where the series combination of $M_1$ and $M_5$ must “overcome” $M_3$. As $V_X$ falls, $M_4$ turns on, enabling regeneration around the PMOS loop.

Figure 6: (a) A rail-to-rail latch, and (b) a simplified circuit during state change.

The foregoing latch merits two remarks. First, it operates with ratioed logic, demanding proper sizing of the transistors. In practice, if $W_{1,2} = W_5 = W_{3,4}$ (and the lengths are equal), the NMOS devices can robustly change the state. Second, the complementary input and output swings produce less substrate noise than do single-ended logic families, a useful property in mixed-signal design.

The latch described above can also include logic if the signals are available in complementary form [12]. Shown in Figure 7 is an example of a differential CVSL NOR gate embedded in the latch, exhibiting a smaller input capacitance than that of complementary logic. In this case, the series combination of $M_3$–$M_5$ must overcome $M_7$.

Figure 7: A NOR gate embedded in latch.

Questions for the Reader

1) Must we turn off $M_1$ and $M_2$ in Figure 1(b) as we activate the XCP even if the circuit is to operate as an amplifier rather than as a latch?
2) A divide-by-two circuit incorporates two instances of the latch shown in Figure 5(b). Can $R_D$ be reduced to zero in this case?

You can share your thoughts by e-mailing me.

Answers to Last Issue’s Questions

1) Is negative capacitance the same as positive inductance? No, it is not. The former’s impedance is given by $j/(C\omega)$ and its magnitude falls with frequency whereas the latter’s impedance, $jL\omega$, has a magnitude that rises with frequency.
2) Can the cancellation of positive capacitance by negative capacitance be a resonance effect? No, it cannot. In a resonator, the phase difference between the voltage and the current changes sign at the resonance frequency. The series or parallel combination of a positive capacitance and a negative capacitance does not display such a property.
3) Why is the circuit in Figure 8 a dynamic latch? If used as an RS latch, the circuit allows its inputs to be low simultaneously. Since $M_1$ and $M_2$ remain off, the leakage currents at the drain nodes can corrupt the state stored by $M_3$ and $M_4$.
4) In Figure 9, $M_1$ and $M_2$ are biased and balanced by $I_1$ and $I_2$ ($I_1 = I_2$). At $t = 0$, $I_{in}$ jumps from zero to a small positive value, $I_0$. We intuitively expect that $V_X$ rises and $V_Y$ falls. However, viewing the XCP as a resistance equal to $-2/g_m$, we obtain $V_{XY} = (-2/g_m) I_0 u(t)$, concluding that $V_X$ should descend and $V_Y$ should ascend! How do we explain the discrepancy between these two results?

Our intuition in fact assumes that each node bears some capacitance to ground. The differential equation governing the circuit is as follows:

$$C_L \frac{dV_{XY}}{dt} - g_m V_{XY} = 2I_0, \tag{3}$$

where $C_L$ is the total capacitance at each node. We are curious to see whether this equation’s solution approaches $V_{XY} = (-2/g_m) I_0$ as $C_L \to 0$. To solve (3), we would ordinarily write $C_L\, dV_{XY}/(2I_0 + g_m V_{XY}) = dt$ and integrate both sides, assuming $V_{XY}(t = 0) = 0$ and hence obtaining

$$V_{XY}(t) = \frac{2I_0}{g_m}\left[\exp\left(\frac{g_m}{C_L}\, t\right) - 1\right]. \tag{4}$$

Unfortunately, this result does not lead to $V_{XY}(t) = -2I_0/g_m$ if $C_L \to 0$. This is because our solution has tacitly assumed that a) $2I_0 + g_m V_{XY} \neq 0$, b) $C_L \neq 0$, and c) $V_{XY}(t = 0) = 0$, all of which are violated when $C_L = 0$. To solve the differential equation without these presumptions, we assume $V_{XY}(t)$ can be expressed as $a \exp(bt) + c$ and substitute for it in (3). It follows that

$$C_L a b \exp(bt) - g_m a \exp(bt) - g_m c = 2I_0. \tag{5}$$

Taking the derivative of both sides yields $(C_L a b^2 - g_m a b) \exp(bt) = 0$, i.e., $ab(C_L b - g_m) = 0$ if $b < \infty$. This result points to different possibilities:
1) if $C_L b - g_m = 0$, then we can also assume $V_{XY}(t = 0) = 0$ and arrive at (4);
2) if $a = 0$, then (5) implies that $c = -2I_0/g_m$ and $V_{XY}(t) = -2I_0/g_m$;
3) if $b = 0$, then (5) suggests that $a + c = -2I_0/g_m$ and, since $V_{XY}(0) = a + c$, we have $V_{XY}(t) = a + c = -2I_0/g_m$.

Figure 8: A differential buffer using an XCP.

Figure 9: XCP operation from two perspectives.

References

[1] J. T. Wu, “High-speed analog-to-digital conversion in CMOS VLSI,” Ph.D. dissertation, Stanford Univ., 1988.
[2] C. N. Ahlquist et al., “A 16K dynamic RAM,” in ISSCC Dig. Tech. Papers, Feb. 1976, pp. 128–129.
[3] S. Konishi et al., “A 64Kb CMOS RAM,” in ISSCC Dig. Tech. Papers, Feb. 1982, pp. 258–259.
[4] J. M. Schlageter et al., “A 4K static 5-V RAM,” in ISSCC Dig. Tech. Papers, Feb. 1976, pp. 136–137.
[5] T. Watanabe et al., “A battery backup 64K CMOS RAM with double level aluminum technology,” in ISSCC Dig. Tech. Papers, Feb. 1983, pp. 60–61.
[6] A. Mohsen et al., “An 80ns 64K DRAM,” in ISSCC Dig. Tech. Papers, Feb. 1983, pp. 102–103.
[7] K. Sasaki et al., “A 9ns 1Mb CMOS SRAM,” in ISSCC Dig. Tech. Papers, Feb. 1989, pp. 34–35.
[8] T. Mano et al., “Submicron VLSI memory circuits,” in ISSCC Dig. Tech. Papers, Feb. 1983, pp. 234–235.
[9] J. Lee and B. Razavi, “A 40-Gb/s clock and data recovery circuit in 0.18-μm CMOS technology,” IEEE J. Solid-State Circuits, vol. 38, pp. 2181–2190, Dec. 2003.
[10] M. Wurzer et al., “42 GHz static frequency divider in a Si/SiGe bipolar technology,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 122–123.
[11] P. Heydari and R. Mohanavelu, “Design of ultra high-speed CMOS CML buffers and latches,” in Proc. ISCAS, May 2003, pp. 208–211.
[12] L. Heller et al., “Cascode voltage switch logic: A differential CMOS logic family,” in ISSCC Dig. Tech. Papers, Feb. 1984, pp. 16–17.
[13] L. C. Pfennings et al., “Differential split-level CMOS logic for sub-nanosecond speeds,” IEEE J. Solid-State Circuits, vol. 20, pp. 1050–1055, Oct. 1985.

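
The damping guideline for the inductively peaked CML latch of Figure 5(b) can be turned into a quick sizing check. The sketch below assumes the damping factor has the standard series-RLC form ζ = (R_D/2)·sqrt(C_L/L_1), which is how the extracted expression is read here; the component values are hypothetical and only illustrate the arithmetic.

```python
import math


def damping_factor(r_d, c_l, l_1):
    """Damping factor zeta = (R_D / 2) * sqrt(C_L / L_1) (assumed form)."""
    return (r_d / 2.0) * math.sqrt(c_l / l_1)


def max_inductance(r_d, c_l, zeta_min=0.7):
    """Largest L_1 that keeps zeta >= zeta_min.

    Solving (R_D / 2) * sqrt(C_L / L_1) >= zeta_min for L_1 gives
    L_1 <= (R_D / (2 * zeta_min))**2 * C_L.
    """
    return (r_d / (2.0 * zeta_min)) ** 2 * c_l


if __name__ == "__main__":
    r_d = 200.0    # load resistance in ohms (hypothetical)
    c_l = 20e-15   # capacitance at X or Y in farads (hypothetical)
    l_max = max_inductance(r_d, c_l)
    print("L1 upper bound (H):", l_max)
    print("zeta at that bound:", damping_factor(r_d, c_l, l_max))
```

A larger inductor than this bound lowers ζ below 0.7 and, per the guideline in the text, risks overshoot and intersymbol interference.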

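
The observation that (4) satisfies (3) yet does not approach −2I0/gm as C_L shrinks can be checked numerically. The sketch below evaluates (4), estimates dV_XY/dt with a central difference, and computes the residual of (3). The bias values, time point, and step size are hypothetical, chosen only so that the exponent stays moderate.

```python
import math


def v_xy(t, i0, gm, cl):
    """Equation (4): V_XY(t) = (2*I0/gm) * (exp(gm*t/CL) - 1)."""
    return (2.0 * i0 / gm) * (math.exp(gm * t / cl) - 1.0)


def ode_residual(t, i0, gm, cl, h):
    """Residual of (3), CL*dV/dt - gm*V - 2*I0, using a central difference."""
    dv_dt = (v_xy(t + h, i0, gm, cl) - v_xy(t - h, i0, gm, cl)) / (2.0 * h)
    return cl * dv_dt - gm * v_xy(t, i0, gm, cl) - 2.0 * i0


if __name__ == "__main__":
    i0, gm = 1e-6, 1e-3  # 1 uA step and 1 mS transconductance (hypothetical)
    t = 5e-12            # observation time of 5 ps (hypothetical)
    print("static prediction -2*I0/gm:", -2.0 * i0 / gm)
    for cl in (10e-15, 5e-15, 1e-15):  # shrinking node capacitance
        print(cl, v_xy(t, i0, gm, cl), ode_residual(t, i0, gm, cl, 1e-15))
```

At a fixed time, the value from (4) is positive and grows as C_L decreases, so it moves away from the static prediction −2I0/gm rather than toward it, consistent with the discussion of the tacit assumptions behind (4).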