Reading Assignment:
Weste: Chapter 8
Rabaey: Chapter 11
Note: some of the figures in this slide set are adapted from the slide set
of Digital Integrated Circuits by Rabaey et al. Copyright UCB 2002
1 EESM5020/19 Lecture 3
A Generic Digital Processor
[Figure: generic digital processor built from MEMORY, INPUT-OUTPUT, CONTROL and DATAPATH blocks]
Full-Adder
[Figure: full-adder block with inputs A, B and C and outputs Sum and Carry]
$\mathrm{Sum} = A\bar{B}\bar{C} + \bar{A}B\bar{C} + \bar{A}\bar{B}C + ABC = A \oplus B \oplus C$
Carry = MAJ(A,B,C)
=AB + AC + BC = AB + C(A+B)
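The two equations above can be checked exhaustively. The following is a minimal Python sketch, not part of the original slides; the function names `full_adder` and `majority` are chosen here for illustration only.

```python
from itertools import product

def full_adder(a, b, c):
    """One-bit full adder: return (sum, carry) for bits a, b and carry-in c."""
    s = a ^ b ^ c                      # Sum = A xor B xor C
    carry = (a & b) | (c & (a | b))    # Carry = AB + C(A+B)
    return s, carry

def majority(a, b, c):
    """MAJ(A,B,C) = AB + AC + BC."""
    return (a & b) | (a & c) | (b & c)

# Exhaustive check over the eight input combinations.
for a, b, c in product((0, 1), repeat=3):
    s, carry = full_adder(a, b, c)
    assert 2 * carry + s == a + b + c      # the two outputs encode A + B + C
    assert carry == majority(a, b, c)      # both carry forms agree
```

The check confirms that AB + C(A+B) and the majority form give the same carry for every input.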
[Figure: gate-level full-adder schematic producing Sum and Carry from A, B and C]
The Ripple-Carry Adder
[Figure: 4-bit ripple-carry adder; four full adders (FA) in a chain with inputs A0..A3, B0..B3 and Ci,0, sums S0..S3, and carries Co,0 (= Ci,1), Co,1, Co,2, Co,3]
Adder
• The fundamental problem is calculating the carry bit rapidly.
• Each carry bit depends on all previous inputs
  – LSB has fanout of N
• Simplest adder: ripple carry (linear adder)
• Faster adders: carry look-ahead, carry bypass, ...
  – All of these work out the carry several bits at a time
• Even faster adder: logarithmic (tree) adder
  – Use prefix computation
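To make the linear dependence concrete, here is a small behavioural sketch of the ripple-carry adder in Python. It is an illustration only: the helper names (`ripple_carry_add`, `to_bits`, `from_bits`) and the LSB-first bit-list convention are assumptions of this sketch, not something the slides define.

```python
def full_adder(a, b, c):
    """Return (sum, carry) of three one-bit inputs."""
    return a ^ b ^ c, (a & b) | (c & (a | b))

def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists (LSB first); return (sum_bits, carry_out)."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        # Each stage has to wait for the carry of the stage before it.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry

def to_bits(x, n):
    """Integer to n-bit list, LSB first."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

s_bits, c_out = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
assert from_bits(s_bits) + (c_out << 4) == 17
```

The loop body runs once per bit, and each iteration needs the carry of the previous one, which is why the delay grows linearly with N.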
[Figure: n-bit ripple-carry adder; full adders FA for bits 0 to n-1 with inputs A0..An-1 and B0..Bn-1, carry-in C0, carry-out Cn and sums S0..Sn-1]
Express Sum and Carry as a function of P, G, D
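The expressions for this slide did not survive extraction. As a hedged sketch, the Python below uses the usual textbook definitions G = AB, P = A xor B and D = (not A)(not B) (D for "delete" is an assumption here, since the slides never spell it out) and checks that Sum = P xor Ci and Carry = G + P·Ci reproduce the full adder.

```python
from itertools import product

def pgd(a, b):
    """Return (P, G, D) for one bit position."""
    p = a ^ b                # propagate: carry-out equals carry-in
    g = a & b                # generate: carry-out is 1 whatever the carry-in
    d = (1 - a) & (1 - b)    # delete: carry-out is 0 whatever the carry-in
    return p, g, d

def sum_and_carry(p, g, c_in):
    """Sum = P xor Ci, Carry = G + P*Ci."""
    return p ^ c_in, g | (p & c_in)

for a, b, c_in in product((0, 1), repeat=3):
    p, g, d = pgd(a, b)
    s, c_out = sum_and_carry(p, g, c_in)
    assert p + g + d == 1                 # exactly one signal is active
    assert 2 * c_out + s == a + b + c_in  # matches the full adder
    if d:
        assert c_out == 0                 # delete kills the carry
```

Note that Sum and Carry need only P, G and the incoming carry; A and B are no longer required once P and G are formed.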
Manchester Carry Chain Adder
The idea: first generate the carry-in for each adder bit as fast
as possible, then evaluate the sum. Delay is still
proportional to the number of bits, but the constant of proportionality is small.
Two possible implementations: Static & Dynamic
Manchester Carry Chain Adder
• The propagate logic in the Manchester carry chain puts a
lot of NFETs in series.
• If Cin is high and all P signals are true, the carry path runs through every series NFET, so it is long.
• The maximum length of the carry chain is therefore limited to around 4.
Manchester Carry Chain
[Figure: Manchester carry chain with supply VDD and clock φ; carry-in Ci,0, propagate signals P0..P4 and generate signals G0..G4]
Sizing Manchester Carry Chain
[Figure: RC network of the carry chain with discharge transistor; transistors MC, M0..M4, series resistances R1..R6, node capacitances C1..C6 at nodes 1..6, output Out]
Using the Penfield-Rubinstein model (Elmore delay), the delay time $t_p$ is
$$t_p = 0.69\,\bigl(C_1 R_1 + C_2(R_1 + R_2) + C_3(R_1 + R_2 + R_3) + C_4(R_1 + R_2 + R_3 + R_4) + C_5(R_1 + R_2 + R_3 + R_4 + R_5) + C_6(R_1 + R_2 + R_3 + R_4 + R_5 + R_6)\bigr)$$
$$t_p = 0.69 \sum_{i=1}^{N} C_i \Bigl(\sum_{j=1}^{i} R_j\Bigr)$$
Assume 1.2 µm technology and minimum-size transistors, with C assumed to be 20 fF and R = 20 kΩ. Then
$$t_p = 0.69 \times 21\,RC$$
If the stages are scaled by a factor k, so that
$$C_i = k\,C_{i+1}, \qquad R_i = \frac{R_{i+1}}{k}$$
we have
$$t_p = 0.69\,RC\,\frac{1 + 2k + 3k^2 + 4k^3 + 5k^4 + 6k^5}{k^5}$$
[Figure: two plots against k (1 to 3.0): Speed (normalized by 0.69RC) and Area (in minimum size devices)]
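The delay expressions above are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes that stage 6 is the minimum-size stage (R6 = R, C6 = C), which is the reading under which the scaling relations reproduce the closed form.

```python
import math

def elmore_delay(r, c):
    """0.69 * sum_i C_i * (R_1 + ... + R_i) for an RC ladder."""
    total, r_sum = 0.0, 0.0
    for r_i, c_i in zip(r, c):
        r_sum += r_i            # running sum R_1 + ... + R_i
        total += c_i * r_sum
    return 0.69 * total

def scaled_chain(r_min, c_min, k, n=6):
    """Assumption: stage n is minimum size, R_i = R_{i+1}/k, C_i = k*C_{i+1}."""
    r = [r_min / k ** (n - i) for i in range(1, n + 1)]
    c = [c_min * k ** (n - i) for i in range(1, n + 1)]
    return r, c

def closed_form(r_min, c_min, k):
    """0.69*RC*(1 + 2k + 3k^2 + 4k^3 + 5k^4 + 6k^5)/k^5."""
    poly = sum((m + 1) * k ** m for m in range(6))
    return 0.69 * r_min * c_min * poly / k ** 5

R, C = 20e3, 20e-15                           # 20 kOhm and 20 fF, as on the slide
tp_uniform = elmore_delay([R] * 6, [C] * 6)   # unscaled chain: 0.69 * 21 * RC
assert math.isclose(tp_uniform, 0.69 * 21 * R * C)
assert math.isclose(elmore_delay(*scaled_chain(R, C, 1.5)), closed_form(R, C, 1.5))
```

With RC = 0.4 ns the unscaled chain comes to roughly 5.8 ns. Sweeping k through `closed_form` gives the speed trend that the plot on this slide shows.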
Improving the speed of MCA
• If all propagate signals are true and Ci is high, six series
n-transistors pull the output node low in the
dynamic gate, while five transistors are in series in the
static gate.
• The worst-case propagation time can be improved by
bypassing the four stages if all carry-propagate signals
are true.
[Figure: four-stage Manchester carry chain (P0..P3, G0..G3) from Ci,0 to Co,3, with a bypass path controlled by BP]
$BP = P_0 P_1 P_2 P_3$
Carry-Bypass Adder (cont.)
$$t_p = t_{setup} + M\,t_{carry} + \left(\frac{N}{M} - 1\right) t_{bypass} + (M - 1)\,t_{carry} + t_{sum}$$
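As a worked example of the bypass delay formula, the Python sketch below evaluates it and searches for the stage width M that minimizes the delay. The unit delays are hypothetical placeholders chosen only to make the arithmetic easy to follow; real values depend on the technology.

```python
def bypass_adder_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """Delay of an n-bit carry-bypass adder built from m-bit stages."""
    if n % m:
        raise ValueError("n must be a multiple of m")
    return (t_setup + m * t_carry + (n // m - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)

def best_stage_width(n, **delays):
    """Stage width (a divisor of n) with the smallest delay."""
    widths = [m for m in range(1, n + 1) if n % m == 0]
    return min(widths, key=lambda m: bypass_adder_delay(n, m, **delays))

# Hypothetical unit delays, for illustration only.
unit = dict(t_setup=1.0, t_carry=1.0, t_bypass=1.0, t_sum=1.0)
print(bypass_adder_delay(16, 4, **unit))   # 12.0
print(best_stage_width(32, **unit))        # 4
```

With all delays equal the expression reduces to 2M + N/M, which shows the trade-off: wide stages lengthen the ripple inside the first and last stage, narrow stages add bypass hops.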
Carry Ripple versus Carry Bypass
[Figure: plot of propagation delay tp for the ripple adder and the bypass adder]
Carry Select Adder: Critical Path
[Figure: 16-bit carry-select adder in four 4-bit stages (bits 0-3, 4-7, 8-11, 12-15), showing the critical path]
Linear Carry Select
[Figure: linear carry-select adder with four equal 4-bit stages: bits 0-3, 4-7, 8-11 and 12-15]
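The carry-select principle can be sketched behaviourally: each stage computes its result for both possible carry-ins, and the real carry from the previous stage only has to drive a multiplexer. The Python below is a minimal illustration with hypothetical function names; it models the logic on integers, not the circuit timing, and assumes n is a multiple of m.

```python
def ripple_add(a, b, c_in, width):
    """Behavioural adder on integers: (sum mod 2**width, carry_out)."""
    total = a + b + c_in
    return total & ((1 << width) - 1), total >> width

def carry_select_add(a, b, n=16, m=4, c_in=0):
    """Linear carry-select adder: n bits split into equal m-bit stages."""
    result, carry = 0, c_in
    mask = (1 << m) - 1
    for lo in range(0, n, m):
        a_blk, b_blk = (a >> lo) & mask, (b >> lo) & mask
        s0, c0 = ripple_add(a_blk, b_blk, 0, m)      # result if carry-in is 0
        s1, c1 = ripple_add(a_blk, b_blk, 1, m)      # result if carry-in is 1
        s, carry = (s1, c1) if carry else (s0, c0)   # multiplexer picks one
        result |= s << lo
    return result, carry

total, c_out = carry_select_add(0x1234, 0xFEDC)
assert total + (c_out << 16) == 0x1234 + 0xFEDC
```

In hardware the two speculative additions of every stage run in parallel, so only the multiplexers sit on the stage-to-stage path.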
Delay of the linear carry-select Adder
32-bit carry-select adder
Square Root Carry Select
[Figure: square-root carry-select adder with stages of increasing width: bits 0-1, 2-4, 5-8, 9-13 and 14-19]
Delay of the Square-root carry-select Adder
• The idea of the square-root adder is to equalize the delay of the
carry chain and the select signal generated by the
multiplexer of the previous stage. One way to do this is to
put more bits in the later stages.
• This simple trick of making the adder stages progressively
longer results in an adder structure with sub-linear delay
characteristics.
• Assume that an N-bit adder contains P stages and the first
stage adds M bits. An additional bit is added to each
subsequent stage. The following relation then holds:
$$N = M + (M + 1) + (M + 2) + \dots + (M + P - 1) = MP + \frac{P(P - 1)}{2} = \frac{P^2}{2} + P\left(M - \frac{1}{2}\right)$$
• If $M \ll N$ then $N \approx \frac{P^2}{2}$ and $P \approx \sqrt{2N}$
• The total delay is given by
$$t_{add} = t_{setup} + M\,t_{carry} + \sqrt{2N}\,t_{mux} + t_{sum}$$
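The stage-size relation and the delay estimate above can be tried out with a few lines of Python. This is a sketch with hypothetical function names; the example M = 2, P = 5 matches the stage widths 2, 3, 4, 5, 6 (bits 0-1 through 14-19) of the square-root carry-select figure.

```python
import math

def stage_sizes(m, p):
    """Widths of the P stages: M, M+1, ..., M+P-1."""
    return [m + i for i in range(p)]

def total_bits(m, p):
    """N = M*P + P*(P-1)/2."""
    return m * p + p * (p - 1) // 2

def sqrt_select_delay(n, m, t_setup, t_carry, t_mux, t_sum):
    """t_add = t_setup + M*t_carry + sqrt(2N)*t_mux + t_sum."""
    return t_setup + m * t_carry + math.sqrt(2 * n) * t_mux + t_sum

sizes = stage_sizes(2, 5)
assert sizes == [2, 3, 4, 5, 6]
assert sum(sizes) == total_bits(2, 5) == 20
```

For this small case sqrt(2N) is about 6.3 against the true P = 5, a reminder that P ≈ sqrt(2N) is only an approximation valid for M much smaller than N.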
Adder Delays - Comparison
[Figure: propagation delay tp (0 to 50) versus number of bits N (0 to 60) for the ripple adder and the linear select adder]
LookAhead - Basic Idea
[Figure: chain of carry signals Ci,0, Ci,1, ..., Ci,N-1 with propagate signals P0, P1, ..., PN-1]
Carry-Lookahead Adders
• The carry delay can be improved by calculating the carries to each
stage in parallel.
• The carry out can be expressed as
$C_{out} = G + P\,C_{in}$
where $G = AB$ is the Generate and $P = A \oplus B$ is the
Propagate.
• G = 1 ensures that a carry bit is generated at Cout
independent of Cin, while P = 1 guarantees that an incoming
carry propagates to Cout.
• The carry of the ith stage, Ci, is
$C_i = G_i + P_i C_{i-1}$, where $G_i = A_i B_i$ and $P_i = A_i + B_i$ or $A_i \oplus B_i$
• Expanding this yields:
$C_i = G_i + P_i C_{i-1}$ and $S_i = C_{i-1} \oplus A_i \oplus B_i$, or $S_i = C_{i-1} \oplus P_i$ when $P_i = A_i \oplus B_i$
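The unrolled recurrence can be sketched in Python as follows. This is an illustration under stated assumptions: bits are indexed from 0, `c_in` plays the role of the carry into the first stage, and the function names are hypothetical. Every carry is written as a sum of products of the G and P signals and `c_in` only, so no carry depends on another carry.

```python
def cla_carries(a_bits, b_bits, c_in):
    """Return (P list, carry-out list) from the fully unrolled recurrence."""
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    carries = []
    for i in range(len(g)):
        # C_i = G_i + P_i G_{i-1} + ... + P_i..P_1 G_0 + P_i..P_0 c_in
        c, prod = 0, 1
        for j in range(i, -1, -1):
            c |= prod & g[j]     # term P_i ... P_{j+1} G_j
            prod &= p[j]
        c |= prod & c_in         # term P_i ... P_0 c_in
        carries.append(c)
    return p, carries

def cla_add(a_bits, b_bits, c_in=0):
    """Add LSB-first bit lists; return (sum_bits, carry_out)."""
    p, carries = cla_carries(a_bits, b_bits, c_in)
    carry_into = [c_in] + carries[:-1]
    return [p_i ^ c for p_i, c in zip(p, carry_into)], carries[-1]

assert cla_add([1, 1, 0, 1], [0, 1, 1, 0]) == ([1, 0, 0, 0], 1)   # 11 + 6 = 17
```

The inner loop makes the cost visible: the term count and the fan-in of each carry grow with the bit position.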
Carry-Lookahead Adders
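$C_2 = G_2 + P_2 G_1 + P_2 P_1 C_0$
$C_3 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 C_0$
$C_4 = G_4 + P_4 G_3 + P_4 P_3 G_2 + P_4 P_3 P_2 G_1 + P_4 P_3 P_2 P_1 C_0$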
Look-Ahead: Topology
Hierarchical Carry Look-ahead
• Problem of CLA:
  – The carry recurrence can be unrolled further; unrolling it
  to level k results in a two-level AND-OR
  structure
    • AND fan-in = k+1, OR fan-in = k+1
    • k+1 transistors in the MOS stack
  – Therefore the size of the carry lookahead is usually limited
    • Example: 4 bit
    • Still too many stages
• Solution:
  – Hierarchy, block carry look-ahead
  – Group carry look-ahead
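A behavioural Python sketch of the hierarchical idea: each 4-bit block reduces its G and P signals to one block generate and one block propagate, and a second level works on those pairs only. The names and the 16-bit, 4-bit-block split are assumptions of this sketch. The second-level carries are computed in a loop here for brevity; hardware would unroll them like a 4-bit lookahead.

```python
def block_pg(g, p):
    """Block generate and propagate of one group (LSB-first bit lists)."""
    block_g, block_p = 0, 1
    for g_i, p_i in zip(g, p):
        block_g = g_i | (p_i & block_g)   # G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0
        block_p &= p_i                    # P3 P2 P1 P0
    return block_g, block_p

def block_cla_add(a, b, n=16, k=4, c_in=0):
    """n-bit add with k-bit lookahead blocks; n is assumed a multiple of k."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    # Second level: the carry into each block uses block (G, P) pairs only.
    block_carry = [c_in]
    for lo in range(0, n, k):
        bg, bp = block_pg(g[lo:lo + k], p[lo:lo + k])
        block_carry.append(bg | (bp & block_carry[-1]))
    # First level: carries inside a block start from the block carry-in.
    total = 0
    for blk, lo in enumerate(range(0, n, k)):
        c = block_carry[blk]
        for i in range(lo, lo + k):
            total |= (p[i] ^ c) << i
            c = g[i] | (p[i] & c)
    return total, block_carry[-1]

assert block_cla_add(0x0FFF, 0x0001) == (0x1000, 0)
```

In the example the carry generated in the lowest block is propagated across two blocks, so only block-level signals decide where it lands.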
Block Carry Lookahead
[Figure: 16-bit block carry-lookahead adder; four 4-bit blocks (bits 12-15, 8-11, 4-7, 0-3) with inputs A, B and sums S, and a group unit labelled GCLA with signals GGG1 and GGP1]
Logarithmic Look-Ahead Adder(1)
– Tree adder
• Also called Parallel-Prefix Adders (PPA)
• Recursive look-ahead: look-ahead across look-ahead
• An adder with O(log2 N) delay, based on a basic
property of associative operators
• The dot operation (*) is associative:
(a*b)*c = a*(b*c). Under this condition, combining N
arguments using the * operator can be executed
with a critical path of (log2 N)t instead of (N-1)t, where t is the delay of one * operation
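The effect of associativity on the critical path can be shown with a short Python sketch. The function names are hypothetical, and string concatenation stands in for the * operator because it is associative but not commutative, so any reordering of operands would show up in the result.

```python
from functools import reduce

def linear_reduce(op, items):
    """Chain ((a0*a1)*a2)*...; return (result, depth in operator delays)."""
    return reduce(op, items), len(items) - 1

def tree_reduce(op, items):
    """Balanced tree over adjacent pairs; return (result, depth)."""
    level, depth = list(items), 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])      # odd element passes through unchanged
        level, depth = nxt, depth + 1
    return level[0], depth

def concat(x, y):
    return x + y

args = list("ABCDEFGH")
assert linear_reduce(concat, args) == ("ABCDEFGH", 7)   # (N-1) operator delays
assert tree_reduce(concat, args) == ("ABCDEFGH", 3)     # log2(N) operator delays
```

Both orders give the same result, but the tree needs 3 levels for 8 arguments where the chain needs 7.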
Logarithmic Look-Ahead Adder(2)
[Figure: combining A0..A7 into F with a linear chain of operators (tp ~ N) versus a balanced tree (tp ~ log2(N))]
Logarithmic Look-Ahead Adder(3)
• This property can be applied to the case of an N-bit
adder. Let the * operator (written as a dot, $\bullet$, in the equations) establish the following
relationship between two tuples (g, p) and (g1, p1), where a tuple is an
ordered set of values:
$(g, p) \bullet (g_1, p_1) = (g_2, p_2)$
$(g, p) \bullet (g_1, p_1) = (g + p\,g_1,\; p\,p_1)$
• The * operator is a function that takes two tuples as
input and produces one tuple of two outputs. This
operator is associative but not commutative.
• Two extra functions α and β are defined to access
the elements of the tuple: $g = \alpha(g, p)$, $p = \beta(g, p)$
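The two claimed properties of the operator can be verified by brute force over all bit values. The Python below is a minimal sketch; `dot` is a hypothetical name for the operator defined above, with OR for + and AND for the product.

```python
from itertools import product

def dot(left, right):
    """(g, p) dot (g1, p1) = (g + p*g1, p*p1)."""
    g, p = left
    g1, p1 = right
    return (g | (p & g1), p & p1)

pairs = list(product((0, 1), repeat=2))    # every possible (g, p) tuple

associative = all(dot(dot(x, y), z) == dot(x, dot(y, z))
                  for x, y, z in product(pairs, repeat=3))
commutative = all(dot(x, y) == dot(y, x)
                  for x, y in product(pairs, repeat=2))

assert associative
assert not commutative
assert dot((1, 0), (0, 0)) != dot((0, 0), (1, 0))   # a counterexample
```

Associativity is what allows the tuples to be grouped freely into a tree; the lack of commutativity means the bit order must be preserved while doing so.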
Logarithmic Look-Ahead Adder(4)
Log (PPA) Adder structure
Properties of Brent-Kung Adder
Kogge-Stone Tree (PPF) Adder
[Figure: 16-bit Kogge-Stone prefix tree with inputs g15..g0, p15..p0 and Cin]
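A behavioural sketch of a Kogge-Stone style prefix computation in Python. It is illustrative only: the function name is hypothetical, and folding Cin into the generate signal of bit 0 is a choice made for this sketch, not necessarily how the figure wires it. At each level every bit combines its (G, P) pair with the pair `dist` positions below it, and `dist` doubles from level to level.

```python
def kogge_stone_add(a, b, n=16, c_in=0):
    """n-bit add using a Kogge-Stone prefix network; return (sum, carry_out)."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    big_g, big_p = g[:], p[:]
    big_g[0] |= p[0] & c_in            # assumption: Cin folded into bit 0
    dist = 1
    while dist < n:                    # log2(n) levels: dist = 1, 2, 4, 8
        new_g, new_p = big_g[:], big_p[:]
        for i in range(dist, n):
            # (G, P)[i] = (G, P)[i] dot (G, P)[i - dist]
            new_g[i] = big_g[i] | (big_p[i] & big_g[i - dist])
            new_p[i] = big_p[i] & big_p[i - dist]
        big_g, big_p = new_g, new_p
        dist *= 2
    carry_into = [c_in] + big_g[:-1]   # big_g[i] is the carry out of bit i
    total = 0
    for i in range(n):
        total |= (p[i] ^ carry_into[i]) << i
    return total, big_g[-1]

assert kogge_stone_add(0x1234, 0xFEDC) == ((0x1234 + 0xFEDC) & 0xFFFF, 1)
```

For 16 bits the loop runs four times, matching the log2(N) depth of the tree, at the cost of many parallel dot operations per level.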
Industrial examples
• Specifications for high speed microprocessors are very
demanding and cutting edge.
• A single technique may not achieve the specified results:
combine different strategies
[Figure: 64-bit adder split into a 40-bit carry select adder (MSB side, operands RA(63:24) and RB(63:24)) and a 24-bit differential carry lookahead adder (LSB side, operands RA(23:0) and RB(23:0)), linked by cout23; outputs EA(63:24) and EA(23:0), with TLB, data cache and compare blocks and real_add(40:0). © Dan Stasiak, IBM Rochester, 2001]
Industrial Example
• 64-bit adder, on 200MHz DEC Alpha 21064 RISC
microprocessor
• 5ns cycle using 0.75µm technology
• 4 different techniques for this 64-bit adder
  – Manchester carry chain is used on the 8-bit level
    • Carry chain optimized by tapering down each chain
    stage
  – Carry-lookahead addition (CLA) was used on the least
  significant 32 bits of the adder
  – Conditional-sum addition for the 32 MSB of the adder.
    • Six 8-bit select switches used
  – Conditional-select adder for the most significant 32 bits
  of the 64-bit words
Subtractor
[Figure: subtractor built around an adder (+), with a Sub/add control and a +1 input]
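The figure itself is not recoverable, so the following Python sketch assumes the standard two's-complement arrangement that the labels suggest: the Sub/add signal inverts every bit of B and also supplies the +1 as carry-in, so that A - B is computed as A + (not B) + 1. The function name and the 8-bit default width are hypothetical.

```python
def add_sub(a, b, sub, n=8):
    """n-bit adder/subtractor; return (result mod 2**n, carry_out)."""
    mask = (1 << n) - 1
    b_in = (b ^ (mask if sub else 0)) & mask        # XOR each bit of B with Sub/add
    total = (a & mask) + b_in + (1 if sub else 0)   # Sub/add also drives carry-in
    return total & mask, total >> n

assert add_sub(9, 5, sub=0) == (14, 0)    # 9 + 5
assert add_sub(9, 5, sub=1) == (4, 1)     # 9 - 5
assert add_sub(5, 9, sub=1) == (252, 0)   # 5 - 9 = -4, i.e. 252 in 8-bit two's complement
```

The same adder serves both operations; only a row of XOR gates on the B input and the carry-in connection are added.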
Adding multiple inputs
Other types of Adder
• Carry-save adder
• Conditional sum adder
• Very wide adder using block generate and block
propagate
• (Reference: Weste, Chapter 8)