Adv. Radio Sci., 9, 289–295, 2011 www.adv-radio-sci.net/9/289/2011/ doi:10.

5194/ars-9-289-2011 © Author(s) 2011. CC Attribution 3.0 License.

Advances in Radio Science

A modified prefix operator well suited for area-efficient brick-based adder implementations
I. Rust and T. G. Noll Chair of Electrical Engineering and Computer Systems, RWTH Aachen University, 52062 Aachen, Germany

Abstract. The implementation of integrated circuits becomes more and more difficult in the Ultra-Deep-Submicron regime due to sub-wavelength lithography issues. An approach called Brick-Based Design was recently proposed to eliminate the disadvantages of staying with the classical approach to layout design. Prefix adders are a core component in a wide variety of applications due to their high speed and regular topology. In this paper, a modified prefix operator for prefix adders is proposed which is well suited for brick-style layout implementation and, in addition, offers an increase in efficiency. The proposed operator makes it possible to use a mirror gate for the generation of both generate and propagate signals, which exhibits a forbidden input signal combination. This “forbidden state” causes an increase in power dissipation due to transient short circuit currents. The effect of the forbidden state was quantified as part of a comparison against the classical prefix operator, based on 64-bit Sklansky adders implemented in a 40-nm CMOS technology. The effects of the forbidden state were found to be well acceptable. The implementation of the adder based on the proposed prefix operator reduces the area by 29% while increasing the power by 13% compared to one based on the classical operator.

1

Introduction

The continuous reduction of feature sizes in nanometer scale technologies makes high-yield manufacturing at these technology nodes a growing challenge. This is because classical minimum-space design rules can not guarantee the effective usage of Resolution Enhancement Techniques (RETs) that are necessary to tackle the subwavelength lithography problems (Jhaveri et al., 2007). As a Correspondence to: I. Rust (rust@eecs.rwth-aachen.de)

result, the yield is compromised and the circuit performance is reduced due to variability issues (Tong et al., 2006). A straight-forward way to solve this problem is to use Design-For-Manufacturing (DFM or “Recommended”) rules, which allow to keep the classical approach to layout design, but usually cause a high area penalty. As an alternative approach, it is possible to restrict allowed patterns for layout generation to a small subset highly optimized for lithography (“Restricted Patterning”) and, on the other hand, keep or even relax the corresponding layout design rules (“Restricted Design Rules” with SRAM design rules as a well-known example (Jhaveri et al., 2007; Liebmann et al., 2009)). Using this design style, area savings of 15–25% compared to standard cells using DFM rules are expected (Goering, 2007). Figure 1 shows a layout comparison between minimumspacing (MS) rules and DFM rules for an operator cell used in carry prefix adders, the area saving is as high as 35%. Note: Layout pictures do not comply to real design rules to protect proprietary information. In 2005, Kheterpal et al. (Kheterpal et al., 2005) proposed the Brick-Based Design approach, where a small set of logic primitives is mapped on cells called bricks, which are implemented using the Restricted Patterning approach. To ensure high yield manufacturability, the brick layout must comply to the following rules (Jhaveri et al., 2007, 2009; Tong et al., 2006; Liebmann et al., 2009; Kheterpal et al., 2005; Taylor and Pileggi, 2007): – unidirectional use of poly-, diffusion and metal layers, – no jogs in poly layer, – fixed pitch on poly/metal, – high periodicity of layout structures. Fast prefix adders based on the operator proposed by Brent&Kung (Brent and Kung, 1982), layout shown in Fig. 1, are frequently used in a wide variety of applications due to their

Published by Copernicus Publications on behalf of the URSI Landesausschuss in der Bundesrepublik Deutschland e.V.

Noll: A modified prefix operator well suited for area-efficient brick-based adder i implementations propagate signals to generate the sum bits s . These carrys are fed to the post-processing block which uses them together with the propagate signals to generate the sum bits s i .al comn power e effect parison Sklangy. delay is only increased by 6% and area only increased by 15% compared to the standard-cell-based implementation without layout pattern restrictions despite the much more limited layout pattern set allowed in a brick based approach. Results show that by using 8 different bricks. General building blocks of prefix adders..adv-radio-sci. 2... 290 I. de2 Gate for brick implementation lay is only increased by 6% and area only increased by 15% The implementation of carry-generating gates in carry-lookahead adders using Branch-Based Logic (BBL) is a wellknown and frequently used approach (Neve et al. 2002.. 2005). DFM rules minimum spacing rules ai pre−processing prefix tree b i stage 1 H. describ depend the inc eralized Gi:k = high speed and high regularity compared to carry-lookahead adders. The well known formula ci +1 = g i + t i · ci with g i = a i · bi .and n-channel networks are created separately by forming the gate’s pull-up and pull-down equations in a sum-of-product fashion (Masgonty et al. 1991). inherently. 1997). Radio Sci. straight transistor stack without internal branches. B&K prefix operator: MS vs. 6 concludes this work. 2 the mirror gate used for the implementation of the proposed operator is described and the motivation for its usage for brick-based adder implementations is presented. Different variants differ only in the topology of their prefix tree (Patil et al. ai and bi . In this paper. a Kogge-Stone adder was realized with a design-specific brick set and compared to an implementation based on a commercial 90-nm standard cell library. 1. 1991). (2005). All prefix adders consist of a pre-processing block. 9. the topologies for p.. In Kheterpal et al. a complete brick set for the flexible realization of arbitrary prefix adders is developed and implemented. The inverted form of Eq. Results show that by using 8 different bricks. Rust and T. prefix tree and post-processing block (see Fig. every connection from supply to output and from output to ground forms a simple. 4. 2007). (3) where Gi :j and T i :j denote the group carry generate and group carry propagate signals for weight i down to weight j . (1) ci +1 = g i + t i · ci (4) www. rules g i = ai les can nhancehe sub2007). respectively (Zimmermann. A complete brick set for the implementation of arbitrary adders based on the proposed operator is developed in Sect. 1 shows a layout comparison between minimumpute carry bits ci in every weight i. In Sect. so intermediate parasitic capacitances are reduced to a minimum. based on 64-bit Sklansky adder implementations. the area saving is as high as 35%. The acceptoposed sing the al oper- nals for every weight i based on the corresponding input bits are expected (Richard Goering).net/9/289/2011/ . Equation (1) can be generalized to the form Gi :k = Gi :j + T i :j · Gj −1:k . resp equatio ci+1 = describes the computation of the carry in bit position i + 1 depending on the local generate and transmit condition and the incoming carry in bit position i . t i = a i + bi (2) (1) where group c j . As a result.Sche Arm an p. alized Section with a design-specific brick set and compared to an implementation based on a commercial 90-nm standard cell library. a modified prefix operator. G.and ing the produc As a re output withou tances delay a The we ci+1 = A: −35% with er scale e techpost−processing s i+1 Fig. perforg et al. 2). well suited for implementation in a brick-style fashion is proposed. a Kogge-Stone adder was reuated. In BBL gates. DFM rules. The pre-processing block computes generate (g i ). 289–295. Section 3 presents the derivation of the modified prefix operator.. The paper is organized as follows: in Sect. transmit (t i ) and propagate (pi ) signals for every weight i based on the corresponding input bits a i and bi . In the prefix tree these signals are used to compute carry bits ci in every weight i .. adders Fig. This leads to a smaller delay and possibly to a smaller power dissipation. Masgonty et al. 5 a quantitative comparison between the proposed operator and the one from Brent&Kung is presented. Based on the properties of the brick-realization of the operator. 2011 of operator at the presence of variability is evalIn the (V. These carrys are fed to spacing (MS) rules and DFM rules for an operator cell used the post-processing block which uses them together with the in carry prefix adders. In the prefix tree these signals are used to comFig. The effect of the forbidden state Adv. proposed Kheterpal et al.

Noll: A modified well suited for area-efficient brick-based implementations 291 mon is.t = a + b = 0 qed. i pending on the fanout (post layout simulation/slow condii:j i:j that 1982) i:j propagate conditions p (Brent and G Kung. dual expression (7) earlier. respectively. which strongly suggests the 1 the 0 0 i:j 1 0j−1:k i:j i:j ble 1. 4 shows a comparison of the layout implementation of according to to theground. g ˆ and t – ci+1 =G ∀i 1the mirror 0 1operator. tt· with the absence is input indicated by the dashed boxes in operator examples. 1) (1.Mask Mask layout diagrams is derived based on the presented modified prefix operator. To accommodate this fact. 2). vs.T = 0) is replaced by the allowed state i : j i : j I.G to an and for 0000 1111 0000 1111 names0corresponding to use as for an aoi k: k 1 k 11 k k k 0 00 00 11 0 1 00 11 00 11 G j→ k+ 2 0000 1111 0000 1111 ⇒ g = a · b = 1 . dei:k ˜ i:j k ˜ ˜j − 0000 0000 1111 0000000 1111111 +1 g ti:k ti− G T i:transistor 0 1 00 11 00 11 0 1 00 11 00 11 G + T ·T = 0. 0000 1111 0000 1111 G 0 00 11 00 11 0 1 00 11 00 11 i:j i:j 0000 0000 1111 i:11 j = 11111 i:j = 0 in the version corresponding i:k 0 0 1 0 the G 0 a replacement 0 1 00 00 11 0 1 00 11 00 11 G G T . a short current flows i (ˆ g. 1. ratio If ofthe wiring ight i. i Ta T a brick set for the realization of arbitrary prefix adders G0 G Now g i =mirror a i · bi = 1G and = + bi = (5) the area per gate significantly ( − 29% in 40-nm).0 1) (sharing this 0 take 0 0 0 and 1 only i:i0 − 1 values As mentioned earlier. used as a mirror replacement Fig. 1 The 1 to be 1 of 1 gate. = 1 (R. 3 Noll: Athe modified prefix operator suited for area-efficient brick-based adder implementations T = 0. 0 oai gate. Figg that the t signal G pair (G Ti:i−1 . table of the “• ”-operator in Tab. The fact a BBL-style gate (Neve reprei −1 i:j j−1:k j−1:k drick. Ifstage the forRust and T.operator t. T. 1 ˆ) = (g +A: ˆ).t = and 1).L. 2002). Figure 3a shows the for the modified operator. 14 of the layout of The “•”-operator introduced by mirror Brent and Kung (R. 1958) instead of the more comt i . which inherent in the c)topologies. propagate conditions p (R. (c) 7-transistor complex gates. accommodate this fact. 3 c) shows Gnd Gnd Gnd 1 0 1 00 11 00 11 0 1 00 11 00 11 wig and G. the only dif. respectively. : of the layouts (MS rules) rules). To show the effect of supply contact ”. t . nodes the gate arefor located directly Fig. Brent supply nodes adjacent gates. T = 1)i:which regarding carry cell height of i :j = 1. (6) using an or-and-invert gate (oai). cell(s) is j−1:k i:j i:j i:j 0 1 cause 1a 1 conditions 1 ofshows the fact that the Gcomparison G Gimplementation G Fig. Taking a closer look at the mirror gate T G G pair delay of the mirror gate is approximately 20% smaller i:k i:k i:k In modern deep-sub-micron CMOS technologies for which a arked in gray. 0)1 . CMOS In contrast.mon Brent and H. aoi gate. Fig. thethe re-mirror one for the modified operator.g +schematic t·t (8) is indicated by the dashed boxes in the the operator examples.33µm. 5.t · t ). which is inherent in BBL mirror gate version.1111 1111 0000 −29% A comparison using a 90-nm technology shows that theGnd pair 0 11 1 00 0000 00 11 0 00 11 00 11 ferent versions of complex aoi and oai gates. denoted as 1 a modified ci + = g i · (t i +complex ciprefix ) (6) Implemented in a state-of-the-art 40-nm technology.ti ))p-channel means input output pairs in ofthe BBL mir0 1 that 0 1 and 0 i 0 0 (g1 1 1topologies. Mask layout diagrams i i i i where (g = 1. and due to thefor shown Fig.following t ·g ˆ. Kung. a short current 0000000 1111111 plement equation 6G using an or-and-invert gate (oai). 4. Rust and T. This is a prepreprocessing (see Fig.G = 1 in the version corresponding to an aoi gate T 0 1 00 11 00 11 0 1 00 11 00 11 forbidden state (Grespectively. 1982) The first three properties can be adder proved in analogy to “•”. which is not the –hibits “ be 1 1 the0case for 1 the aoi gate. dependfor the absence of additional branches in the nor p-channel G G T oai G G Weinberger and J.G.In the w marked in gray. in To show the effect of supply contact The main disadvantage of the described mirror gate is the s introduced and defined as follows mparing the truth table of the modified operator with other sulting cell pitch of the proposed version is only 0.t . odified “ supply ”. i i from supply Taking closer look = 0.t and i i i i 0 0 isgates 1 0 BBL mirror gate version. Noll: A modified prefix operator suited for area-efficient brick-based adder implementations sharing. Fig. imstate (g Implemented in “ a state-of-the-art 40-nm technology. Fig. 0The 1 fact 0shows gate 0 at any 0 position 0 inside uisite to use the mirror a prefix ure 3c delay-enhanced versions which trade an addiis not present in the signal pairs g i . = 1.T Fig. G generated in the G Modified ⇒ G + T · G −1: = last 1 11 i:j i:j G Gthe inputs contradiction as follows i:j i:j -processing block creates the forbidden state at = diffusion can be ˆ implemented using the T mirror gate by interchanging T ˜ i:j + T ˜i:j · T ˜j −1:k = 0 ˆ) = (g + t (ˆ g .48µm at a for = 1 . respectively nly values (00 . the operator cell contains two mirror gates on top of each ploit this fact. • • : comparison comparison (MS 1 = 1 (?).Gi :j = 0 in the version corresponding to an i− 1 1 is equivalent i i i − 1 i − i : i − 1 i : i − 1 and for T depicting two alternative implementations for an oai and an e is g the value T t in the marked row. 5. 0 1 00 11 00 11 0 1111 1 00 11 11 00 11 i ˜must ˜iassociative ” be – G =1 . aoi gate for Gi Fig. a modified prefix operator.T = 0) the i= j−1:k i:j i:j i=0 = supply G signal pairs generated T i:j 1 − 1:k 11 0000000 1111111 the “•”-operator on the i:j in the T ˜ i:j + ˜ ˜ jcombination 0 00 11 00 0 1 00 11 00 11 den flows from i:j G T ·G = 1 is applied. 3c an aoi or an oai gate.G = 1 in the version corresponding to an aoi gate uth table of in corresponding Tab. 2011 i:j i:j i:k j−1:k i:j i:j i:j j−1:k i:j 00 11 i:j j−1:k i:k Gproperties :k T T. Fig.0 1) (sharing this 0 take 0 the 0 0 0 and 0 1 0 1 tional transistor for the absence of additional branches in the e. a short flows from i:j A input comparison using is a 90-nm technology shows that the erator cell which is implemented as efficiently as current possible. identities g i = Ga and ti = T i:iat ). denoted as is derived on the presented modified prefix operator. i:i ence is theg value of T theutilizing marked row. • • comparison defined as 1 ” must 1 following 0 associative 0 1in a symmetrical 0 at the cell properties boundary fashion.t =1 1) . 1). Brent and H. ·g ˆGate . defined as follows a mirror should be kept at a minimum.g t + t ):. (5) motivates usage of transmit G because G the G at conditions oai be kept in atto aaminimum. it is obvious that i:j den combination applied. using minimum-sized transistors. Normally it forbidden would be necessary to gate. 3). preventing the proves are omitted. −29% ’ ’ This 1 makes it possible to share the (g. cells are cascaded horizontally. 2). the seven-transistor Tdelay T modern deep-sub-micron technologies for which a preprocessing block (neglecting effects for now). Noll: A Modified Prefix Operator Well Suited Brick-Based Adder Implementations equation 5 motivates the usage of transmit conditions ti (A. = 1. 0 Fig. deG T G T G brick-based design style t and Smith. because the forbidden inputGstate k kG k (seeG branches complex gates.well Figure shows transistor schematics. the rec) 7-transistor gates g .cells i : k i : k ˜ ˜= ˜ =˜0 = contact j −1 1: j −1:k G .. 289–295. t with (7) of the input G The main disadvantage of the described mirror gate is the aoi mirror gate he operator (Tab. Brent 1. Normally it appears would be at necessary to imhe forbidden state . 3 b) and Fig.t i . T ing cell rows are not only very dense. 0000 1111 0000 1111 i:j i:j j−1:k i:j j−1:k 0000000 1111111 oai picting two G alternative implementations for an oai aoi 0 00 11 00 11 0 00 11 00 11 G G and an Vdd 1 Vdd Vdd 0000 0000 1111 1: k i:j k1 j − k 1111 i:1: jk 0 1 00 11 00 11 0 1 00 11 00 11 ˜ j −k ˜ 0 0 0 0 iT 0 T +1 +1 i :i−1 :i−10 0000 1111 0000 1111 ⇒ G = 1 . the bit slice width is irrelevant for the process of carry generation due supply to ground. 3 c) depict dif0000 1111 0000 1111 (g a) (c) 0 00 00 11 0 1 00 11 00 g t t G (b) T 0000 1111 0000 ⇒ g k+1 + tk+1 1 · g k11 = 1. G. which Fig.t ˆ ·t (7) i+1 i ˜ 1 1 1 1 1 1 ∀i the 5 shows the – fact “= ”G must be idempotent 4Fig.t i and g i . T :i−mirror ) can gate with oai gate. a) 0) shows the mirror gatei:jwith signal g +t 1 ·t = 0 . 1tion 1 the six-transistor and Kung.k T i : i − 1 i : i − 1 existence of a so-called forbidden input state.to an g a closer look at Gi:i−1 and T i:i−1 . comparison: 3 Modified prefix operator t) (ˆ g. w the (0 . 1). it is always possible to use the11 mirror 0 1 0 aoi and0oai gates.p i ⊕ bi = tions). two operator earlier. The resultlayout of the remaining cells to the layout of an op1 1 0 1 i:j 1 i:k 1 i:k i:k aoi adaption G G j−1:k G j−1:k is derived based on the presented modified prefix operator. (0. 1982). t fact gofthat G pair Ti:i− 1 ˜ iwhich 1 ˜ neration. Be1 be implemented 1 1 1 an and-or-invert 1 4 It Derivation of brick set i:i the operator is evident that the implementation of according to the identities g i = G and ti = T i:i ).48µm at a (eq. s” introduced and defined as follows 5 showsbased the must be idempotent 4Fig. topologies. the ratio of wiring i−1 = 0 . The fact represented in 0 1 00 11 00 11 0 1 00 11 00 11 delay of the mirror gate is approximately 20% smaller comdelay-enhanced versions which trade an additional transistor I. T = 0 i:k = 0 . utilizing the absence g i and g i i with ti . happens for following operator cells (Tab. Derivation ofcomparison brick set of “ ” implemented using mirthat carry is already generated because of Fig. aring the truth table of the modified operator with plement Eq. which Moreover. www. 1982) for any Boolean variables g . b) c) due other as shown in Fig. The last property can be proved by efined these I. i:i−1 i:i−1 sulting cell pitch of the proposed version is only 0. :: comparison of layouts (MS (MS rules) rules).33µm. Gate schematic comparison: 4 Derivation of brick set 0 00 1 0the1absence 1 1 of additional 0 1 0 1 in the n. 5. which strongly suggests the ants of aoi and oai can also share supply contacts over cell 1. thestate. 1958) instead of the more coming on the fanout (post layout simulation/slow conditions). denoted the as i = 1. ˆ) = (gFig. connected only to those signal pairs (wired to the operators i i i i ˆ) = (g + ˆ bidden input combination isthe applied. Radio Sci. = 1 (R. (g = a 0 ) is possible. a imodified prefix operator.adv-radio-sci. (Eq. The the signal (G . 0 10 1 1 1 0 0 g . 0 1 00 11 00 11 0 1 00 11 00 11 0000 1111 0000 1111 1 gate. is introduced andgate. 1) (1 . for Area-Efficient j−1:k i:j i:j i:j j−1:k i:j pared to aTcascade of 6-transistor aoi and gates. 1. 0 This both do contain the forbidden 1ror 0 version. is always possible to use mirror sharing. reducing crucial for the adderbetween efficiency. T = 0 must never occur in combination a nand-gate producing T i . ˆ). G. Rust. j−1:k i:j :i− i:i−1 Ti:j mirror gate which channel part. as Rust. I.net/9/289/2011/ ror and “•” using seven-transistor Adv. layout diagrams. 1982). T =0 00= metal 11 i +1 i i i 00 11 i:k i:k i:k i:k · (t (6) = g on i:j+ c ) Gc G pairs j−1:k G j−1:k these ned as “•”-operator proves The property can be proved by 00 aoi :j are i:j ˜ j Prefix k Operator Tthe signal ng the G ˜ i3 ˜omitted. because forbidden input state a) mirror gate b) 6-transistor complex gates 4. i:j i:j is advantageous.p = a ⊕ b = 0) is possible.or p-channel j−1:k i:j i:j i:j j−1:k i:j preprocessing Fig. symmetric also is capacitance to gate capacitance is ever increasing (VeenG 1 andaT T king a closer look at Giyielding . 9. it is j = 1. to be used as a replacement for 0 1 0 not 1gate 1 0 0 1 state.ti as shown in Fig. g 1i both not contain the forbidden state. T i − 1 i − 1 i : i − 1 i : i − 1 0 t i do 0depict 1 1 0 0 b. using minimum-sized transistors.of the operator cell contains twoin mirror gatescorresponding on top of each exploit this fact. 5) and the inputs of all prefix operators in this are can be implemented using the mirror gate by interchanging existence ofcells a so-called forbidden input state. t + t 3. 2). respectively is not present the signal pairs gi . and Fig. ⇒ 0 1 00 11 00 11 0 1 00 11 00 11 G j→ k+ 2 : input 0000000 1111111 0 1 00 11 00 11 0 1 00 11 00 11 i:j j−1:k j−1:k 0000 1111 0000 1111 ocessingiblock creates the forbidden state at the inputs 0000000 1111111 supply to ground. Kung. 1982) is. for any Boolean variables g .well cell height of 1. ’ ( ’g. 3 a).tT =the ak signal + bk = 0 qed.t ) ) means that input and output pairs of 0 0 0 1 0 0 i i gate in the first stage of the prefix graph directly after the 3b and Fig. because the forbidden input state different versions of complex aoi and oai gates. the only dif. 5) and the inputs of all prefix operators in this stage are ty0with . (b) 6-transistor Fig. 5. Brent and Kung. Taking a closer look at the mirror gate T G G 0 1 00 11 00 11 0 1 00 11 00 11 i −shows 1 1 corresponding i:i−1 i−1 i:k i:j k+1 1:k k 1111 Fig. Figure 1 ˜i:i−1 = 1) which is equivalent regarding carry = 1. but none would be transmitted from weight i − 1. so d H. exhibits the following properties 1 the 0 0 0 0 ite0to 1 use mirror gate at any position inside a prefix 4 shows a comparison of the layout implementaFigure 1 0 1 1 1 ’ ”-operator ’ A: −29% by Brent ’ and ’ Kung (Brent The “ introduced 0 11 ˜ 1 1 gate 0 and 1 1 i 0 of variant of the aoi ˆ. but also highly regular G G 3 Modified Prefix Operator 1 1 can never 1 1 simultaneously 1 i:j 1 at the outputs erator cell which is implemented as efficiently as possible.5. So the forbidden state Application of “ • ”-operator to output of preprocessing. Application of “ • ”-operator to output of preprocessing. it becomes clear that this G G G 0000 1111 0000 1111 0000000 1111111 ⇒ g + t · g = 1 0 1 00 11 00 11 0 1 00 11 00 11 shown in Fig. 7-transistor complex Now a brick set for the realization of arbitrary adders 0 11 1 0 0 0 1 0 1 0 1 (eq. 5.Gi:j = 0 G obvious G capacitance T i: the version and 4.t) ˆ t t · g. connected only to those signal pairs (wired to the operators 0 1 1 can 0 1 0 1 using gate (aoi). 00 11 contradiction as follows j−1:k i:j i:j i:j 00= poly 11 G G G G 00 11 the ˆ) = (gMoreover. 2002. G. Hellner. Derivation ofcomparison brick set of “ ” implemented using miract that the carry is already generated because of i . Therefore.5) 4. 0). 3. 0 1 00 11 00 11 0 1 00 11 00 11 i 0 i 0 i −1oai i−1 1 i:i−1 i :i − 1 Area: or an respectively. i:i−1 i− 1 “•”-operator ˜i:the 1. g i 3 the schematics. 1 Application of “•”-operator to output of preprocessing j−1:k gate andt ithe six-transistor variant of the aoi gate. the dual expression gate in the first stage of the prefix graph directly after the shareable supply contacts of the mirror gate depicted in Fig. which truth I. but not without poly jogs. where to for gate capacitance is ever increasing (Veen. 3= appears at the gate. two operator are cascaded horizontally.T = 0) is replaced by the allowed state perty with ( g . t +t·g ˆ.i−1 ginand (8) of the . 0 is a preaccording to the identities g i = Gi :i and t i = T i :i ). and the inputs of all prefix operators prefix in this stage are Fig. 1 inherent nor which is connected only to those signal pairs (wired to the operators e modified operator. respectively is not in pairs g . prefix Layouts are Now a brick set with for the realization of arbitrary adders exploit this fact.. 1982) is. g ˜ ˜iT 00 11 three can beG proved in analogy to “•”.t · t). “ ”. it of becomes clear aoi that this happens the fact that the carry isi:j already generated because of j−1:k should i:j j−1:k i:j i (Weinberger compared cascade 6-transistor and oai gates. it becomes clear that this happens gate shown in Fig. Kung. t twith ·g ˆ. ˆ 00 11 t) • (ˆ g. preprocessing (see Fig. If the forbid⇒ G = 1 . i the prefix Rust and T. 3 c) shows As mentioned 0000 1111 0000 1111 −29% 0 00 11 00 11 0 00 11 00 ferent 0 versions0of complex Gnd 1 Gnd Gnd 1 Gnd 0 1 00 00 11 0 1 00 11 00 11 0 00 1 0 0 0 1 0 0 0 1 k+1 gate k+1 k11 in the first stage of the prefix graph directly after the 0 1 00 11 00 11 0 1 00 11 00 11 g + t · t = 0 . vs. the bit slice width is irrelevant a) for the process of carry generation oaiTo gate.G. vs. existence of a so-called forbidden input forbidlayout adaption of the remaining cells to the layout of an opboundaries. T G a)for mirror gate b) 6-transistor complex gates ⇒ gk = a · b present = 1. 3 a). operator.t i and i . 3. T ˜ ˜i:i−1 ) can ation.G = 1 in the version corresponding to an aoi gate T shareable supply contacts of the mirror gate depicted in Fig. for any variables g. but none would be transmitted from weight i − 1 . 2000). preventing usage drick. Application of “ • ”-operator to output of preprocessing n-channel topology of the aoi gate can also be used in the pbrick-based design style is advantageous. A Modified Prefix Operator Well Suited for Area-Efficient Brick-Based Adder Implementations usageNoll: of a mirror gate. (0 . 3a. gates vs. Kung. So the state the mirror gate the six-transistor variant of the aoi The • ”-operator introduced by Brent and Kung (R. Smith. hence. Fig. it is always possible to use the mirror signal names corresponding to the use as a replacement for = 1. T. 4. 2000). G et al. T = 0 state (( gG . t ˆ and Gj−1:k The firstG G Boolean G G T G so G = 1. because a carry would be generated at The main disadvantage of the described mirror gate is the crucial for the adder efficiency. a carry would be generated sented in Eq. G3). delay-enhanced versions which trade an additional transistor Fig. g ˆH. Gate schematic comparison: (a) mirror gate. . happen of the adders i:j G G that the implementation of the operator cell(s) is varii:j i:j It is evident without any poly jog. and due to the the fact that the carry is already generated because of As mentioned it Therefore.

. This is· T a pre. sharing. a modified operator. “ ”. implementation. the only difthe truth table of the “ in Tab. and due to shareable supithe i . i: j i:j ˜ j −using 1:k Fig. which strongly suggests thecellsprefix ”. Therefore. Kung. The last property be proved by is one for the modified operator. G.˜ = 1. the only dif˜using ˜ i :operator a modified prefix denoted as which a introduced brick-based design style is advantageous. (G requisite = 1 . by the dashed boxes in the operator examples. Taking a contains closer look at G and T on .33µm. To show the effect of supply contact – c = G ∀ i ⇒ g = a · b = 1 . which ference is the value T value in the marked row. using minimum-sized tree. which strongly suggests ⇒ g = a the ·to b the = 1layout . 5.48 µm at a cell height of 1. which G ˆ) = ˆ).48 µm at a cell height of the layout adaption ofcarry the remaining cells to the layout an drick. 4. because a carry would be generated at crucial sharing this to the fact that the carry is already generated because other as shown in Fig. in aspacing state-of-the-art In0 modern deep-sub-micron CMOS technologies for Implemented more limited layout features available in the brick approach. Implemented the shareable supply contacts of the mirror gate depicted in 1. So the forbidden denoted as tion based on DFM rules (Fig. Noll: Aset modified prefix operator well suited for area-efficient brick-based adder implementations h table of truth the of modified operator with 4 Derivation of brick which design style is a advantageous. 1982).seven-transistor the area saving is as haring this The ply contacts of the mirror gate depicted into Fig. Advantages was reported in ? .g ˆ version + tadder ·t 1. it is obvious that his is aTaking pre. which strongly suggests of wiring capacitance to gate capacitance is ever increasing w only take the values (0. 1) (sharing this 0 0 0 property with ( g . Noll: A 1. 5 shows the comparison of ”gate implemented using 4 Derivation of brick set is derived based on theof presented modified crucial for the adder efficiency. The last property can of be each proved by as are shown for minimum spacing (MS) rules. is introduced and defined as follows at G and T . T = 0 The fact that the G can the “ •”-operator the signal generated in) the ˜ i:j + T ˜i:j · G ˜ j −1:k = 1 g i+ t tgeneration. whichof is indicated ⇒ gk = a ·b =the 1.t i i :i −1input i:i− 1 capacitance gate is appears ever increasing (Veenbvious that i ator cells (Tab. thetwo bit mirror slice prefix width should i: i − 1 i− 1) ing operator. for G in combination with a nand-gate producing T i . t ipairs ofgates cell two mirror gates of each other as shown ror and “•” using the seven-transistor aoi gate for G remaining :i−1 It is evident that the implementation of the operator cell(s) layout adaption of the cells to the layout of an op˜ erator cell i :i − 1 iscontains T ) igeneration k+ 1 which k +1 kis implemented as efficiently as possible. (?with ).33µm. should be kept at a minimum. So the forbidden state to the fact that the is already generated because of which a brick-based design style is advantageous. the operator contains two mirror gates on top usof (8) g t g ttechnology. 5. • comparison rules) ) means that input and output pairs of 1 Figure 0 0 1 layouts 1(MS 0 0 This is Tab. bk = 0 qed. the only dif) is derived based on the presented modified prefix operator. follows Figure ?? shows the comparison of “ ” implemented usIt is evident that the implementation of the operator cell(s) g +t · t = 0. vs. the only difoperator cell which is implemented as efficiently as possible. 1. two ˜ i :i −1 = ˜ i :i −1 = (G 1 . gray. be – used as a for occur and due to the shareable i :i − was reported “ •to ” exhibits the following properties ˜ = ˜replacement G 1. www. This is a in Fig.= 1. Advansharing. that a Fig. (0 . 1 0 0 0 0 0 11 1 01 0 k k 1 the1adder 1 the ’0•qed.adv-radio-sci. using minimum-sized cement for ˜is ˜j ⇒ G = 1 . which is table equivalent carry is derived based on thecontains presented modified operator.t ) • (ˆ g . T only ) can Fig. ˆ t ) + tFig. the following additional must never occur in combination with a nand-gate producing T . Lay0 0 0 1 0 1 0 1 saving of 29% clearly over-compensates the area increase irror gate at any position inside a prefix 0. is derived based on the presented modified prefix operator. 1 both do not contain the forbidden state. • : comparison (MS rules) h k1 +1 k +1 (G = 1 . the show effect of supply contact sharing. ” implemented usig. 4 vs. = 0 must never occur inseven-transistor combination agate nand-gate producing T i . “ the ”. Tocell show the effect of supply contacttechnology. ? ).0 1) (sharing this 0 0 00with 0 0 property (0 g . T ˜j −1:k = 0 0 not 1 1 set 0 0+ Tab. The fact that the signal ( G . :’comparison comparison of of layouts layouts (MSrules) rules). carry which to ( i capacitance gate capacitance is ever (Veending 1 1 1 i :1 i− 1this ? ). To the effect of supply contact “ ”. the operator iA: i . t ) = ( g + t · g ˆ .= 0with 0 0 1 to 0 capacitance 0 i . Noll: A Modified Prefix Well Suited for Brick-Based Adder Implementations i:j j ˜ j −1:k ˜other ˜i:shown ⇒ G + T · G = 1 as in Fig. accommodate this fact.48µm at a ror gates and “ • ” using the seven-transistor aoi gate for Gi 1 1 pre-processing 1 1 0 creates 1 1 block 0 the forbidden 1 0 at the inputs i :j j −1:k ˜a ˜ i :j ˜for state ⇒ G + T · G = 1 fix operator well suited for area-efficient brick-based adder implementations Now brick set the realization of arbitrary prefix i i i e0 only dif˜ ˜ brick-based design style advantageous. i :j i :j ˜ j − 1:prefix k of “adders Fig.t1i )) means that fact. 0 ) . G T+ 0t 1 1 on 1 to 1 →as 1 The operator. 1982) is. and due to fied operator. g t i− G Ti:ipairs i −1 ˜ i:i−1 ⇒G ˜ e generation. be used as a contact replacement with (g )to ) with means input and output pairs of i:i −1 property in a state-of-the-art 40-nm the resulting pitch in a that nand-gate producing T state . i− 1 ⇒this G fact. 1. outs are shown for minimum (MS) 40-nm rules.t iin in Fig. Therefore. This is a prei:i−1 :drick. so and H.dashed Noll: A modified prefix operator suited for first area-efficient brick-based truth table of the “ • ”-operator invariables Tab. as 54%. t ) = ( g + t · g ˆ . but none would be transmitted from weight i − 1 . implemented using mirare for minimum spacing Advan1 1 0 1 1”to 1 (MS) Derivation of brick set transistors. Now a brick set for the realization of arbitrary prefix adders 1.g + t · t ) . Adv. cell height of 1.5. The last property can be proved by Implemented athe state-of-the-art 40-nm technology. Implemented j− 1: k fact j− 1the :be kdashed – “1 ”. vs.T i :i −1 = 0area ) is (8) replaced by the are allowed state ˆ)at g ˆ. put pairstwo ofGoperator = 1 (Brent andcascaded Kung.t) g. Layouts are ⇒ g + t · g = 1 . so is. k k k k k k i+1 iproperties x crucial for adder efficiency. and due to the one for modified operator. using minimum-sized transistors. Kung. and due to the carry is generated because of shareable supply contacts of the mirror gate depicted in = 1 . =0) 1.net/9/289/2011/ i the brick approach.by s indicated dashed boxes in the examples. • : comparison layouts (MS rules). the rei :k i: k 0 –which 1 0 1 Fig. The fact that the signal pair .T = 0 must never occur Now a brick for the realization of arbitrary adders It is evident that the implementation of the operator cell(s) is To exploit this fact. T = 0 “ ” must bethe associative ’ ”-operator ’ CMOS A: −29% ’which ’ Using “ on the signal pairs generated in brick the set 4 Derivation of erator with den state n modern deep-sub-micron technologies for a which a brick-based design style is advantageous. 9. the 1). ˜ ˜ ˜ n inside of 29% clearly over-compensates area increase of 15% cells are cascaded horizontally. Implemented in a 40-nm kept a minimum.T =0 )1 appears at the jmodified −“ 1:• k ” using j − 1:k (?). the ratio To build a complete prefix adder. but none be “ transmitted from weight i mir− Fig.˜is layout adaption remaining cells to the layout cell height of using transistors. T = 0 must never can kept at a minimum. Application of of “would •”-operator to output of preprocessing j → k +2 : hould be kept at ia minimum. a modified prefix operator. and due to the because of j k 2: gmodified g t“ ”. Fig.T = 0 as 54%. 4. 289–295. ( 0 . i:i i:the i−1 allowed state 1 i1 ˜ ˜is ˜− = is replaced by (G 1It . 0 0 0 0 indicated 1 0 0by the 1dashed 1the Fig. output of preprocessing.t ·it ) . The area Figure 5 shows the comparison of “ ” implemented using i i or. Layouts i +1 i i : k i : k ˜ 1 1 1 1 1 1 1 1 ’ ’ −29% ’ ’ first three properties can be proved in analogy “ ”. wed state of wiring capacitance to analogy gate capacitance ever increasing e first three properties can be proved in to1rules.t = a + b 0 qed.33 µm. It evident that the implementation ofwhich the operator is proves are omitted.33 µm. • :: comparison of layouts (MS rules).1. ??operator. r1 with the the layout adaption of remaining of an crucial for efficiency. i tally. (8) the layout adaption of the remaining cells to the layout of an is indicated by the dashed boxes in the operator examples. using minimum-sizedtransistors. the difsulting cell pitch of the proposed version is only 0. the ratio of1wiring combination with a nand-gate T . T. Fig. Tocombination show effect of supply sharing.48µm a ocessing i :i −1“•”-operator i:the i−proposed 1 adder ble 1. 0). Rust and T. To show the effect of supply contact sharing. To accommodate this fact.is onlytransistors.33 µm. 1 ) and ( 1 . 5outs shows the of “”. using minimum-sized transistors. mirror “ •(” the seven-transistor aoi gate ’ and ’pair A: −29% ’ ratio ’ 4. Brent H. denoted asfollows transistors. 5. 1) and (1 . cell contains two mirror gates on top of each other as shown i efix tree. T ˜ =0 sharing. 2000). which cell height of 1. two operator cells are cascaded horizontally. Fig. transistors.48 µm at a cell height of i : i − 1 i : i − 1 width should be kept at a minimum. technolAdvanFig. Compared in the marked row.t . (7) i: − iFig. 1982). prefix Layouts are the G =1 . ?? . which Now a brick set for the realization of arbitrary adders is one the modified operator. i− i −1 of i :regarding i −1 ik :i −carry 1 5. denoted as height i a smaller area than one based on “ • ” in combination with k + 1 k + 1 k hich a brick-based design style is advantageous.t ) (ˆ g . Layouts are 1 1 0 1 1 1 ’ ” ’ using A: −29% ’ gate ’ high mirror gates andrules “ aoi for saving of 29% clearly over-compensates the area increase ow a properties brick set for the realization of arbitrary prefix adders wing tion based on DFM (Fig. for any Boolean g . G. is foras the adder efficiency. 1).in ?. the This area saving is as high Table 1. which by“• the dashed boxes the operator examples. vs. Therefore. prefix operator. This means that a brickperator cell which is implemented as efficiently as possible.T = 0 ) is replaced by the allowed state G for = 1 (R. of wiring to gate capacitance is ever increasing cement Now capacitance a brick set for the realization of arbitrary prefix adders Fig. ever increasing cell height of 1. (G operator which isstate-of-the-art implemented as efficiently as possible. Fig. ˜ ˜ It is evident that the implementation of the operator cell(s) is i : i − 1 idden state ⇒ G = 1 . which suggests owed state Table 1. T ) can In modern deep-sub-micron CMOS technologies for k k k k k k enerated at now only take the values (0 . Laylues (0 . two operator cells are cascaded horizontally. haring. 1 not contain the forbidden a prew.t ) ) means that input and output pairs of 0 i icell 0 approach 1 is implemented 0 0as efficiently operator which asthe possible.0 1) (sharing this ’ ’ −29% ’ ’on r with the the layout adaption of the remaining the layout of an sharing. Radio Sci. 1 ) (sharing this T. To (g accommodate this the bitoutput slice width be contains two mirror gates on top of each other as shown operty .t (ˆ gthe . vs. cell(s) k+1 k 0based 1 1 1 1 Itthe is evident the implementation of the operator + t’k+1 The modified operator. modified prefix operator well The suited for three area-efficient brick-based adderimplementations implementations properties can beadder proved in analogy to “•”. Rust and T. the re4 Derivation of brick set the proposed version is only 0. The area 1 1. 1). T = which is equivalent i : i : k g t g t G T :i−1 ˜ i:i −1 as possible. denoted as = 1. the bit slice width or the process of carry generation due should be kept at a minimum. To accommodate the bit slice eration due 4 Derivation of brick set (G = 1. vs. Fig. the ratio ulting cell pitch of proposed 0. • comparison (MS rules) is one for the modified operator. whichtechnologies strongly suggests In modern deep-sub-micron CMOS for which a row marked in 5.33µm. placement in state-of-the-art 40-nm technology. capacitance to gate capacitance is ever increasing (VeenTo exploit this fact.g + t · t ) . two operator cells are cascaded horizon0 0G 0 0 for “ ” exhibits the properties ply contact a smaller area than one “•cells ” in to combination with i 1. for any Boolean variables g . To accommodate this fact. t ) = ( g + t · g ˆ .by must idempotent 4Fig. 2000). defined asisin these ference isof of T in the marked row. ” implemented using mir. The fact that signal G . is a preoperator cells are cascaded horizontally. i:i− i:a i− 1 should kept at Therefore. Rust. crucial for efficiency. 5 shows comparison of ( “G1. • :: comparison of layouts (MS rules). input and pairs should of cell i :i − =1 . • comparison (MS rules) – “ ” must be idempotent of the following operator cells (Table 1). Therefore. is and defined as follows It is evident that the implementation of the operator cell(s) neration. the ratio 0 1 eight iof − 1 . To show the effect of supply contact k k k n inside athe operator cells are cascaded horizontally. layout of an op“ • ”-operator to output of preprocessing rding carry i : i − 1 i : i − 1 i hareable supply contacts of the mirror gate depicted in Fig. (8) adders This iscan a= in a nand-gate producing T . the operator cell mirror gates on top of each T (g irrelevant for the the process of carry due ˆcombination ˆ . To fact. to the classical implementai : j i : j j − 1 : k It is evident that the implementation of the operator cell(s) www. · g. vs. Application of of ˆ ˆ)”-operator ( G = 1 .adv-radio-sci. well g ˆ and I.33 µm. which drick. To show the effect of supply contact d defined as follows sharing. 5. • comparison (MS rules) 1 1 1 Compared these proves are omitted.i 0) . cell height of 1. The last regularity property can be proved by or gates and “ • ” using the seven-transistor aoi gate for G 1 1 1 1 0 0 1 0 tages regarding area and are evident. which G . Table 1following both do not contain the forbidden state. Application of “ •”-operator to output preprocessing. 1 0 0 0 1 0 0 0 0 i+1 i – “ 1 ˜ 1 1 1 1 1 ” remaining must bewhich associative – c dif= G operator ∀ i the r only cell is implemented as efficiently as possible. two Fig. To of accommodate this fact. i : i − 1 i : i − 1 Figure ?? shows the comparison of “ ” implemented us˜ shareable supply contacts of the mirror gate depicted in Fig. which is crucial for the adder efficiency. the bit slice width be the Brick-Based Design methodology which i ese proves are omitted. Implemented i:i− 1in i : i − 1 It is evident that the implementation of the operator cell(s) brick-based design style is advantageous. The fact that the signal pair ( G . Tii:i =1 1) the which is equivalent regarding carry of an op. 1). 5. 5. sup.33µm. 5. to0cell be used as prefix a replacement s derived on the presented modified operator.A: Tfollowing can only take the values (00 . the ratio i in combination To build a complete prefix adder.T = 0) at the ror gates and “ • ” using the seven-transistor aoi gate for G j − 1: k j − 1: k 0 1 0 0 0 ˜ ˜ requisite to use mirror gate at any position inside a prefix n 0 0 1 1 0 0 Tab. vs. Typically the area of prefix adders is dominated by t pairs of 1 cell contains two mirror gates on top other shown Fig.based 1) and (1. • : comparison of layouts (MS rules). 4. 5.top it is obvious that ⇒ g k +1 + t k +1 · g k = 1. vs. 1) and (1 . 0) . technology. two operator cells are cascaded horizontally. vs. 4. using 05. using minimum-sized transistors.T = mirror 0) is replaced by the allowed state •”-operator to output ofuse preprocessing. (0 . To show the effect of suptof G Tversion the proposed only 0. erator cell which– is implemented as efficiently as possible. the operator Fig. which examples. In modern deep-sub-micron CMOS technologies for Implemented in a state-of-the-art 40-nm technology. would be transmitted from weight i − 1 . preDerivation brick set ˆ ˆ ( g. a brick set for realization of arbitrary prefix erator cell which is implemented efficiently as possible. 5. so generation.tstyle = a kis +advantageous. 5. be used a replacement for ⇒ g k+1 + tk+1 · g k = 1. and due to the shareable supble do not contain the forbidden state. the operator cell gates on top be of each To this a modified operator.G. • :: comparison of layouts (MS rules). 5. using minimum-sized is derived based on the presented modified prefix operator. 5. theto resulting cellmodern pitch 2000). the operator i : i − 1 i : i − 1 erequisite to use the mirror gate at any position inside a operator cells are cascaded horizontally. • comparison (MS Derivation of brick set w.) 0) .net/9/1/2011/ their prefix graph implementation.33µm.33µm.T =1 1) that which is equivalent regarding evident the implementation of the carry operator cell(s) – 0) ci + = G ∀= itheir tree. ˆ the t (g + t · minimum-sized g. A: −29% ’ ’ (8) 0 0 0 0 k +1 k +1 k 0 0 0 1 0 0 1 that 1 is 01 ˜ 0 the 1 0 g cell(s) + tis · t =0 .the retages regarding and regularity evident.of two operator contains mirror gates on Operator top of each (g. which is indicated by the dashed boxes in the operator –only “ ”difmust be idempotent i +1 the classical to layout generation despite much ˜ – c = G ∀ i ?? do shows the comparison of “ state. Implemented now only take the values ( 0 . the ratio ecause table of the “the • ”-operator in Tab. G +T ·G =1 ˜ i:j ˜i:j ˜j −1:k = 0 e now only take the values (0 . the operator weight . Kung.t ( g4 + t · cell g ˆ . 4. Fig. so G in combination with a nand-gate producing T ˜ of = ˜using – c = G ∀i 15% the Brick-Based Design methodology derived based on the presented modified prefix operator. ?? ⇒ G T G = 1 This a . 5 shows the comparison of “ ” implemented using miri :ialready −1 = 1 ( gates and “•” using the seven-transistor aoi gate for Gi mpotent placement 4 Derivation of brickcell setror in a state-of-the-art 40-nm the resulting pitch rry G is generated because of2011technology. denoted as crucial for the adder efficiency. Noll: A modified prefix operator well suited for area-efficient brick-based adder implementations i: i − 1 proved by ror gates and “ • ” using the seven-transistor aoi gate for G of wiring capacitance to gate capacitance is ever increasing e “• ”-operator in Tab. T = 0 i 1 row ikept i drick. j− 1: k the −1: k•”-operator indicated by the boxes in the operator examples. So the forbidden state ˆ ˆ outs are shown for minimum spacing (MS) rules. marked inminimum. T ) can is crucial for the adder efficiency. Compared to the classical implementai – preprocessing: ntradiction as follows 0 0 n combination with a nand-gate producing T . 4. cell height of 1. 1) (sharing this i:j i:j inputs j− 1:kwhich a 0 0 0 = 0 0 ⇒ state 0 or on the signal pairs generated inthe the ˜ ˜ ˜ pre-processing block creates forbidden at the ⇒ g = a · b = 1 . the cell contains two mirror gates on top ofshow each the truth tablethe ofgates the “ ”-operator in Table 1. which is indicated “ • ” exhibits the properties of the proposed version is only 0.48µm ply contacts of the mirror gate depicted in Fig. This means that a brick˜ is+ ˜ ·in ˜ Fig. denoted as a the closer look at Ga modified andoperator T .33 µm. i:i− i:a i− 1 adder i :i obvious − 1 efficiency. vs. Therefore. 5. A Modified Prefix Operator Well Suited for Area-Efficient Brick-Based Implementations i:j iComparing :j −1: Comparing truth table of modified operator withwhich the truth table of the modified operator with ˜sharing. to be used as a replacement −1 i − 1 i : i − 1 i : i − 1 oriented implementation based on “ ” can be realized with each other as shown in Fig. i : i − 1 In modern deep-sub-micron CMOS technologies for which a mparing truth table of the modified operator with the the layout adaption of remaining cells to the layout of because a carry would be generated at brick-based design the ratio ofan wiring weight i . it is that ence is of T in the marked row. Layouts are adders – G = T = 0 must occur wed state of wiring capacitance to gate capacitance is transistors. 5 shows the comparison of “ ” implemented using mir˜ ˜ 0 1 1 1 1 1 “ ” must be idempotent G = 1 . G. 1Derivation both do contain the forbidden state.exploit T = 1)fact. 1) and (1. i1 :the i− 1 value is crucial for the which strongly suggests capacitance gate capacitance is deep-sub-micron ever To increasing (Veen“ ”.33µm.set with cessing. because a carry would bewidth generated at in combination a nand-gate producing Tstrongly .tk = which a + bk = sharing.48µm at 15% using the Brick-Based Design methodology which cell height of 1. he layout adaption of thethe remaining cells to the layout of an a smaller area than the one basedset on “ •” combination using with mir5 shows of “ in ” implemented the in the generated operator examples. 5. sulting cell pitch of the proposed version is ataa G +T ·Now T = 0 cells sulting cell pitch of the proposed version isonly only 0. Typically of prefix adders is dominated by to the gate at any position inside a prefix by the dashed boxes in the operator examples. which for G in combination a nand-gate producing T other as shown in Fig. 0 – G state = T – = 0 use must occur ontain the forbidden state. Derivation ofcomparison brick to the that carry is already because of ˜ their ˜ graph ⇒ G = T = 0 boxes prefix implementation. the re-version saving of 29% clearly over-compensates the area increase sulting cell pitch of the proposed is is only 0. the bit slice width should be 1 1 accommodate 1i :i −1 :i0 −1 = 11 ror gates and the seven-transistor aoi gate for G = 1increasing . which is indicated Gas + T Typically ·T = 0 area of prefix adders is dominated by ociative 54%. prefix Layouts are (Veendrick. sois cells are required: –shown “comparison ”1 must be idempotent Fig. “ be used as a replacement for ding carry ( ? ). Application of “ topreprocessing. the classical approach to layout generation despite the much Fig. To technology. vs. The forbidden state (G =01. Layi :i −with 1 in the ference is value of T marked row. which ( g. which strongly the sulting cell pitch of the version is only 0. “1 )”. 1982). “ 1 ”. 5 shows the comparison of “ ” implemented mir˜ ˜ 4 of brick G T of 0⇒ G ecause of y property ( g )0 ) means that and output pairs ˜ j −1:k = 1. T =width 0 → k+2 : 2000). the only difference is the value of T?. vs.48µm 0. is derived based the presented modified prefix operator. Therefore. the re. 1 0 1 11. So the forbidden state is derived based onthe thearea presented modified prefix operator. Now set for the realization of adders s 0 1 0a 0 0 01 enoted as requisite to use the any position inside prefix i mirror i be gate i−1 at ia − 1minimum. A: −29%(MS ’ ’ k cells k 5. • :: comparison of layouts (MS rules) rules). 5in shows the comparison ”producing implemented using operator.g + t· t (8) s is indicated byArea-Efficient the boxes inin the examples. his is) a4. Gi :pair Ti:ii:− 1 ˜ i:cell i−1 pitch Figure ?? shows the cell comparison of “ ” implemented = 1) which is equivalent regarding carry generation. thecell bit slice of T infor the marked row.t = a + b = 0 qed. T ˜i:k = 0 ˆ (G = 1 . “•”. T 1 ) which is equivalent regarding carry Comparing the truth table of the modified operator with In modern deep-sub-micron CMOS technologies for at a Implemented in a state-of-the-art 40-nm technology. (0. So the forbidden state → k + 2 : contradiction as can follows h is one for Table the modified operator. Layouts are in ?? . using minimum-sized transistors. t . accommodate this fact. 1 1 t0 is the implementation of the operator other as shown in Fig. T can kept at a minimum.T = 0) is replaced by the allowed state ( g. This is a pre0 0 0 0 position 0 inside requisite to the mirror gate at any a prefix ogy. In modern deep-sub-micron CMOS technologies for and H. T 1. (?). 5. i i ing mirror gates and “ • ” using the seven-transistor aoi gate ˜ ˜ tages regarding area and regularity are evident. and due to the aring this ply contacts the mirror depicted in Fig. 4 Derivation of brick set the fact that the carry is already generated because of ˜ ˜ Now a brick set for the realization of arbitrary is derived based on the presented modified prefix G + T · T = 0 ow. • : comparison (MS rules) contradiction as follows regarding area and regularity are evident. which strongly suggests The modified operator. usingand minimum-sized transistors. ˜the ˜joperator ak brick setthe for the realization of arbitrary prefix adders two are cascaded horizontally. The forbidden state (G is derived based on the presented prefix operator. Therefore. Comparing the truth ofprefix the regarding modified operator with 4. The area saving – “ the ” must be associative tion based on DFM rules (Fig. fact. T = 0 i : i − 1 ˆ The first three properties can be proved in analogy to “ • ”.the T fact. vs. the bit slice 1– ) 1accommodate 1 can ˜ ˜ at a and due to the shareable sup.T ˜ i = replaced by the allowed state ⇒g g 11. which strongly suggests “ • ” evident exhibits the following 1 1 15. k+1 t G 10 T the 1supply 0 contacts 0 “ •” 0 exhibits following properties 0 that 0 0 1 ⇒ g k+1 00 + tfor · g k = 1. To this Therefore. ayout adaption of cells to the0 layout in of an op(8) is boxes operator examples. 0 1 set 1 ifor 0 1 0 1 f tree. the other as shown in Fig. The area is indicated by the dashed boxes in the operator examples. 4 I. i−1 a brick i− : i − 1 i : i − 1 Now the realization of arbitrary prefix adders shareable of 1 the mirror gate depicted in Fig. the following additional for G with a nand-gate producing T i . 5. • :: both comparison of (MS rules). 1) (sharing this ply contacts of the mirror gate depicted in Fig. (0 . which is as ˜ ˜ ˜ iefficiently g Using ·t = 0.cell(s) these proves are omitted. :i −1k ˜ i1)vs.is 0 1never 1 minimum-sized Fig. was reported to the classical implementai . bricki iprefix i −graph i− 1 i −1 iThis −1 means in a state-of-the-art 40-nm the ˜ resulting 4.t) (Now g.T i = 0) appears at the j in combination a nand-gate producing T i . Fig. (7) contradiction as follows k +1 k +1 k i : i − 1 i : i − 1 r well suited for area-efficient brick-based adder implementations (G =+ 1t . classical is prefix introduced and defined as approach to layout generation despite theon much fdefined wiring capacitance to gate capacitance isisever increasing cells are required: k +1 as k +1 k derived based the presented modified prefix operator.t ) (ˆ g . (8) is indicated by dashed the dashed boxes theoperator operator examples..48µm at a (8) ˆ) = ˆ).t ) ) inputs means that input and 0 output pairs of of wiring G + T · T brick-based design style is advantageous. 0) . G. i erator i− 1 cell 1 on i :implemented i− 1signal :pair i −1 ( G =. i two The modified “the ”. the resulting cell pitch of the proposed version is only “ ”1 must be associative dden which a never brick-based design style is advantageous. To show the effect of supply contact Implemented inAdder a in state-of-the-art 40-nm thererede Noll: a prefix Implemented a state-of-the-art 40-nmtechnology. 1) and (1 . the ratio of wiring th table of the ”-operator in Table ?? . gray. g ˆ and i : i − 1 t.g ?? + t both ·t . which indicated ˜ ˜ to “ • ”. the bit slice In CMOS technologies forwidth T is irrelevant for the process of carry generation due ide a prefix mplemented inof a state-of-the-art 40-nm technology. the ratio sulting cell pitch of the proposed version is only 0.48µm at 292 I. is introduced and defined as follows layout adaption the remaining to theoperator. the G. Rust and T. cells are horizontally. the area saving is as high crucial for adder which strongly suggests oriented implementation based on “ the ” can be realized with i :j a i :j operator j− 1:k efficiency. using i :i −1 operator − 1( i :already i − 1i :icell suited for area-efficient brick-based implementations which is implemented as efficiently as possible. :· i− − 1 of minimum-sized ˜ i:k = 1. vs. . two operator cells are cascaded horizontally. mirFig.g ˆ with + tTo ·two t ). prefix Layouts are 0 brick 1 0with 1 0 1arbitrary tree. of should 15% using The modified operator. forbidden i creates the state at the the following operator cells (Tab.G g ˆ˜ .0 (0 .a brick-based 4 the I. of 1).48 and µm at a using cell of Now a brick set with for the realization of arbitrary adders To exploit this a modified prefix operator. 2000). So the forbidden state ’ at ’ to ’output A: −29% ’suggests 1. (g.t · t . two g + t · t = 0 . operator contains two mirror gates on top of each g 1 1 1 1 denoted as ’ · t = 0. oriented based on “ ” can be realized with i implementation i → 2: Ink + modern deep-sub-micron CMOS technologies for ˜ ˜ more limited layout features available ror gates “ •” the aoi for Gin – G = 1 . prefix Layouts are of proposed version is only 0.48 µm at a cell height ing of mirror gates and “•” using the seven-transistor aoi gate i:i−1 ˜ i:is i−1 – “ the ”now must be associative ˜ that signal pair ( .

all bricks identical in both adders except 1.33µm. ing the brick set small.0 1) (sharing thisfeature 0 take 0 the 0 0 the based on the proposed operator exhibits an area 29% smaller tors (see Fig. g .. (0 . g . 1) (1 . the model. vs. All bricks minimum-sized outs shown forvalues minimum (MS) rules. 7 shows the corresponding layouts. In both cases the bit slice width adder based on “ ” is 11% less efficient with respect to the is derived based on the presented m “ ”. respectively. simulated at 500 MHz un. denoted as 1 1 1 1 1 1 i i i i – generation preprocessing: – of g .G. To XOR accommodate th T is irrelevant for the process of carry generation due – a single mirror-gate operator cell to use prior to placed by the allowed state of wiring capacitance to gate capacitance is ever increasing – buffer(s). Tooperator accommodate this fact. occur property )and )the means input output pairs of 0with0 (g 0 1 thatthan 0 are 0 i . However. row marked in gray. is introduced and defined as follows bk = 0 qed.9V) of 1121ps (’ ’) of and 1128ps (’•’). 1. the following additional irequired: i ˜ ˜design erator variant using a physically oriented flow never and occur 2). using minim and a well contacts are not needed in every single roperties substratei :i−1 0. 1) 1) (sharing this 0 and (1. ”. 6. vs.8 µW almost vs. 2000). (0.modified Compared the classical implementathe one based on the classicalversion Brent&Kung approach. to sobuild the quantative comparison of the pro0 1 0 0 0 0 of 15% using the Brick-Based Design methodology which requisite to use the mirror gate at any position inside a prefix tional adder on the area of the bricks used in both versions sky topology is based on its advantages in the ultra deepposed prefix operator with the B&K type on the Sklanon the proposed operator exhibits an area 29% smaller than ast property can be proved by 0 1 0 ultra deep1 to is 0 1 the eliminated. ˆ t ˆ +t ·t (8) www. capacitance to gate capacitance Taking closer lookdesign at G style and . i i. 0 the 1comparison 1 set of 1 “ ” implemented 1 1 using mircells are because required: of 5 shows 4Fig. 1 that atation 0 brick-oriented implementation based on “ ” can based on DFM rules (Fig. two operator cells are cas he forbidden This is in Fig. to the fact that the carry is already generated because of is equivalent regarding carry (?). In adhigher (222..g + t · t (8) is indicated by the dashed boxes gate at any position inside a operator cells are cascaded horizontally. a modified prefixin operator. using minimum-sized transistors. which brick-based design style is advanta weight i .complete prefix adder.1prefix Layouts are er Now a brick1set with for realization of 1 arbitrary adders ed occur prefix operator. • comparison ratio imum of wiring capacitance to gate input capacitance. multiple 5 comparison of operators tages regarding area regularity evident. T =0 The forbidden state (G = 1. the only difoperator cell which is implemented 0 generation 0 that 0despite 1 0 0 1 the tion with classical approach to layout their prefix graph implementation. This is a preKheterpal et al.T i:i−1 = 0) appears at the clearly over-compensates therules area increase 15% using the 0 ing isBrick-Based 1as high as 54%. t and t . Derivation of brick lready generated To build a complete prefix adder. the operator cell contains two m Tocontacts exploit of this fact.0. because a carry would be generated at requiring nor or nand. j → k+2 : 1 as 1 dominated by their prefix graph implementation.t ) • (ˆ g . All state. 5peripheral shows the comparison –desirable.adv-radio-sci. the classical approach to layout generation despite the much 0 1 0 1 0 1 To build a complete prefix adder. which is indicated 6. dedicated substratecolumns tors (see Fig. Typically the area properties of prefix adders is dominated by usedslow-corner ◦ C. “ desirable. It adder on the area of the bricks in both versions is elimcombines minimum possible logical depth with the minWhile featuring almost identical delays (slow “• ” exhibits the following as 54%. •”-operator t . it is obvious that tor. vs.or to output of preprocessing. betage used as a replacement for tion based DFM the area saving is asbit high 1 1 0 0 1 0 eratorthe is beneficial regarding a brick-style implementation. the dition. The fact that the signal pair ( G .0 i:j i:j ˜ j −1:k Using the “•”-operator on the signal pairs generated in the ˜ ˜ ⇒ G +T ·G =1 that input and 0 output pairs of Design 0 using the Brick-Based methodology which creates was re-the forbidden state at the inputs pre-processing block i:j i:j ˜ j −1:k shown for minimum spacing (MS) rules. Typically thethan areaone of prefix is dominated by g k+1 + tk+1 ·gn/g tk = 0. denoted as minimum-space design rules. 5. theCompared following operator cells ing area and regularity are The area saving of 29%(Tab. ference is the value of T in the marked row. 6). – preprocessing: It is evident that the implementa ˆ) = (g + t · g. In modern deep-sub-micron CMOS technologies for i:i−1 i:i−1 – prefix graph: – buffer(s). – a single mirror-gate cell to use prior to the bit slice width should be i:i−1 i : i − 1 i : i − 1 ˜ ˜ shareable supply contacts of the m G at a= 1 (R. If the sky topology based on its advantages in the submicron domainis (Patil et al. ˆ. 2011 efficiency is crucial for the adder Comparing the truth table ofwww. t t · g ˆ . 1 both do not contain the forbidden state. Rust. the area saving isbe asrehigh operties 0 0 0 −29% 0 1 alized 1with ’ 0” ’ in A: ’ 0’ combinaa smaller area basedadders on “ as 54%. multiple i:i−1 −1 well ˜mirror ˜i:iand ing gates and “• ” which using the seven-transistor aoi gate (G = 1.In both cases the bit slice width ( G .5. 1 1 1 of 1the larger 23%. was reported in ?. In terms ATE efficiency. which tackles the aforementioned increasing Fig. 198. Apart from are used. to be used as with a replacement a state-of-the-art 40-nm technology. sulting cell pitch of the proposed v tion a strict horizontal wellandI.net/9/1/2011/ the modified operator with the the layout adaption of the remaini truth table of the “•”-operator in Table ??. gnal pair (G – post-processing: . a modified prefix operator.t ) (ˆ g . and due to the shareable supsingle Operator 4. T = 1) is equivalent regarding carry is determined by the width of the operator cell.g ˆ). substrate orientation. ( g. modified operator with Fig.adv-radio-sci. NAND In modern deep-sub-micron CMO r in Table ?? . (7) G T i : i − 1 i : i − 1 ˜ ˜ signal pair I. The To drive the high-fanout wires.6. 5. i i i − 1 i −1 i :i −1 i :i −1 instances of buffers were used. ” must be idempotent Derivation of brick the fact that the carry is available already because age makes a small logical depth der typical conditions).5% – c implementation = G ∀i oriented based on “ ” can be realized with While featuring identical slow-corner delays (slow fanout). the BBL A 64-bit Sklansky adder was implemented for each optaken into account for the conventional adder. – generation of p is crucial for the adder efficiency. (g. the operator – post-processing: other as shown in Fig. which cell. T post-processing.t) (g. 0 Fig.tof i theand one on thearea classical Brent&Kung The decision to build the quantative comparison proinstances ofbased minimum-sized buffers were used.. between threshold magnitudes and supply volt.33 µm.48 of the proposed version is only µm at a cell height of is determined by the width of the operator cell. approach. 2007) were the rules modified opinfluence of slice pitch inside the conventional erator is beneficial regarding a implementation. “ ”. the increased delay per – gate caused by the increased der typical conditions).0 µW. 0)spacing . the ratio i:i−1 drick. are identical in both adders except for G in combination with a nand-gate producing T generation. To drive the high-fanout wires. simulated at 500MHz unratio between threshold voltage magnitudes and supply voltmore limited layout features in the brick age makes a small logicalto depth the generated aforementioned net logic area 4 of the bricks is set of “ Fig. • comparison of with layouts (MS (MS rules) rules).of layouts ◦ a smaller area than one based on “ • ” in combination with ratio of wiring capacitance to gate input capacitance. 1). Application of “ • ”-operator to output of preprocessing.and ).t · t ) . All bricks feature minimum-sized transisi : i − 1 i : i − 1 Figure = ??1shows the comparison of “ ”the implemented us. 198. respectively.generation BBL version is 26% better than the conventional higher (222. It The modified “ (Fig. 1). the BBL-based version has still an area advantage of combines minimum possible logical depth with min1 1 0 ofthe 1 1 10. It is evident that the implementatio as follows and two inverters. the “ ” must be associative the increasedvoltage delay per gate caused by theto increased power dissipation of the BBL-based adder was about one. the following additional more limited layout features available in the brick approach. T = 0 must in combination a nand-gate p Now a brick set with for the realization To exploit this fact. Apart from tion with strict horizontal welland substrate orientation. inated. T post-processing. Typically the area of prefix adders is Design methodology which was reported in (V. 1) (sharing this ply the mirror gate depicted Fig. 1). which strongly suggests layout adaption of the remaining c and two inverters. The last replaced by the allowed state contradiction as follows ch is equivalent regarding carry 1 i :i −1 i :i −1 ˆ ˆ ( g.T = 0) is replaced by allowed state minimum-space design rules. g t g t G T i:iminimum-sized −1 all ibricks :i− 1 Lay˜ ˜ the cell width.of ⇒G = 1. 289–295. are used. which tackles(paying the aforementioned increasing power dissipation of the BBL-based adder was about 12. t1. T ) can Figure 7 shows the corresponding layouts. w i i Application of “nand. 5. If saving of 29% clearly over-compensates the area increase 0 0 1 1 0 0 Tab. 4. on vs. 12. t ) = ( g + t · g ˆ . To show the effect of supply contact sharing. vs..0 µW. submicron domain (Patil etbrick-style al. (G . Therefore. Brent and H. 0 C. ) Noll: 293 G 0 I. (2005). Kung. inside the version based The decision e proved in analogy to “•”. 1982). To show t – xor or xnor gate. cell. G. This means that a brickimum amount of wiring the price of max23%. and the additional 1 following 0 0 1 0 gate for Gi ror gates “•1 ” using the seven-transistor aoi – preprocessing: cells are required: NOR i 1 the 1 producing in combination a0 nand-gate T . is introduced and defined as follows hat input and output pairs of cell contains two mirror gates on top of each other as shown sharing. G.5%If the classical approach layout despite the much ratio dition. This means a brick0 0 1 1 0 Fig. i It is i that the implementation of the operator cell(s) evident crucial for the adder efficiency. an advantage −1 = build a ror gates and “•” using the seven Gi :iTo 1 (? ). the only difoperator cell which is implemented as efficiently as possible. • :: comparison Fig. Additionalbricks. bricks All additional bricks that were implemented are shown in Fig. Compared to g the implementi g i ti−1 Gi:i− T Buffer ⇒ g k+1 + tk+1 · g k = 1. requiring nand. 9. Radio Sci. This is a prethe influence of the larger bit slice pitch the convenposed prefix operator with the B&K type on the SklanFig. So the forbidden state which aa brick-based is T advantageous. the of 15% in ATE efficiency remains (Table cells are – G = 1 . In admodel. to1). Inof terms of ATE efficiency. Advannoware only (00 . he modified operator – with the erator cell which is implemented a the layout adaption i of the remaining cells to the layout of an generation of pi from either g i and t or g i and ti . To use this as an advantage and at the same time keepfor the operators. dedicated substrateand well columns Table 1. 6. should be kept at a minimum. i+1 i in terms ˜ 1 1 1 1 1 1 imum fanout). BBL-based has still an area advantree. • comparison oriented implementation based ’available ’::can bein realized the much more limited layout features the brick 0 1 0 0 0 0 a smaller area than one based on ’•’ in combination with approach. is one for the modified operator. Noll: A modified prefix operator suited for area-efficient brick-bas In consequence of the very small cell pitch in combinatruth table of the “ • ”-operator in the Tab. thetransisversion 5 Comparison of operators for and operators.8 µWapproach. Fig. two – xor or a xnor gate. resulting pitch 4inthe Rust and T. j − 1: k j − 1: k 0 position 0 ˜ ˜ e at any inside a prefix i: i−1 the area savimplementation based on DFM (Fig. 6). 0 ps (“ ”) and 1128 ps (“ their prefix graph implementation. 1) and (1. 5. denoted as (0. This to be used a replacement for i classical −means 1 1 i:i−1 Kheterpal et al. nor ⇒ g k = ak · bk = 1. 2005). Additional Fig.net/9/289/2011/ Adv. Implemented in a state-of-the-art by the dashed boxes the operator Implemented In consequence of the very small cell the pitch in in combinaComparing truth table of the examples. T ˜i:k = Rust and T. Sovariant the forbidden state ator using a physically oriented design flow and ing the brick set small. Advantages regard˜ ˜ G +T ·T =0 0 ported 0 in e forbidden state. Noll: A Modified Prefix Operator Well Suited for Area-Efficient Brick-Based Adder Implementations 5 0). defined as t these proves are omitted. requiring nor to output of preprocessing – generation gTable . additional bricks that were implemented are shown in ˆ) = (g + ˆ).9 V) of 1121 ’ ’ A: −2 imum amount the of wiring (paying the price in terms max”). 1 Now a brick set for the realization is derived based on the presented m is derived based on the presented modified prefix operator. ) can kept minimum. but none would be transmitted from weight i − 1 . T. substrateTo use this as well an advantage andnot at the same time keepA cell 64-bit Sklansky adderare was implemented for each operand contacts are needed in every single the width.wellcell height of 1.T can A modified prefix operator well suited for area-efficient brick-based adder implementations ˜ i:k = 1. to the classical of evident. ??. (8) of i from either g i and t i or g i and t i . requiring nor or nand. the cell only dif. – prefix graph: 1 in the marked row.tk = ak + bk 4 Derivation of brick set tn/t . 2007) were the op0 on 1operator.

The first stimulus (i) is based on 1 3. 1 ) and ( 1 . This is ephasized by the power i gates 1 increase compared to classical CMOS was between 13. 1982). Kung.tSklansky ) (ˆ g . This makes it necessary to i : i − i : i − 1 prefix tree. to be used as a replacement ˜ ˜ in a state-of-the-art 40-nm technology. it is obvious that drick. 5 shows the comparison The modified operator. 1982). et al. adder should be kept at a and due to the property with ( g . the operator ˜ 1 +T ˜ 0 ·T ˜ 1 =0 0 layout adaption of the remaining cells 0G 1 1 1 1 4 Derivation of brick set j − 1: k j −1 1:k ˜ ˜ 0 1 1 1 1 1 0 erator cell which is implemented as ef 1 0 ⇒G = 1. is but none would transmitted from weight i− 1 . ratio ATE-Efficiency (normalized) 1. Application of layout “•”-operator tokoutput ofremaining preprocessing should at cells a minimum. that 1 g Taking 1 +t 1 1 look 1Gi:i−1 and 1 T i:i−1 . using minimum 4. at least version for the considered topology andfraction transistor – = G ∀i derived based on the presented “ ”. modified operator. :: comparison (MS rules) rules). Noll: A modified prefix operator well suited for area-efficient brick-based a crucialthe for truth the adder which strongly suggests table efficiency. as well property with (g– ))was means that input and output pairs of cell two mirror on top of of each other a d operator witha the the layout of the remaining cells to the layout of ancontains ˜ for ain the forbidden state. 1 +t ·t (8) (G = 1 . which – G = 1 . However. default values. only difoperator cell which is implemented as ror and “ • ” using the seven-transistor aoi gate tioned short circuit current it may cause. th is one for the modified operator.5. vs. crucial for the adder efficiency. This is due to increased power 4 due I. the gates in FS simultaneously. which is i marked row. vs. Therefore. 0 0 0 1 0 0 i : j i : j i: iminimum.g ˆ 1 + ).33µm. ratio nerated because a 500-cycle random simulation which at of typical path delay between T andthe G can accumulate. resulting in the forbidden state tively. In the conIn modern deep-sub-micron CMOS perator. 1i 1 1 1 sharing. horizontally In modern technologies for which a (GiArea = 1. (0efficiency. the operator cell contains two mirror gates on top of each (G = 1 . On the other hand. supply current/mA 4. The stimulus (ii) tries to establish the FS Table ?? both do not contain the forbidden state. T =1 1) which is equivalent regarding carry the signal pair (G can i ) between iThe i− 1 cells :i − 1cascaded i :ibe −1horizontally. a modified prefix operator. of the t G T i : i− 1 i :i − 1 i( :i − 1 i : i − 1 ˜ ˜ on ’ ’ is 11% less efficient with respect the energy-delaytransition of G or the fall transition of T ) and at the ˜ ˜ G . .V To show (BSI). which 0 fraction 1 0the power 1 0 caused 1 1 to be used 1 nals 1 To determine the increase by can occur. the ratio in a state-of-the-art 40-nm technology. this using minimum-sized transistors. 4. additional adder based at the current peaks (’ ’: -15%. summarized in Table 2 that the adder based ” denoted as a replacement for ference is the value of T the marked row. it is obvious that capacitance to gate capacitance is e a closer at It is evident that the implementation of the operator c crucial for the adder efficiency. ˜ i :i −1 . to be used as a replacement in a state-of-the-art 40-nm technology. called DELVTO have a large impact on delay. This is a in Fig. Fig. the conventional adder was – “ ” must be associative ing mirror gates and “ • ” using the se prerequisite to use the mirror gate at any position inside a both adders. 0) . 1two ) and (1properties . the operator cell contains two mirror gates on top each odified prefix operator. brick-based design style is advantageous. of “ • ”-operator to output of preprocessing. and due to the sup. This observation in combination with a nand-gate – G = 1 . possibly leadrator in Table ??.446 0. 5. it Delay 1121 brick-based design style is advantageous. alent regarding carry i −1 1 i i state pation of 1 theand mirror the forbidden can much 18% forpeaks the conventional adder.2 be kept 546. Rust and T. using transist version is 26% better than the conventional one. which In modernbydeep-sub-micron technologies for Implemented the dashed boxes CMOS in the operator examples. it isA obvious that µ implementations . Kung. the bitat slice width now only take values (0 . which is observable at seven-tran i : i − 1 ror gates and “ • ” using “ • exhibits the following properties G = 1 (?).H. is introduced andnetwork defined The peak current in the supply isas an follows indicator for the increase compared to classical CMOS was between is(4.2% (max. T 1 ) which is equivalent regarding carry strong short circuit current it may cause. so delays corresponding G and T sig0 t i− 1operator. 1) and (1 .tk the = a carry +b = qed. the adder based it. the state can be much r gate at any position inside a prefix dified prefix operator well suited area-efficient brick-based adder implementations Fig.48 µm at a cell h ent regarding carry (?).T = 1. 0) . 1. theatonly difoperator isthe implemented as efficiently ror any Table position inside a cell variability scenarios based on which the layout-derived netlists of operator cells are cascaded horizontally. (0 . 0 0 Tab. This parameter used aggressive c =Fig. The fact that the pair (G . using minimum-sized transis i + 1 i the same setup (Table 3). so using us– c =G ∀i first three of 15% using the Brick-Based Desig d ( 1 . which cell height of in 1..48µm at a ˜ the ˜ show that ratio the Vth distribution for nominal de-FStechnology capacitance tosuited gate capacitance is signal ever increasing (Veengeneration. vs.ti i+1 i that CMOS gate under the influence of glitching. (2006) show that the ratio of the distribution erator cell which is implemented as efficiently as possible. Kung.)/ps capacitance to defined gate from capacitance increasing (VeenGi:i−1 and is obvious aoi/oai two operator cascad i:i−1 i:i−1 Max. the operator modifications as well as under the influence of worst-case Vth (FS). This that observation of the forbidden is responsible for most increase. the only difoperator cell which is implemented as efficiently possible. in Fig.60 4. It is evident that thethe implementation of the the operator is with Comparing truth table of modifiedcell(s) operator sulting cell pitch the proposed versi shareable supply contacts of theof mirror gate depicted Gi:i−1 = 1 (R. model To show the effect of supply contact sharing. effect of the FS be clearly dominated by other tive It is ephasized evident thepower implementation t and 0 output pairs of– “ofstate denser wiring the BBL version. result. t ) =the (g + t·g ˆ. which strongly suggests occurrence were simulated. based k +1 Ak +1 T i:i−1 6 I. because carry would be generated at brick-based design style is advantageous.T (Tab. thethe only dif-of put of preprocessing cell height of 1. (0 .48 now only take the values (0contains . shareable supply of mirror gate depicted Fig. T ˜ i :i −1 ) can i +1kept saving of 29% clearly over-compens 1.wiring 289–295.396 i:i−1 1 drick. the operator cell contains two mirror gates on top i : i − 1 this fact. contacts of the as mirro G = (R. “ ”. Noll: modified prefix operator well for area-efficient brick-based adder the table of the modified operator with ”-operator in Tab. current: 18% to 22%).7. This is adown pre.transitions that are needed to exit 2). nominal devices at nodes below 90nm can beare up to 30%. the power by(’12% for BBL far from ideal.26(1. Rust. while the conven– c = G ∀ i sulting cell pitch of 1 the proposed version is only 0. 9. − 1 i i rise transition of G affect or the fall transition of T of ) and at the ference is area the value of T bricks in the marked row. 2) at once. which strongly suggests the 0 to output of preprocessing k+1 +1 k -operator −1 g T i:i+ tkis · t = 0. This is a preresults of five simulations based on normally-distributed ran5 shows “ ” implemented using mirThe peak current in the network is anthe indicator for tent 4Fig. respec– “ ” must be associative marked row. a modified prefix operator. −1 g. introduced andcaused defined nd (1. 4. On other hand.15) 11128 weight i.G. i:i−1 Delay (w. area comparison. higher due to c the stronger current. 5. vs.396 technologies for M Hz 4 Derivation of brick set other as shown in Fig. T can kept at a minimum.t iadaption in . which to model the effects V variability.15) 1 Implemented in a state-of-the-art 40-n 1 bein kept to the fact that the carry is already because of gateshould shareable supply generated contacts of the mirror depicted Fig. The fact that the signal pair ( G . To accommodate this fact. thereby the gates in i :i1 −1˜ i :i − 1 have aG large impact on delay. Radio Sci. 0 (8) tional version only suffered an increase 0. that the contain forbidden time slowing is one for the modified operator. and due to the y is already generated because of based is derived on the presented modified prefix operator. the dissi˜ 1 1 1 1 1 1 dified operator with tively. as ned as0follows the forbidden state is responsible for most of the increase. 0. the power increased by 12% for BBLthe ’ ”-version ’ dissiA: −29% ’ caused ’ based on “” ” with the bit slice pitch from the the1 “ power dissipation by the FS. the ratio of wiring The research results published (G design = 1. 0) .T ) can (00 . To the show theo e “ ”. To accommodate this fa BBL Conv. 2005). In reality. the bit slice width should 1. So can the be forbidden state which a brick-based design style is advantageous. 1) (sharing thisfact 0 0 nals temporarily. two operator are which BBL adder was further increased by 6%. 0 0 0 1 that input 0 0 at the input of a receiving Compared to a regular 1 1 the forbidden 0 1 1tive 1 the ’ ”). “ ”. The simulations made th take a closer look at the The consequences for the forbidden state i : i − 1 i : i − 1 sidered nanometer-scale technologies the variability of the FS at the input of all operator bricks in the first row of the modified operator. generated layout adaption of the remaining to the layout o 0 erator which is implemented as efficiently as possible. G.33µm. 1 ) (sharing this ve Figure ?? shows the comparison of “ ” impleme ply contacts of the mirror gate depicted in Fig.60mA) the fact that.5 transistor models feature a parameter worst-case Vth settings.T = 0 ) can not appear when ideal switching conrequisite to use the mirror gate at any position inside a prefix operator. Rust and T. the power dissipations of both adders (Gi:kstate. vs. the resulting c the allowed state ng properties of capacitance gate capacitance is ever increasing of the proposed version is only 0.occur. So the forbidden account for conventional adder.2% (max. This is by the ” must be associative ˆ)an ˆ). current: 18% to 22%). To exploit this fact. 2011 www. the only difcell height of 1. 2 show that the adder based on ’ ’ exhibits a pected due the FS. The current is indicated by 1the the dashed in the operator examples. Area/µm 388.without the bit V slice width ex s .⇒ T.net/9/289/2011/ i = 1. becomes that the transient presence results of five simulations based on normally-distributed rani :on i −1“ in ˜i ˜ishow and 13.34 operator 3. which al. temporarily.48µm at a 0 0 for 1 forbidden 0 0 0 1. ˆ t (g + t1 · g. (8) is indicated by the dashed boxes inadder the operator ex As a first step. This is a prepation of the mirror gate in reality. two operator cells are cascaded horizontally. resulting cell pitch – “ ” must be idempotent outs are shown for minimum spacing The modified operator. The first stimulus of (i) the is based on gate depicted i :i −1 in the now only take the values 0 . G ∀?? ito take closer at the consequences the forbidden state without Vth modifications as under gates the influence Figure shows the comparison of ?? both do not contain forbidden state. Surprisingly. ws tive default values. 0). apparent dified1prefix operator. vices nodes below 90nm be up 30%.of of Tcells to in the marked row. dom V variations covering all transistors. To accommodate fact.34mA compared 3. the effect of the FS can be clearly dominated by other 0 Comparing the truth the modified the osition inside prefix layout of the remaining c domoperator V covering transistors.T = 0 ) is replaced by allowed state of wiring capacitance to gate capacita impact of the forbidden state because of the aforementioned and 13.1% i : i − 1 truth table of the “ • ”-operator in Table ?? . thereby maximizing (8) voltage for “ •for ” exhibits the following of the proposed version is Fig. both do not contain the forbidden state.look This is a i . vs. the bit in slic T is irrelevant for the process of generation due should be kept atcarry a minimum. 0 1 0 0 0 0 1 0 1 higher due to the stronger short-circuit current. Therefore. as atree.5. 5 shows the comparison of “ ” implemented using mirference is the value of T in the marked row.33 µm. transitions All Vth variations had apossibly magnitude of of the respecATE efficiency remains ing which could lead to30% the FS (e.5 transistor models feature a paramthreshold Vth can no longer be neglected and can adder’s prefix graph (Fig. + t · t (8) sharing. is introduced and as weight follows 1 T i:i−1 . butthat none would be transmitted iis − ever 1.2 546. slice width should be maximizing operator. To show the ?? effect of supply contact shar e ??gate . isTo by the allowed state Now a the brick for the realization of arbitrary prefix adders of wiring capacitance to gate capacitance is ever in hich is equivalent regarding carry (? accommodate this fact. die power dissipation i: j other as shown in 5.48 µm at a cell height of Adv. 2) at once. denoted as other as shown in Fig. an additional adder the conventional version is due to the decreased fraction of of “ ” im Fig. In the conbe fullfilled. Layouts are never occur Now a brick set for theof realization of arbitrary prefix adders G so it = 1 (?). two operator cells cascaded :deep-sub-micron i−1is indicated :i−1 by theis dashed boxes the operator examples. T = 1 ) which is equivalent regarding carry on the presented modified prefix operator.c. 7. G.T = set 0 ) ).T ) of can . 1 1 prefix version is still 10% higher than the one based on the classical shows that. The power dissipation of this version widths. To show the effect of supply contact fined as follows To exploit layout ference adaptionis ofthe the value remaining the layout of an op. So the forbidden state which a brick-based design style isop a is derived based on the presented modified prefix “ ”. minimum-sized transistors. in the only difwhich isthe implemented as efficiently as row. (0. the only dif. s replaced A bystrong the allowed state increase of supply current amplitudes to FS with increased duration. that can is discussed below.is ?? . 1 ) (sharing this ply contacts mirror ing to FS withoperator increasedcell duration.can • comparison (MS rules) approach. denoted as 4 I. 5 shows the comparison of “ ”power implemented us ror gates and “• ” of using the seven-transistor aoi gate for all G th variations theaimpact offact state because oftable the aforemen– “the ” forbidden must be idempotent 4with Derivation of brick adaption setwere to the that the carry is already generated because of is still strong 10% higher than the one based on the classical apsources ofthe power dissipation.33 µm. BSIM version 4. To show the effect of supp ror gates and “ • ” using the seven-transistor aoi gate for G sidered nanometer-scale technologies the variability of the is one for the modified operator. drick. is introduced and defined as follows sharing. which operator.48 cell height of 1. from ideal.were se signal pair (G FS simultaneously.60 mA) in a 500-cycle at typical condiˆ) = ˆThe :i − − 1= (g. called DELVTO thshown 3· σ other as in Fig. vs.9) to is 0 already because ⇒ gk =the ak · fact bk =that 1cell . switching conditions are i:j As a result. “ ”. • • comparison of layouts (MS tions. 1) and (1 . using minimum-sized transistors. 1. which imum possible increase must be determined. which indicated (FS). 1to 0 1 far 0rise 1 contact ditions are assumed. 0 1 1 1 1 1 0 1 0 at the input of a receiving mirror cell. switchtruth criterion table of the “ •”-operator in to Table ?? . so it becomes apparent that increase the transient presence widths. Brent and H. only4. 0 1 0 0 0 0 4. It was already shown in Sect. replacement for of o the forbidden state ut of preprocessing. ATE-Efficiency (normalized) 1. 1982). the Implemented in state-of-the-art 40-nm re. and du . = 0) is replaced by the allowed state which is equivalent regarding carry i −1 i :i −1 i :i − 1 0 Fig.which 1) gates (sharing thisof ply contacts of the mirror gate depicted in the is crucial the adder strongly suggests ns that input and output pairs of settings. Noll: A modified prefix operator well suited for area-efficient brick-based adder implementations brick-based design modified style is advantageo −1 weight i . The second stimulus (ii) tries to establish the by the dashed boxes in the operator ˜ ( = 1 . requisite to use the mirror gate at any position inside a prefix ation. and du the truth table of the modified operator with the layout adaption of the remaining cells to the layo path the delay between Tas and G can accumulate. To accommodate this fact. 1 both do the forbidden state.33 µm. The greater increase 1 gate 1 in 0 1 1 be 1occurand 1 Ti:ii:− ’ ’ A: −29% ’ prod ’ of was realized characterized. the Fig. therefore the maxing activity is limited because propagate conditions need to (G = 1.T ) can i+1 5.g + t · t ) . 4.33µm. All V had a magnitude of by 30% of the respec“ •” values exhibits following properties th variations now only take the (0 . which strongly of the modified operator with the the layout adaption of the remaining cells to the layout of an the fact that. (g. switchof wiring capacitance toa gateing capacitance is ever increasing at the input of all operator bricks in the first row of the adder’s i i : i − 1 i : i − 1 prerequisite to use the mirror gate at any position inside a threshold voltage V can no longer be neglected and can operator cells are cascaded horizonta in combination with a nand-gate producing T the . w 0 n state. So the of forbidden state 4 Derivation brick set a conditions.5%. the bit slice width he process generation due Now a (brick for realization arbitrary prefix adders ˆ ˆ).g. 2000). 5. denoted as 2 the µW 1 Inratio modern deep-sub-micron CMOS rowweight marked in gray. the (G =1 . 1) (sharing this eans and output pairs0ofmirror 0 0 cell.t ) ) that input output pairs of generated because of dissipation that is discussed below. ’ of A: −29% respecpeaks were increased by 12% die (“ power ”) anddissipation 7% (“ It was already shown in section 3athat state technology. the effect of supply “ ”. 2000). Noll: Modified Prefix Operator Well Suited for Area-Efficient Brick-Based Adder Implementations is derived the presented prefix ope It thebe implementation of the operator cell(s) ison · evident gk = 1. The fact that the signal pair (mirror G .TThis = 0) not appear when ideal switching BBL adder was further increased by 6%.446 0. Layouts are th knock-out criterion for usage of ’ replaced ’. denoted as prefix bit graph (Fig. denoted as is due r (G . i along paths between weights. sources of power dissipation. In deep-sub-micron CMOS technologies for knock-out for the usage of “ modern ”.As a property with ) ) means input and output pairs of 0 (g 0 1 0power 0 1 forbidden 0 0 n the is a can prei:k 0 i . 5.1% transient ’It ’ A: −29% ’ the ’ due ” compared to occurring in the adder based “ i :i −1 on i :i −1 mA ˆ). 5. gate capacitance is cells ever are increasing Taking a closer look at G and T . T = 0 must never occur in combination with a nand-gate producing T i . T = 0 must never 1 i : i − 1 ˜ ˜ Now a brick set for the realization Toshort-circuit exploit this fact. Derivation ofcomparison brick set of is already generated because of supply i ized and characterized. = 1 .c. dissipations of both adders in and 18% for the conventional The greater increase ”. Application of “ • ”-operator to output of preprocessing. the difference in i :i(g −1 ing activity is limited because propagate conditions need to technolo property with . i i i −1 i −1 i :i − 1 i :iwhich −1 3·σ ble with versions could affect a transient occurrence of the g t g t G T Implemented in a state-of-the-art 40-nm i : i − 1 i : i − 1 −1 of the modified operator sulting cell pitch of the proposed version is only 0. all transistors in the netlists of both i : i − 1 i : i − 1 Implemented in a state-of-the-art 40-nm technology. 64-bit 64-bit comparison ˆ) adder: ˆ (g. denoted as i:i−1 i:i−1 2). Impl the “ ”. (0. the only difing transitions which could possibly lead to the FS the impleme T i:i−1 in the marked row.34 evident that the implementation of cell(s) pected tothe theoperator FS. T a brick =0 = 0) appears at the Now set for the realization of arbitrary prefix adders 1 row 1 marked 0 0 1 0 1 1 1 In for modern deep-sub-micron CMOS tec in gray. Noll: A prefix operator well suited forcan area-efficient brick-based adder the truth table of the “ • ”-operator in Tab.t ) ) means that input and output pairs of cell contains two mirror gates on top o i i Fig.34 3.(w. for Gi in the combination with a nand-ga prefix tree. the cell contains two mirro To exploit this a fact. It is evident that the implementation of the operato issimulation crucial for the adder efficiency. The results in combination with a nand-gate producing T . irrelevant carry generation due BBL Table 2. Surprisingly. an advantage of 15% in were state biased regarding Vth in a worst-case way by acceleratis replaced by the allowed state i :k Table ithe :k Application 1. ˜ ˜ imum possible increase must be determined. • • comparison rules) 1 1i 1Fig. Brent and H. ?? . power ation. G. The power dissipation of this ˜ ˜ the current (“ ”: − 15%. So the forbidden state mentioned net logic of the peripheral is taken into versions which could a transient occurrence the FS perator to output of preprocessing.26(1. Again. prefi Lay In modern deep-sub-micron CM Now a brick set for the realization of arbitrary is derived based on the presented modified prefix operator. Fig. 64-bit adder comparison (40-nm CMOS) by the is indicated dashed boxes t Energy per cycle/µW MHz 0. 1) and (1generation. using minimum-sized transistors. and due to the shareable supi Figure ?? shows comparison of “ ” ˜ The properties canthe be proved in analogy toimplemented “•”. i: i −1 to :i − 1 ) special ˜ ˜ rizedComparing in Tab. • comparison were much higher when applying the Vth variconditions are In switching conditions are tional version only suffered an increase by“best-case” 0. Therefore. (8) ˜ irandom ˜ti :·it ( G = .adv-radio-sci. 1) and ( 1 .t) set (ˆ g. BSIM version 4. 2000). t = ( g + t · g ˆ . T = 1) which is equivalent regarding carry in (Bernstein. In this case. results summaTo determine which maximum supply currents can be ex( ? ). 0 ) . of wiring d be transmitted from i − 1 . • ::is comparison of is layouts (MS rules). while the conveng t g G T modified “ ”.shareable To accommodate this fact. aforeAs aand first step.. This isas a possible. were the power th To determine which maximum supply currents can be exexhibits a 21% higher maximum VDD current than the one is one for the modified operator. The simulations were made th This makes it necessary to 1. Therefore. 5. to be used as a replacement for th g properties – “ ” must be idempotent 4 Derivation of brick to the fact that the carry is already generated because of by the allowed state 1the 1 0 of 0 0 1 1 1 CMOS gate under influence glitching. To show the effect of supply tree. ( 0 . Comparing 1. the conventional adder was simulated using cell mirror on top each other as shown eter to model the effects of V variability. resulting in the forbidden state 1 signal 0 boxes 0 1it. special stimuli which favor its 13. the difference infact. stimuli which favor its transient generation. . So the forbidden state µ This parameter was used to model aggressive variability sce-for Table 1. T the ) can kept at a minimum. mod Fig.33µm. to used as a replacement for i : i − 1 i : i − 1 product (EDP).9) Energy per Cycle / 0.g ˆ the + t ·t (8) on ’ ’ with the bit slice pitch from the ’ • ’-version was realshows that. of the “•”-operator in Tab. at least for the considered topology and transistor is crucial for the adder efficiency. Therefore. possibly leadisecond be fullfilled. • :: comparison of layouts (MS (M Fig. This is due to increased power dissipation same time slowing down transitions that are needed to exit ˜ ˜ The that the pair (G . 5. “ ”: − 9%). ::acomparison of layouts (MS rules).t) (g. denoted as proach.5%. generation. because carry would be generated at Now a brick set the realization of arbitrary prefix j → k + 2 : is derived based on the a presented modified prefix operator.g. The research results published in Bernstein et operator cells are cascaded horizontally. 2006) ed from weight i − 1.at a minimum. 1 ) (sharing this ply contacts of the gate depicted in Fig. Again. Rust and T. Compared to a regular were much higher when applying the “best-case” V vari-set the denser wiring of the BBL version.2(500. along paths between weights. 5. Fig. which observable Fig. K. BBL 0 erator cell which is implemented as efficiently pos In 1 modern deep-sub-micron CMOS technologies for which a ause a carry would be generated at max.t) (g.T i =CMOS 0) replaced by in the allowed state would be generated at Sklansky narios based on the layout-derived netlists of both adders. 1 1 0 1 1 1 1 1 1 294 I. Supply Current/mA 4. 2000). by the dashed boxes operator examples. forbidden to be usedstate as a replacement which a brick-based design style is advantageous. (?). In this case. for the process of drick. a modified prefix operator. Ifmeans the Therefore.0 1) (sharing this To determine the fraction ofis the power by as thefollows of powerthe dissipation caused by the FS. Ther Area /µm2 of 388.1 vs.0 1) (sharing this carry generation 0 0 the 0 modified 0 0 energy-delay-product (EDP). (G at a minimum. all transistors the netlists of minimum-sized both (e. so delays between corresponding G and T sigpeaks were increased by 12% (’ increased ’) and 7% •’).) /ps 1121 1128 shareable supplyConv. which strongly sugg adaption of k the cells to the layout of an op0 Table 1. 1assumed.2(500.60 0 of carry capacitance tosharing. To accommodate this fact. ’• ’:-9%). Therefore. 0). T i = 0 must never occur ˜ ˜ tages regarding area and regularity a – G for “ • ” exhibits the following properties ofbe the proposed version is only 0. A strong of supply current amplitudes can be a 21% higher increase maximum VDD current than the one ( occurring occurrence were simulated. 03 contacts 0 not 1 the 1 state 0 same 0 in Tab. 64-bit adder comparison (40-nm CMOS).g + t ·of t (8) Table 2. in the (MS conventional to the decreased ˜ Fig. T is derived ) can based kept at a minimum. To accommodate this the bit is crucial for the adder efficiency. two (BSI. ˆ t = (g + t ·of g. Fig. the re˜ ˜ brick-based style is advantageous. ˆ). and due to the sharea It is evident that the implementation of operator cell(s) . the power adder. The current Tab. therefore the maxinmarked the adder basedwhich on ’• ’ (4.truth were biased regarding V a worst-case way by acceleratth in sulting cell pitch ofto the proposed version is only 0.

Stimulus (i) shows a notable relative increase in energy per cycle and peak current. and Pileggi.net/9/289/2011/ Adv..386 6.: Design of a Branch-Based 64-bit Carry-Select Adder in 0. 2005.com/article. Hersan.. 50.31. Frank. Tong. J. T. Weinberger.: Design Methodology for IC Manufacturability Based on Regular Logic-Bricks. 16–28. V. Pileggi.(ii) /µW MHz−1 3. 1997.wc 3.. D.. 2009. Proc. causes a high energy per cycle of the BBL-based adder in absolute terms. Pileggi. Pearson.4. K. 727417. 0.eecs. IEEE Transactions on Computers. Kheterpal. L. O. the area savings regarding the operator cell could exceed the 54% reduction based on minimum space rules in comparison to DFM rules. a modified version of the prefix operator introduced by Brent&Kung was presented. D.399 7.: Binary Adder Architectures for Cell-Based VLSI and Their Synthesis. Radio Sci. A. 1991. Haensch.. V.: Simplify to survive: prescriptive layouts ensure profitable scaling to 32 nm and beyond. Arm. 433–449. 353–358. Nassif.528 295 Manufacturability and Design-for-Yield is increasingly important. B.. A. Horowitz. To carry out an in-depth comparison of the proposed operator with the classical one from Brent&Kung. L. 2007. PhD thesis.. Brent.. 2002.: A Regular Layout for Parallel Adders. Jhaveri. D. Strojwas. and Piguet. of Technology. R.. Rust and T. B. V. J... using a 3-input mirror gate as a replacement for the classical aoi and oai gates. Patil. L. H.php%?id=32.. J.org/10. Schettler. J. Ho. A.55 0. 7275. A. and Ananthraman. V.. J. L. Hersan. Rovner.: Technology. Rovner.edu/∼bsim3/bsim4. and Pandini. L. a version of this adder based on the classical approach was also realized.. BBLnc Imax. Motiani.. last accessed: 30 March 2011. IEEE Custom Integrated Circuits Conference. 6.html.5. Considering the variability-limiting nature of the brick-style layout approach. In conclusion. 2007. the presented approach to prefix adders offers a highly efficient alternative implementation for nanometer-scale technologies where Design-forwww.1117/12.. Zurich. 2007..94 0. BSIM 4. Computer Arithmetic. 1958. but without any important increase when applying the worst-case Vth distribution and without much difference to the classical implementation. doi:10.. Journal of Micro/Nanolithography. and Northrop.. Proc.. C.: PDF CEO calls for restricted layouts. Both adders were comprehensively characterized and compared. and Rohrer.. V.18 um Partially Depleted SOI CMOS. ISLPED.. R.14 0. 2011 . D. Strojwas. T. V. Proc. Gattiker. L. R. Kluwer Academic Publishers.. 289–295. 25... pp. Rovner... C-31. Takegawa. L.(i) /mA Ecyc. 108–111. Flandre. A.: Deep-Submicron CMOS ICs. T.. Pileggi. but both parameters have low amplitudes. and Kung. Tong.and Power Supply-Independent Cell Library. Goering. G. IBM J. 2006.. L. Kheterpal.547 Conv. R.. Liebmann. 031011. showing that the ATE efficiency of the adder based on the proposed operator is 15% higher. This operator is usable for any prefix tree. 1982.: Exact Combinatorial Optimization Methods for Physical Design of Regular Logic Bricks.. 2007.: A Logic for High-Speed Addition. Proc. H. the magnitudes of energy per cycle as well as peak currents are perfectly acceptable. V. G. Adder comparison: worst-case stimuli. 3–12.5. IEEE Symposium on. L. K.08 0.394 6. D.. D. T. because the results are based on extremely pessimistic assumptions regarding variability.1109/ARITH. J.1117/1. Zimmermann. K. Motiani. R. E.814406.12 0. it can be expected that..541 Conv.I. J.502 9. Ludwig.. In addition. vol. Tong. 2009. Veendrick. Hibbeler. Noll: A modified prefix operator well suited for area-efficient brick-based adder implementations Table 3. ieeecomputersociety.2781583. C. 2000. Strojwas. Y. Pileggi. Taylor. References Bernstein.27 0. Jhaveri.adv-radio-sci. Rovner. because the brickstyle approach could make the usage of restricted layout rules possible.0: Manual. W. berkeley. S.541 BBLwc 5. D. 2006. 6 Conclusions To improve the realization of arbitrary prefix-based carrypropagate adders in a brick-based design style.: Design Methodology of Regular Logic Bricks for Robust Integrated Circuits. While the supply current peaks are considerably increased in the BBL-based adder when passing over to the worst-case Vth distribution.. It was shown that this gate is very well suited for realization as a brick.1–25. K. A. Jhaveri. in: SPIE Conference Series.org/web/20071201153818/www. We have also shown that the forbidden state inherent to the modified prefix operator does not significantly increase power dissipation or current peaks. V.(i) /µW MHz−1 Imax. M. and Kheterpal.archive.. MEMS and MOEMS. Y.. and Smith. Nowak. J. doi:http://doi. H.: High-performance CMOS variability in the 65-nm regime and beyond.. D. Y. 9. T. DAC..5. T. a complete brick set for prefix adder implementations was created and used to implement a 64-bit Sklansky adder.. D. in reality.: Robust Energy-Efficient Adder Topologies. Motiani. the increase in power dissipation caused by the forbiddden state will be much lower compared to the presented values. 2005. Hersan.17 0.. available at: http://web.: OPC simplification and mask cost reduction using regular design fabrics.: Maximization of layout printability/manufacturability by extreme layout regularity.. Rovner. Dev. pp. National Bureau of Standards Circular 591.. and Pandini..2007. 260–264.nc 3. N. available at: http://www-device. last accessed: 30 March 2011. Azizi. R..(ii) /mA Ecyc. Stimulus (ii).33 0. on the other hand. G. ICCD.. Kheterpal. Neve. Ji. E.. Based on the resulting operator layout. and Pileggi. Masgonty. scdsource.. Res.. Swiss Federal Inst. pp. doi:10. 42.. J. DAC. SPIE. 7274. T. V. and Hellner.

Sign up to vote on this title
UsefulNot useful