
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: FUNDAMENTAL THEORY AND APPLICATIONS, VOL. 50, NO. 9, SEPTEMBER 2003

An SRAM Array Based on a Four-Transistor CMOS SRAM Cell


Stephan De Beer, Monuko du Plessis, and Evert Seevinck
Abstract: The static random access memory (SRAM) array discussed in this brief is based on a four-transistor SRAM cell. A new method of writing the cell, together with an associated array structure, is proposed. The advantages are a significant reduction in power and an increase in cell reliability over previous designs. The noise margin of the cell under various conditions is investigated, as this is an effective method of designing the control mechanism of the cell.

Index Terms: Reduced-area static random access memory (SRAM), static noise margin.

I. INTRODUCTION

When considering a standard six-transistor static random access memory (SRAM) cell, one of the design issues is that the access transistors must not be too strong, or else the state of the cell may accidentally be modified during the initial read phase, when both bit lines are at a high potential. To overcome this, the access devices are typically made long and therefore occupy a significant percentage of the cell area. They do not, however, contribute toward the memory function, which resides purely in the cross-coupled inverter pair. If a different method of access is used, the access devices can be omitted. A previously proposed four-transistor SRAM cell exploits this by using the source nodes of the transistors making up the cross-coupled inverter pair to achieve access [1]. The proposed cell is shown in Fig. 1.

1) Reading the Four-Transistor SRAM Cell: The cell can be read by varying any of the four possible nodes (N1, N2, P1, P2) away from the supply voltage and beyond the threshold voltage of the devices, and then monitoring the current in the opposite inverter. For example, consider node V1 to be low and therefore node V2 to be high. Devices M1 and M4 are turned on in the linear region, and devices M2 and M3 are in cutoff. If the voltage at node N1 is raised, the voltage of node V1 will track that of N1 because M1 is in a low-impedance mode. If the voltage deviation is larger than the threshold voltage of M3 but smaller than the trigger voltage of inverter M3/M4, then device M3 will be driven into saturation and will therefore conduct a current. This current may be sensed at either node N2 or node P2. If, however, node V1 is high and node V2 therefore low, then M1 is in cutoff. In this case, raising the voltage at node N1 cannot turn M1 on, so no conditions in the other parts of the circuit are changed. A current sensor attached to either node N2 or P2 would therefore sense no current.
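The read condition described above can be summarized as a small behavioral check. This is a minimal sketch rather than a circuit simulation: the function name, its parameters, and the example values used with it are illustrative assumptions and do not come from the brief.

```python
def read_current_flows(v1_is_low, deviation, vth_m3, trigger_m3_m4):
    """Return True if raising N1 by `deviation` volts produces a sense current.

    Behavioral assumption: M1 passes the deviation on to V1 only when V1 is
    low, and M3 conducts once V1 exceeds the threshold voltage of M3.
    """
    if not v1_is_low:
        return False  # M1 is in cutoff, so nothing changes inside the cell
    if deviation >= trigger_m3_m4:
        # Beyond the trigger voltage the cell may change state (a write).
        raise ValueError("deviation reaches the trigger voltage of M3/M4")
    return deviation > vth_m3
```

For instance, with an assumed 0.8-V threshold and 2-V trigger voltage, a 1.5-V deviation yields a current only when V1 is low.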
The presence of a current is defined as one logic state and the absence of a current as the other state.

2) Writing the Four-Transistor SRAM Cell: If an applied voltage deviation is large enough to force an internal node V1 or V2 beyond
Manuscript received September 27, 2001; revised March 11, 2003. This paper was recommended by Associate Editor S. Venkatraman. S. De Beer was with the Carl and Emily Fuchs Institute for Microelectronics, Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Pretoria 0002, South Africa. He is now with South African Micro-Electronic Systems (SAMES), Pretoria 0153, South Africa (e-mail: s_debeer@sames.co.za). M. du Plessis is with the Carl and Emily Fuchs Institute for Microelectronics, Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Pretoria 0002, South Africa. E. Seevinck was with the Carl and Emily Fuchs Institute for Microelectronics, Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Pretoria 0002, South Africa. He is now with Circuit Research International, 6522 Nijmegen, The Netherlands. Digital Object Identifier 10.1109/TCSI.2003.816316

Fig. 1. Four-transistor SRAM cell.

the trigger voltage of the opposite inverter, the state of the cell can be changed. The general scheme of writing cells in an SRAM array is to apply the data to all cells and then to select which cells to write. Here, the selection is done by reducing the supply voltage of the cells to be written, so that the trigger voltages of the inverters are shifted. The power supply can be reduced either by lowering the voltage of both P1 and P2 or by raising that of both N1 and N2. The data is applied to the cell by deviating one of the remaining nodes, N1 or N2, or P1 or P2, respectively. The logic state that needs to be written to the cell dictates which one of the remaining nodes is used for this. Consider that the power-supply reduction is achieved by lowering nodes P1 and P2. This lowers the trigger voltage of the cross-coupled inverter pair. The voltage of node N1 is now raised. If the initial state of the cell is such that node V1 is low, then M1 is in the linear region and the voltage of node N1 will appear at node V1. If this voltage is larger than the reduced trigger voltage of the inverter M3/M4, the state of the cell will change. In the case where the initial state of V1 is high, the deviation of N1 does not affect the cell because M1 is in cutoff. The reduced trigger voltage requires a smaller deviation at node N1 to create the necessary write condition; therefore, the reduction in power supply may be used to determine which cells are written. This will work as long as the deviation of node N1 is not large enough to write cells with full power supply, but is large enough to write those with reduced power supply.

3) Array Structure: In order for the four-transistor SRAM cell to be useful, an array of cells has to be formed. A useful structure, illustrated by means of a 2 × 2 array, is shown in Fig. 2. The four cells are denoted by Cell00 and Cell01 for bit 0 and bit 1 of word 0, respectively, and Cell10 and Cell11 for bit 0 and bit 1 of word 1, respectively.
A word may be read by lowering the voltage of the relevant RW line and monitoring the current in the IBO lines. When writing a word, it is selected by lowering the voltage of the relevant RW and W lines. The data to be written to the cells is placed onto the I and IBO lines by raising one of them according to the state of the data that needs to be written. A problem associated with this array is the fact that when one word is being written all others in the array are read, by a deviation in either the I or IBO lines. In this brief, several aspects of the four-transistor cell are explored. A different method of writing the cell is proposed, with the aim of improving the reliability of the cell as process conditions change. The new write method requires a different array structure and this is subsequently discussed. The noise margin under read and write conditions is

1057-7122/03$17.00 © 2003 IEEE


Fig. 2. Illustration of the previously proposed array structure based on a 2 × 2 array of cells.

evaluated and used as an effective vehicle in designing the magnitude of the deviations that need to be applied to the source nodes in order to achieve successful read and write operations.

II. IMPROVING CELL OPERATION

A. Limitations of the Previous Proposal

One definite disadvantage of the four-transistor SRAM cell array proposed previously [1] is the array power dissipation. When one word is being written, the data is applied to all cells in the array, in the form of a raised voltage on either node N1 or N2. This causes those cells without a power-supply reduction to be read and therefore to potentially conduct read currents, referred to here as the wasted write currents. Because these currents waste power, a new array structure should aim both to limit their occurrence and to reduce their magnitude. Basic simulations of the four-transistor cell across all process corners in the Austria Mikro Systeme 0.6-µm CMOS process [2] also reveal several shortcomings as the process conditions vary. The following problem areas may be identified by simulating a single cell. The previous proposal [1] suggests that all deviations should be in the order of 1.5 V if a typical CMOS process and a 5-V supply are assumed. This leaves one of the inverters with a supply voltage of 2 V during the write cycle. This is adequate considering that the threshold voltage is in the order of 0.8 V. If the worst-case speed model is used for simulation, the deviations need to be increased, because the threshold voltages increase, but this leaves one of the inverters with no power-supply headroom. The body effect, present in all devices where the source node is deviated, adds to this problem by further increasing the threshold voltage. The switching now takes place by means of subthreshold conduction, severely slowing down the write cycle. When using the worst-case power model, the low threshold voltages give rise to very large wasted write currents due to the high overvoltage.
This leads to excessively high array power dissipation. When using the worst-case one model, the balance between the device strength of the NMOS and PMOS devices is disturbed. This lowers the trigger voltages of the inverters to such an extent that a 1.5-V deviation of a single NMOS source node is sufficient to flip the state of the cell. Applying the data to be written to a selected word therefore causes all cells in the array to be written. The raised inverter trigger voltage present when using the worst-case zero model causes the 1.5-V deviation applied to one of the NMOS source nodes to be insufficient to write the cell, even when the power supply is reduced.
Fig. 3. Static write conditions for (a) the previously proposed write method and (b) the alternative write method.

These simulation results indicate that correct operation of the cell is not possible across all process variations if the magnitudes of the voltage deviations applied to the cells are kept constant. Experiments performed on a 2 × 2 array of cells confirm these findings. The cell array is operational, but very sensitive to the magnitude of the deviations. This can be considered an indication of small noise margins, and it causes reliability issues.

B. Alternative Write Method

Writing a cross-coupled inverter pair is based on modifying the individual inverter characteristics so that only one stable operating point remains [3]. In the previously proposed four-transistor SRAM cell, this is achieved by applying a deviation to one of the NMOS source nodes and reducing the power supply of both inverters. This is shown in Fig. 3(a) with the stable operating point at A. The severely reduced power supply of inverter M1/M2 is evident. To increase the reliability of the write cycle, a different approach should be used. Consider that in Fig. 1, a single NMOS source node N1, and the PMOS source node of the opposite inverter P2, are deviated. One inverter now has a raised trigger voltage and that of the other is lowered. These two effects work together to ease the establishment of only a single operating point, as shown in Fig. 3(b), with the stable operating point at A. The previously proposed scheme lowered one trigger voltage while leaving the other unchanged, because lowering the PMOS source and raising the NMOS source of a single inverter by equal amounts leaves the trigger voltage virtually unchanged. An advantage of the proposed scheme is that each inverter is affected by only a single power-supply reduction, and large overvoltages are preserved. This can be clearly seen in Fig. 3(b), where the power supply of inverter M1/M2 is no longer severely reduced.
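The effect of source-node deviations on an inverter trigger voltage can be illustrated with a first-order estimate. This is only a sketch under stated assumptions: long-channel square-law devices, no body effect, and hypothetical default values for the threshold voltages and for the strength ratio r = sqrt(k_n / k_p). It is not the simulation model used in the brief.

```python
def trigger_voltage(v_pmos_source, v_nmos_source, vtn=0.8, vtp=0.8, r=1.0):
    """First-order inverter trigger (switching) voltage in volts.

    Equating the square-law saturation currents,
        k_n (Vin - Vsn - Vtn)^2 = k_p (Vsp - Vin - |Vtp|)^2,
    gives Vin = (Vsp - |Vtp| + r (Vsn + Vtn)) / (1 + r), r = sqrt(k_n / k_p).
    All default parameter values are illustrative assumptions.
    """
    return (v_pmos_source - vtp + r * (v_nmos_source + vtn)) / (1.0 + r)


nominal = trigger_voltage(5.0, 0.0)             # full 5-V supply
nmos_raised = trigger_voltage(5.0, 1.0)         # NMOS source raised by 1 V
pmos_lowered = trigger_voltage(5.0 - 1.8, 0.0)  # PMOS source lowered by 1.8 V
both_equal = trigger_voltage(5.0 - 1.5, 1.5)    # both sources deviated by 1.5 V
```

With r = 1, raising the NMOS source raises the estimate, lowering the PMOS source lowers it, and equal deviations of both sources leave it unchanged, in line with the qualitative argument above. For r different from 1 the equal-deviation case shifts slightly, which is consistent with the wording "virtually unchanged".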
This prevents situations where devices conduct in or close to weak inversion and therefore allows the state of the cell to change quickly. Additionally, a lower NMOS source node deviation is now sufficient to create adequate static write conditions when writing the cell. This smaller deviation not only reduces the magnitude of the wasted write currents and helps to achieve


lower array power dissipation during the write cycle, but also aids in preserving high overvoltages on all conducting devices. A significant disadvantage is, however, also present. The deviations of both the PMOS and NMOS nodes are data dependent. It is therefore no longer possible to select a complete row of cells and write both binary values in one step. A row can be selected and certain cells can be written to one binary value. The row may then be selected using the other PMOS node and the remaining cells may be written to the other binary value. Alternatively, a scheme could be devised to set all cells in a row to the same value and then use the proposed write method to set certain cells to the opposite binary value. Whichever scheme is used, the write cycle becomes a two-phase procedure, which requires more time to complete and more complex control mechanisms to implement. High speed is usually a critical specification for SRAM systems, so the two-cycle write operation is a drawback, even though the state of the cell can be changed faster given the increased power supply to the inverters during the write cycle. Successful simulation of a cell using all worst-case models shows that the cell is functional across all process corners. This is achievable even if the deviations are kept constant. A set which works well is a PMOS source deviation of 1.8 V and an NMOS source deviation of 1 V.

III. ARRAY STRUCTURE

The four-transistor SRAM cell is only useful if it is possible to create an array of cells. To accommodate the newly proposed write mechanism, the previously proposed array structure [1] needs to be modified. As mentioned above, the write cycle has to be structured as two separate subcycles.
Two possibilities exist: either first write all the cells of the word that must be set to one logic state and then write the remaining cells to the opposite logic state, or set all the cells of one word to one logic state and then selectively write the opposite state. Setting all the cells of one word to a specified state can be accomplished by pulling a node of the cell to the opposite supply. Whether to choose the NMOS or PMOS node depends on the design of the inverters. For a small area, it is typically desired to design all cell transistors at minimum size. In this case, the trigger voltage of an inverter is in the region of 2 V, because the NMOS is a better device than the PMOS. It is therefore advantageous to use an NMOS source node because, with the trigger voltage closer to VSS than to VDD, static write conditions can be established at a smaller node voltage deviation. This means that the static write conditions are achieved at a higher power supply to the inverters, increasing the switching speed. The first possibility mentioned above has two cycles in which a voltage deviation is potentially applied to all cells in the array, while the second has only one such cycle, because setting all the cells of one word to a known value can be accomplished without affecting other cells in the array. The second method was therefore chosen in an effort to limit the occurrence of the wasted write currents. The 2 × 2 array structure of Fig. 4 shows how an array of cells can be implemented. A row of cells can be placed in a specific state by pulling the associated CL line to the power supply (VDD). The cells are thereby forced into a state where M1 and M4 of Fig. 1 are on and thus V1 is low. This state is defined as a logic zero. After this, certain cells in a row may be placed in a logic one state by lowering the voltage on the RW line and raising the voltage on specific DIO lines.
Doing this will cause other cells in other rows of the array to have their DIO-line voltage raised, and these will conduct a wasted write current if they are in the low state. Reading a word is accomplished by lowering the voltage on the RW line. This causes a current to flow in the DIO line if the cell connected to that line is in a logic zero state. Otherwise, no current will flow.
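The two-phase write and the current-mode read described above can be captured in a purely behavioral model. The sketch below tracks logic values only, not voltages or currents; the class and method names are hypothetical and serve only to illustrate the sequence of operations.

```python
class FourTransistorArray:
    """Behavioral model of the proposed array (logic values only)."""

    def __init__(self, words, bits):
        # Assumption for illustration: every cell starts at logic zero.
        self.cells = [[0] * bits for _ in range(words)]

    def write(self, word, data):
        """Two-phase write; returns how many cells conduct wasted write current."""
        # Phase 1: pull the CL line of the row to VDD, forcing logic zero.
        self.cells[word] = [0] * len(data)
        # Phase 2: lower RW for this row and raise DIO on columns that become one.
        raised = [bit == 1 for bit in data]
        # Cells in other rows on a raised DIO line are parasitically read
        # if they hold logic zero (V1 low).
        wasted = sum(
            1
            for w, row in enumerate(self.cells)
            if w != word
            for col, cell in enumerate(row)
            if raised[col] and cell == 0
        )
        self.cells[word] = [1 if up else 0 for up in raised]
        return wasted

    def read(self, word):
        """Lower RW: a DIO current flows for logic zero, none for logic one."""
        current_flows = [cell == 0 for cell in self.cells[word]]
        return [0 if flows else 1 for flows in current_flows]


array = FourTransistorArray(words=2, bits=2)
wasted_first = array.write(0, [1, 0])
wasted_second = array.write(1, [1, 1])
```

In this 2 × 2 example each write parasitically reads one cell in the unselected row, and reading back either word returns the data that was written.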

Fig. 4. Illustration of the newly proposed array structure based on a 2 × 2 array of cells.

Compared to the array previously proposed, this implementation has several advantages. Functional operation is now possible across all process corners using a constant set of node deviations. This indicates greater reliability of the system. This array requires routing of five lines per cell (CL, RW, DIO, VDD, and VSS), whereas the previous proposal required six lines to be routed (RW, W, I, IOB, VDD, VSS). Given the low resistance of the p substrate in typical submicron CMOS processes, the VSS line could be omitted in both cases, but the VDD line in the previous proposal should not be omitted because of the high resistance of the n well. Such a scenario would change the number of lines that need to be routed to each cell to four and five, respectively. Routing fewer lines should reduce the cell size. The wasted power during the write cycle is significantly reduced by two mechanisms. First, it is possible to use smaller DIO line deviations. A nominal DIO line deviation of 1 V can be used instead of the 1.5-V deviation previously proposed. This lowers the wasted current from 80 to 20 µA per cell. Second, under the assumption of equal probability data, half the bits of a word will be written by raising the voltage of half the DIO lines in the array. Of the cells connected to these lines, one half will be in a state causing them to be parasitically read. This means that only one quarter of all cells in the array conduct wasted write current. In the previously proposed scheme, one half of all cells in the array will waste power during the write cycle, because each cell has a deviation in either the I line or the IOB line applied to it. For example, for a 1024 × 32 array the wasted write current is reduced from 1.31 A to 163 mA when using the typical mean simulation model, a reduction of 87.5%. This reduction, however, is based on data probabilities.
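The array-level figures quoted above follow directly from the per-cell currents and the fraction of cells that conduct. The short calculation below reproduces them; the function name is an illustrative assumption, while the numbers are those given in the text.

```python
def wasted_write_current(words, bits, per_cell_amps, conducting_fraction):
    """Total wasted write current in amperes for a words x bits array."""
    return words * bits * per_cell_amps * conducting_fraction


# Previous scheme: 80 uA per cell, half of all cells conduct.
previous = wasted_write_current(1024, 32, 80e-6, 0.5)
# Proposed scheme: 20 uA per cell, one quarter of all cells conduct.
proposed = wasted_write_current(1024, 32, 20e-6, 0.25)
reduction = 1.0 - proposed / previous
```

Here `previous` evaluates to about 1.31 A, `proposed` to about 164 mA, and `reduction` to 0.875. Calling the same function with a conducting fraction of 1.0 for both schemes gives the worst-case data scenario, in which every cell conducts.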
When considering the worst-case data scenario, that is, when all cells in the array conduct a wasted write current, the magnitude of this current is reduced from 2.62 A to 655 mA. This is a reduction of 75%, once again based on the typical mean model. This reduction is a direct result of the smaller NMOS source node deviation. The only disadvantage compared to the previously proposed array structure is the need for a two-cycle write operation. The proposed new write method, associated array structure, and set of deviations were verified experimentally by manufacturing a 2 × 2 array of cells in the Austria Mikro Systeme 0.6-µm CMOS process. The array was found to operate correctly, and compared to the previously proposed write method and array structure it was found to be less sensitive to the precise value of the deviations. This indicates a larger noise margin and greater reliability. The two-cycle write method


In one set the PMOS node is lowered in steps, and in the second set the NMOS node is raised in steps. The sets of transfer characteristics are generated using a circuit simulator and the models supplied by the manufacturer [2]. One transfer characteristic of each set is used in the noise margin calculation algorithm. The deviation of the RW line is termed Y and that of the DIO line on the opposite inverter X. The noise margin as a function of Y, while X is zero [Fig. 6(a)], and the noise margin as a function of X, while Y is zero [Fig. 6(b)], have been calculated. The first situation indicates the noise margin of a cell while only the RW line is deviated away from the supply by an amount Y, that is, the cell is being read. The noise margin as a function of X, while Y is zero, is the noise margin of the cell while only the DIO line voltage is raised, that is, the cell is being parasitically read because other cells in the array are being written. A set of (X, Y) points where the static noise margin is zero has also been found. These points define the zero static noise margin boundary and therefore the set of minimum source node deviations required to achieve static write conditions. This is given in Fig. 6(c) for all process corners. These three figures can be used together to design suitable magnitudes for the RW line and DIO line voltage deviations.

V. VOLTAGE-DEVIATION DESIGN

In order to design the source node deviations, all three plots of Fig. 6 have to be considered together. In general, it is desired to keep the noise margin during read equal to the noise margin during the parasitic read, because the smaller of the two will be the noise margin of the cell. The method of design could therefore be to choose Y and X using Fig. 6(a) and (b), respectively, such that the two noise margins are equal. A second constraint that has to be satisfied is that the selected X and Y deviations together have to create static write conditions. If the selected point is plotted on Fig. 6(c), it should lie on or above the zero noise margin line. If the selected point lies on the zero noise margin line, the cell is on the verge of being written. To ensure reliability of the write cycle, the selected point should lie above the zero noise margin line to guarantee that static write conditions will still exist under all conditions. Designing the deviations therefore necessitates finding a set that yields large and equal noise margins as well as static write conditions. Using the three graphs, the following deviation scheme was devised. The standard design point for the deviations is X = 1 V and Y = 1.8 V. This was selected because the static write conditions are achieved for all process conditions at a low X deviation and an acceptable Y deviation. Equal noise margins of 0.6 V are achieved for the typical mean case. The selected point also lies at least 0.1 V beyond any zero noise margin line, thereby introducing a safety margin of 0.1 V for the static write conditions. A reason for the lower power consumption can be seen here. The X deviation has been reduced from 1.5 V in the previous proposal to 1.0 V. This is an important factor in reducing the power consumption due to wasted write currents, because it reduces their magnitude.
The read current is increased when comparing to the previous proposal because the Y deviation is increased from 1.5 to 1.8 V. This should simplify the current sense amplifier design without causing a severe power consumption penalty. Even though all noise margins change as the process conditions change, the chosen point guarantees operation across all conditions. It is however desirable to further improve this situation and the noise margin analysis design procedure shows a possible method. Referring to Fig. 6(b), it is advantageous to decrease the X deviation for the worst-case power and worst-case one situation, and increase it for the worst-case speed and worst-case zero situations. This is equivalent to

Fig. 5. (a) Cross-coupled inverter pair with worst-case series-voltage noise sources inserted. (b) Graphical representation and derivation of the worst-case series-voltage noise margin.

is therefore considered to be a fair sacrifice given that cell operation is now more reliable.

IV. STATIC-NOISE MARGIN

The worst-case series-voltage noise margin of a cross-coupled inverter pair is defined as the dc voltage Vn that can be tolerated, when applied as shown in Fig. 5(a), without upsetting the state of the cross-coupled inverter pair. It can be derived by superimposing the voltage transfer characteristics of the two inverters and finding the maximum square [4], as shown in Fig. 5(b). A simple algorithm to find the maximum square is to define a new (u, v) coordinate system that is rotated 45° with respect to the original axes. The diagonal of the maximum square now lies parallel to the v axis. The transfer function points are translated to the new coordinate system and the v distance between the two curves is calculated as a function of u. The smaller, in magnitude, of the maximum and minimum values of this distance is the length of the diagonal of the smaller maximum square. This, when translated back to the original coordinates, is the worst-case static noise margin [5].

When any node of the four-transistor SRAM cell is deviated from the supply voltage, a reduction in noise margin takes place. This is a direct result of the modified inverter characteristics that occur when any node is deviated away from its supply. The two situations which need to be analyzed are the reduction in noise margin when: 1) a cell is being read, and 2) a different cell in the array is being written, causing a parasitic read. Further, a zero noise margin implies that no external noise input is required to cause the cell to lose its current state. This is equivalent to static write conditions being present. For the four-transistor SRAM cell, when node voltage deviations are applied, the two inverter transfer characteristics differ. Two sets of several characteristics are therefore used in the noise margin analysis.
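The maximum-square search in the rotated coordinate system can be sketched numerically as follows. This is an illustrative implementation under stated assumptions, not the authors' program: the tanh-shaped `inverter_vtc` is a hypothetical stand-in for simulated transfer characteristics, and the helper names, sample count, and bisection bracket are choices made here. The rotation used is u = (x - y)/sqrt(2), v = (x + y)/sqrt(2), so that the relevant diagonal of each square lies along v.

```python
import math


def inverter_vtc(vin, vdd=5.0, vm=2.5, gain=50.0):
    """Hypothetical smooth inverter characteristic (not a device model)."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - vm)))


def _root_increasing(residual, lo, hi, iterations=60):
    """Bisection for the root of a strictly increasing function."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def static_noise_margin(f1, f2, vdd=5.0, samples=801):
    """Worst-case static noise margin of the pair y = f1(x), x = f2(y).

    f1 and f2 must be monotonically decreasing transfer characteristics.
    A non-positive result means one lobe of the butterfly plot has vanished.
    """
    root2 = math.sqrt(2.0)
    lo, hi = -vdd * root2, 3.0 * vdd * root2  # generous bracket for v
    d_max, d_min = float("-inf"), float("inf")
    for i in range(samples):
        u = (-vdd + 2.0 * vdd * i / (samples - 1)) / root2
        # In rotated coordinates x = (v + u)/sqrt(2) and y = (v - u)/sqrt(2).
        v1 = _root_increasing(
            lambda v: (v - u) / root2 - f1((v + u) / root2), lo, hi)
        v2 = _root_increasing(
            lambda v: (v + u) / root2 - f2((v - u) / root2), lo, hi)
        d = v1 - v2  # v distance between the two curves at this u
        d_max = max(d_max, d)
        d_min = min(d_min, d)
    # The smaller diagonal, divided by sqrt(2), is the side of the smaller square.
    return min(d_max, -d_min) / root2


snm_symmetric = static_noise_margin(inverter_vtc, inverter_vtc)
snm_skewed = static_noise_margin(
    inverter_vtc, lambda v: inverter_vtc(v, vm=1.5))
```

For these idealized high-gain curves the symmetric pair gives a margin a little below half the 5-V supply, and shifting the trigger voltage of one inverter from 2.5 V to 1.5 V reduces the margin to just under 1.5 V, which illustrates how deviated characteristics shrink one lobe of the plot.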


Fig. 6. (a) Noise margin as a function of Y for X = 0 (cell read), (b) as a function of X for Y = 0 (parasitic read), and (c) the zero noise margin trajectories (cell write condition) for the different process corners.

scaling the X deviation depending on the quality of the NMOS transistor. This decreases the spread on the noise margin and, importantly, counters the low noise margin of the worst-case one situation. When considering the Y deviation, applying similar scaling that is dependent on the PMOS device quality achieves similar results. Both deviation scaling schemes also combine to increase the write safety margin for the worst-case speed model, and reduce the excessive safety margin associated with the worst-case power model. Because this scheme of scaling the deviations as a function of device quality stabilizes the overvoltage as the process conditions change, it also reduces the spread of the currents flowing in the inverter opposite to the one where a specified single deviation is being applied. This is especially true for the X deviation, where a spread of 60 µA on the wasted write currents can be reduced to 25 µA by designing for a variation of 0.15 V around 1 V as the quality of the NMOS device changes. The spread on the read current can also be reduced by varying the Y deviation by 0.2 V around 1.8 V as the quality of the PMOS device changes. This once again saves power but, more significantly, raises the minimum current that needs to be detected and lowers the maximum current, which can potentially further reduce the complexity of the current sense amplifier. The variation in the X deviation compensates for quality variations of the NMOS device, and that of the Y deviation compensates for those of the PMOS device. These variations may therefore be generated using the device in question as a reference. If the deviations are generated using the threshold voltage of the respective device as a reference, a change in

TABLE I COMPARISON BETWEEN THE PREVIOUS AND NEW WRITE METHOD AND ARRAY STRUCTURE PROPOSALS

device quality which is largely due to a change in the threshold voltage, will produce the correct change in the voltage deviation. The currents flowing in the four-transistor SRAM cell during operation will cause voltage drops across the internal resistance of the source node drivers as well as the interconnect. These voltage drops will increase the effective deviations applied to the cells and thereby decrease the noise margins. Therefore, adequate noise margins are required, necessitating a large supply voltage. This fact could make the four-transistor SRAM cell difficult to use in low-voltage applications and could thereby also limit its usefulness in deep submicron technologies.


VI. CONCLUSION

This brief introduced modifications and advancements to a previously proposed four-transistor SRAM cell array. A new write method, whose advantages far outweigh its disadvantages, has been used in conjunction with an alternative array structure to obtain improved immunity to process conditions. A design method for the deviations that need to be applied to the source nodes in order to effect control, based on noise margin analysis, has been introduced. The newly proposed write method together with its array structure is compared to the previous proposal in Table I. All specifications given were simulated using the typical mean model.

REFERENCES
[1] T.-H. Joubert, E. Seevinck, and M. du Plessis, "A CMOS reduced-area SRAM cell," in Proc. IEEE Int. Symp. Circuits and Systems ISCAS'00, Geneva, Switzerland, May 28-31, 2000, pp. III-335-III-338.
[2] Austria Mikro Systems, "0.6 µm CMOS CUP process parameters," Austria Mikro Systems International AG, Surrey, U.K., Doc. 9933011, Rev. B, Oct. 1998.
[3] K. Anami et al., "Design considerations of a static memory cell," IEEE J. Solid-State Circuits, vol. 18, pp. 414-417, Aug. 1983.
[4] E. Seevinck, F. List, and J. Lohstroh, "Static-noise margin analysis of MOS SRAM cells," IEEE J. Solid-State Circuits, vol. 22, pp. 748-754, Oct. 1987.
[5] J. Lohstroh, E. Seevinck, and J. De Groot, "Worst-case static noise margin criteria for logic circuits and their mathematical equivalence," IEEE J. Solid-State Circuits, vol. 18, pp. 803-807, Dec. 1983.

Fig. 1. Schematic diagram of the prior ANT logic.

TABLE I SIZES OF ANT LOGIC BLOCK SHOWN IN FIG. 1

A 1.25 GHz 32-Bit Tree-Structured Carry Lookahead Adder Using Modified ANT Logic
Chua-Chin Wang, Yih-Long Tseng, Po-Ming Lee, Rong-Chin Lee, and Chenn-Jung Huang
Abstract: In this brief, a 32-bit tree-structured carry lookahead adder (CLA) is proposed using the modified all-N-transistor (ANT) design. The 32-bit CLA not only has a low transistor count, but also occupies a small area. Moreover, the post-layout simulation results given by TimeMill show that the clock used in this 32-bit CLA can run at up to 1.25 GHz with a 3.3-V power supply. The output of the proposed CLA is ready after 3.5 cycles. The proposed circuit is also easy to expand for long data additions. A physical chip is fabricated to verify the proposed circuit on silicon.

Index Terms: Tree-structured carry lookahead adder (CLA), cell, all-N-transistor (ANT).

I. INTRODUCTION

High-speed operation has long been a target of circuit designers owing to the speed demands of supercomputing, CPUs, etc. Hence, improving adder designs has received considerable attention [1]-[6]. CMOS dynamic logic is one of the promising options for reaching GHz operation in adder designs [5]. However, domino logic [1] cannot be noninverting; the adder in [4] can only process short data words; all-N-logic [5] and robust single-phase clocking [6] cannot operate correctly under clocks with short rise time or fall time; single-phase logic [2] and Zipper CMOS [3] contain slow P-logic blocks. An all-N-transistor (ANT) noninverting function block was proposed to resolve these difficulties [7], and it can be used to design a high-speed carry lookahead adder (CLA). However, a drawback of the programmable logic array (PLA)-styled CLA in [7] is that it can hardly run a pipelining operation owing to the need to generate a back propagation signal. In this brief, we enhance the ANT logic by proposing an o cell and a tree-structured scheme for CLAs. The simulation results of the proposed design are also given to demonstrate its high-speed performance. A physical chip is implemented in the Taiwan Semiconductor Manufacturing Company (TSMC) 0.35-µm 1P4M CMOS technology to verify the function of the proposed CLA design. The output is available with a total of 3.5 cycles of delay after the input data are fed.

II. 32-BIT TREE-STRUCTURED CLA

A. Prior ANT Function Unit

An ANT logic is presented in Fig. 1. The main feature of this design is the presence of the feedback transistor pair, P3 and N3, between the evaluation block and the output. P3 and N3, respectively, provide an extra charging path and discharging path, thus accelerating the evaluation. Notably, the ANT in Fig. 1 is noninverting. The detailed operations of the ANT are described as follows [7].

1) When clk = 0, P1 is on and the gate of P2 is precharged to Vdd. Then, P2 is off and N4 is off. This keeps the output at the previous state.

Manuscript received March 20, 2001; revised November 20, 2001 and December 12, 2002. This work was supported in part by the National Science Council under Grant NSC 89-2218-E-110-014 and Grant 89-2218-E-110-015. This paper was recommended by Associate Editor Y. Ismail. C.-C. Wang, Y.-L. Tseng, P.-M. Lee, and R.-C. Lee are with the Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung 80424, Taiwan, R.O.C. (e-mail: ccwang@ee.nsysu.edu.tw). C.-J. Huang is with the Department of Computer Science and Information Education, National Taitung Teachers College, Taitung 95004, Taiwan, R.O.C. (e-mail: cjh@cc.ntttc.edu.tw). Digital Object Identifier 10.1109/TCSI.2003.816339
