

Very-large-scale integration (VLSI) is the process of creating integrated circuits by combining thousands of transistors on a single chip. VLSI began in the 1970s, when complex semiconductor and communication technologies were being developed. The first semiconductor chips held two transistors. Subsequent advances added more and more transistors and, as a consequence, more individual functions or systems were integrated over time.

The first integrated circuits held only a few devices, perhaps as many as ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or more logic gates on a single device. This level is now known retrospectively as small-scale integration (SSI). Improvements in technique led to devices with hundreds of logic gates, known as medium-scale integration (MSI), and further improvements led to large-scale integration (LSI), i.e. systems with at least a thousand logic gates. Current technology has moved far past this mark: today's microprocessors have many millions of gates and billions of individual transistors.

At one time there was an effort to name and calibrate various levels of integration above VLSI, and terms like ultra-large-scale integration (ULSI) were used. The huge number of gates and transistors available on common devices has rendered such fine distinctions moot, and terms suggesting greater-than-VLSI levels of integration are no longer in widespread use.

As of early 2008, billion-transistor processors are commercially available. This is expected to become more commonplace as semiconductor fabrication moves from the current generation of 65 nm processes to the next 45 nm generation. Current designs, unlike the earliest devices, use extensive design automation and automated logic synthesis to lay out the transistors, enabling higher levels of complexity in the resulting logic functionality. Certain high-performance logic blocks, such as the SRAM (Static Random Access Memory) cell, are still designed by hand to ensure the highest efficiency.

1.2 VLSI DESIGN FLOW

The system prototyping methodology is a natural outgrowth of recent developments in software and hardware facilities intended to make it simple for designers with an idea for a particular application to turn that idea into a working system based on very-large-scale integrated chips. Today's VLSI CMOS technologies deliver individual integrated circuits (ICs) containing millions of gates, sufficient to implement substantial systems-on-chip or major subsystems-on-a-chip. System-on-chip design may involve expertise from many fields of electronics, such as signal processing, communication and device physics. It is unreasonable to expect the architect of a speech recognition system, for example, to be an expert in device physics as well as in signal processing. The Mead-Conway methodology for integrated-circuit design makes VLSI technology available to such application designers.

The structured design methodology of Mead and Conway is an approach to VLSI system design that attacks the problems of complex chip designs. It is similar in concept to structured programming: the design proceeds in a top-down manner in which the problem is decomposed and refined. The methodology has two major parts: hierarchy and regularity. Hierarchical techniques have long been used to design complex systems. Hierarchies are used to partition designs, and common parts of a design can be factored out and specified only once. By introducing regularity into a system, the design problem is reduced in complexity, because subunits are replicated many times and connections between units are simplified. Regularity means that the hierarchical decomposition of a large system should result in blocks that are not only simple but also, as far as possible, similar. A good example of regularity is the design of array structures consisting of identical cells, such as a parallel multiplication array.

Regularity can exist at all levels of abstraction. If the designer has a small library of well-defined and well-characterized basic building blocks, a number of different functions can be constructed from them. Regularity usually reduces the number of different modules that need to be designed and verified at every level of abstraction.
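To make the idea of regularity concrete, the following minimal Python sketch builds an unsigned array multiplier out of one replicated cell (an AND gate feeding a full adder). The function names and the cell arrangement are illustrative assumptions, not a description of any particular chip; the point is only that an n-by-n multiplier needs n*n copies of a single, separately verifiable cell.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def array_multiply(x, y, n):
    """Multiply two n-bit unsigned integers with an n x n grid of identical cells.

    Hypothetical illustration: every cell is the same AND gate plus full adder,
    so only one cell type has to be designed and verified.
    Returns (product, number_of_cells_used).
    """
    x_bits = [(x >> i) & 1 for i in range(n)]
    y_bits = [(y >> j) & 1 for j in range(n)]
    acc = [0] * (2 * n)          # running partial-sum bits
    cells = 0
    for j in range(n):           # one row of cells per multiplier bit
        carry = 0
        for i in range(n):       # identical cell replicated along the row
            partial_product = x_bits[i] & y_bits[j]
            acc[i + j], carry = full_adder(acc[i + j], partial_product, carry)
            cells += 1
        acc[j + n] = carry       # carry out of the row
    product = sum(bit << k for k, bit in enumerate(acc))
    return product, cells


if __name__ == "__main__":
    print(array_multiply(13, 11, 4))
```

A 4-bit by 4-bit multiplication therefore uses 16 instances of the same cell, which is the kind of replication the structured design methodology aims for.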


VLSI design style mainly uses three domains of design description: the behavioral domain, which describes the function of the design; the structural domain, which describes the form of the implementation; and the physical domain, which describes the physical implementation of the design. There are many possible representations of a circuit in each domain, and a judicious choice of representations is important in tool design. A simplified view of the design flow is shown in Fig 1.1. Regardless of the actual size of the project, the basic principles of structured design will improve the prospects of success.

[Figure: flowchart with the stages System Specification; Functional (Architecture) Design; Functional Verification; Logic Design; Logic (Gate-Level) Representation; Logic Verification; Circuit Design; Circuit Modeling; Circuit Representation; Circuit Verification; Physical Design; Layout Representation; Layout Verification; Fabrication and Testing]

Fig 1.1: VLSI design flow

At the beginning of a design it is important to specify the system requirements without unduly restricting the design. The object is to describe the purpose of the design in all its aspects, such as the functions to be realized, timing constraints and power dissipation requirements.

Functional design specifies the functional relationships among subunits or registers. In general, a description of the IC in either the functional or the block diagram domain consists of both the input-output description and the way that this behavior is to be realized in terms of subordinate modules. In turn, each of these modules is described both in terms of its input-output behavior and as an interconnection of other modules. These hierarchical ideas apply to all the domains. The internal description of a module may be given in another domain. If a module has no internal description, the design is incomplete. Ultimately this hierarchy stops when the internal description is in terms of mask geometry, which is primitive. Hierarchy and modularity are also used in block diagrams and computer programs. In these domains hierarchy suppresses unnecessary details, simplifies system design through a divide-and-conquer strategy, and leads to more easily understood designs that are more readily debugged and documented.

To summarize: to design a digital system, we first specify the system performance, which is called the system specification. The system is then broken down into subunits or registers, giving a functional design that specifies the functional relationships among them. Architecture usually means the functional design and system specification, and often includes part of the subsequent logic design. The next step is the logic design of the networks that constitute the subunits or registers. When a system architecture or logic network has been designed, its performance and errors are checked by CAD programs, a step called logic simulation. The object of logic design is to decide the overall structure of the blocks and their interconnection pattern, and to specify the structure of the data path and its control sequences.
A logic simulator performs logic verification, taking into account the propagation delays of the interconnection signals and the element delays. The simulator also checks whether the network contains hazards. Logic design and simulation is a key issue in VLSI CAD. The flow of the logic design process is determined by the level at which the design can begin: system level, behavioral level or functional level. Logic design consists of a series of design steps leading from a higher level to a circuit description at the logic level.

For electronic circuit design and simulation, CAD programs perform complex numerical analysis of the nonlinear differential equations that characterize electronic circuits. Because the calculation must finish within a reasonable time while keeping the required accuracy, many advanced numerical analysis techniques are used. The CAD programs usually yield the analysis of transient behavior, direct-current performance, stationary alternating-current performance, temperature, signal distortion, noise interference, sensitivity and parameter optimization of the electronic circuits.

The layout system is used to convert block/cell placement data into actual locations, and to construct a routing maze containing all spacing rules. The format used for relative cell placement data is the same for automatic and manual placements in order to simplify their interchange. In fact, the output of the automatic placement program can be modified by hand before it is fed into the chip building step as manual placement data. The layout of random-logic networks is the most time-consuming stage in the entire sequence of LSI/VLSI chip design. After finishing the layout, designers usually use CAD programs to check whether the layout conforms to the layout rules.

As the integration size of LSI/VLSI chips becomes larger, design verification and testing at each design stage become vitally important, because errors that sneak in from previous design stages are more difficult to find and more expensive to fix: once an error is found, the previous design stages must be redone. As the integration size increases, the test time increases very rapidly, so it is crucial to find a way to test in as short a time as possible, though good solutions appear very difficult to find. Complete test and design verification with software or hardware (i.e., computers specialized in testing) is usually done to find design mistakes. The last domain in which the design of an IC can exist includes the mask set and, of course, the final fabricated chip, followed by prototype testing.
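The hierarchical description discussed in this section (each module is an interconnection of other modules until a primitive is reached, and a module without an internal description makes the design incomplete) can be sketched as a simple completeness check. The module names and the dictionary representation below are hypothetical and serve only as an illustration.

```python
# Hypothetical design hierarchy: module name -> list of submodules.
# An empty list marks a primitive (described directly as mask geometry).
DESIGN = {
    "chip": ["datapath", "control"],
    "datapath": ["adder", "register"],
    "adder": ["full_adder_cell"],
    "register": ["latch_cell"],
    "full_adder_cell": [],
    "latch_cell": [],
}


def undescribed_modules(design, top):
    """Return the modules reachable from `top` that have no internal description."""
    missing, seen, stack = set(), set(), [top]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name not in design:
            missing.add(name)            # referenced but never described
        else:
            stack.extend(design[name])   # descend into the submodules
    return sorted(missing)


if __name__ == "__main__":
    # "control" is referenced by "chip" but never described,
    # so this example design is incomplete.
    print(undescribed_modules(DESIGN, "chip"))  # ['control']
```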

Memory refers to the physical devices used to store programs or data on a temporary or permanent basis for use in a computer or other digital electronic device. The term primary memory is used for information held in fast physical systems (e.g. RAM), as distinct from secondary memory, which refers to physical devices for program and data storage that are slow to access but offer higher memory capacity. Primary memory stored on secondary memory is called "virtual memory". The term "storage" is often used for traditional secondary memory such as tape, magnetic disks and optical discs (CD-ROM and DVD-ROM). The term "memory" is often associated with addressable semiconductor memory, i.e. integrated circuits consisting of silicon-based transistors, used for example as primary memory but also for other purposes in computers and other digital electronic devices.

There are two main types of semiconductor memory: volatile and non-volatile. Examples of non-volatile memory are flash memory and ROM, PROM, EPROM and EEPROM memory. Examples of volatile memory are primary memory (dynamic RAM, DRAM) and fast CPU cache memory (static RAM, SRAM), which is fast but energy-consuming and offers lower memory capacity per unit area than DRAM.

Semiconductor memory is organized into memory cells or bistable flip-flops, each storing one binary bit (0 or 1). The memory cells are grouped into words of fixed word length, for example 1, 2, 4, 8, 16, 32, 64 or 128 bits. Each word can be accessed by a binary address of N bits, making it possible to store 2^N words in the memory. This implies that processor registers are normally not considered memory, since they store only one word and do not include an addressing mechanism.
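The relationship between address width, word length and capacity stated above can be checked with a few lines of Python. The function names are illustrative, not taken from the report.

```python
def words_addressable(address_bits):
    """Number of words reachable with an N-bit binary address (2^N)."""
    return 2 ** address_bits


def capacity_bits(address_bits, word_length):
    """Total storage in bits for a memory of 2^N words of fixed word length."""
    return words_addressable(address_bits) * word_length


def address_bits_needed(words):
    """Smallest N such that 2^N >= words (words must be >= 1)."""
    return (words - 1).bit_length()


if __name__ == "__main__":
    print(words_addressable(10))      # 1024 words
    print(capacity_bits(10, 8))       # 8192 bits, i.e. 1 KiB of 8-bit words
    print(address_bits_needed(1025))  # 11 address bits
```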


SRAM operates in three modes:
1. Standby (idle) mode
2. Read mode
3. Write mode

In idle mode, the SRAM is disabled. In read mode, data is read from a selected address location. In write mode, data is written to a particular address location.

2.3 APPLICATIONS

SRAM is used in many embedded applications. Many categories of industrial and scientific subsystems and automotive electronics use static RAM. Some amount (kilobytes or less) is also embedded in practically all modern appliances, toys, etc., that implement an electronic user interface. Several megabytes are used in complex products such as digital cameras, cell phones and synthesizers.
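The three SRAM operating modes described in this chapter (standby, read and write) can be sketched as a small behavioral model. This is only an illustration at the word level: the class name and the `enable` and `write` control names are assumptions, and nothing here models the transistor-level cell.

```python
class SramModel:
    """Word-level behavioral sketch of an SRAM with standby, read and write modes."""

    def __init__(self, address_bits, word_bits):
        self.word_mask = (1 << word_bits) - 1
        self.cells = [0] * (1 << address_bits)   # 2^N words, all cleared

    def access(self, enable, write, address, data=0):
        """Perform one access; returns the word read, or None otherwise."""
        if not enable:
            return None                           # standby: array disabled, contents held
        if write:
            self.cells[address] = data & self.word_mask   # write mode
            return None
        return self.cells[address]                # read mode


if __name__ == "__main__":
    ram = SramModel(address_bits=4, word_bits=8)
    ram.access(enable=True, write=True, address=3, data=0xA5)
    ram.access(enable=False, write=True, address=3, data=0x00)  # ignored in standby
    print(hex(ram.access(enable=True, write=False, address=3)))  # 0xa5
```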


This is one of the most common approaches to reducing leakage currents: two different types of transistors are fabricated on the chip, a high-Vth type that lowers sub-threshold leakage current and a low-Vth type used where switching speed matters. Based on the multi-threshold technologies previously described, several multiple-threshold circuit design techniques have been developed.

Multi-threshold voltage CMOS (MTCMOS) reduces leakage by inserting high-threshold devices in series with low-Vth circuitry. Fig 3.1(a) shows the schematic of an MTCMOS circuit. A sleep control scheme is introduced for efficient power management. In the active mode, Sleep is set low and the high-Vth sleep control transistors (MP and MN) are turned on. Since their on-resistances are small, the virtual supply voltages (Virtual Vdd and Virtual GND) function almost as real power lines. In the standby mode, Sleep is set high, MN and MP are turned off, and the leakage current is low. In fact, one type of high-Vth transistor is enough for leakage control. Fig 3.1(b) and (c) show the PMOS insertion and NMOS insertion schemes, respectively. The NMOS insertion scheme is preferable: because the NMOS on-resistance is smaller at the same width, the NMOS device can be sized smaller than the corresponding PMOS device. MTCMOS can easily be implemented on top of existing circuits.

Fig 3.1 MTCMOS Technique

In the active mode, the sleep control signal SL is set low and the control transistors are turned on. Since the on-resistance of the high-Vt sleep transistors is low, the virtual VDD and VSS rails act like power supply lines. In the standby mode, SL is set high and the high-threshold sleep control transistors are turned off, resulting in low leakage current.

Short-channel transistors require lower power supply levels to reduce power consumption. This forces a reduction in the threshold voltage, which causes a substantial increase in weak-inversion current. Among the leakage control techniques proposed so far, power gating, also known as MTCMOS, has traditionally been the most effective way to lower leakage. Power gating uses a PMOS or an NMOS transistor to disconnect the circuit's supply voltage from the logic when the logic is inactive. This technique can reduce leakage by more than two orders of magnitude with negligible speed degradation. MTCMOS power gating reduces leakage currents by disconnecting the power supply from specific portions of a circuit when those portions are not needed.

Multi-threshold CMOS (MTCMOS) has been described as a method to reduce standby leakage current in a circuit by using a high-threshold MOS device to decouple the logic from the supply or ground during long idle periods, or sleep states. In an MTCMOS circuit, the logic block is constructed from low-threshold devices, and either the power supply is gated by a high-threshold header switch or the ground terminal is gated by a high-threshold footer switch. During active operation of the MTCMOS circuit shown in Fig 3.2, the power interrupt switch is turned on by the SLEEPN (or SLEEPP) signal, and the current dissipated by the logic is drawn through the interrupt switch. This reduces the drive voltage seen by the logic and therefore reduces logic performance. To compensate for the reduction in logic performance:
- Larger power supply voltages can be used, at the expense of increased active power for similar performance.
- Larger device widths for the power interrupt switch can be used to minimize the performance impact, at the expense of increased area and increased power for entering and exiting sleep mode.
- Device implants can be adjusted to allow moderately high threshold values, which increases performance at the expense of increased device leakage during idle mode.

Fig 3.2: MTCMOS Logic

The MTCMOS scheme is a good technique for reducing both gate and sub-threshold leakage, but it slows circuits down considerably as VDD is scaled below 0.6 V.



A principal source of Igate is the tunneling of electrons through the gate oxide. The probability of electron tunneling is a strong function of the applied electric field and of the barrier thickness, which is simply Tox, so a small change in Tox has a tremendous impact on Igate. For example, in MOS devices with SiO2 gate oxides, a difference in Tox of only 2 Å can result in an order-of-magnitude increase in Igate, so that reducing Tox from 18 Å to 12 Å increases Igate by approximately 1000×. The most effective way to control Igate is through the use of new materials, high-k dielectrics, but such materials are not expected to be manufacturable until approximately 2007 at the earliest.

The issue of power dissipation due to gate leakage arises in two contexts. In the standby mode, when a circuit is not undergoing any active operation, leakage may be controlled through various means, prominent among which are the use of multiple-threshold CMOS sleep transistors, the assignment of circuit inputs to send the circuit into a low-leakage state, and body biasing. In the active mode, i.e., in normal operation, neither sleep transistors nor state assignment is viable.

Recent studies show that at the 90 nm node, leakage can contribute over 40% of the total power. Leakage power in modern CMOS VLSI circuits has become comparable to dynamic power dissipation. Typically, the sub-threshold leakage current dominates the device off-state leakage because low-Vth transistors are employed in logic cell blocks to maintain circuit switching speed in spite of decreasing VDD levels. The Multi-Threshold CMOS (MTCMOS) technique can significantly reduce the sub-threshold leakage currents during the circuit sleep (standby) mode by adding high-Vth power switches (sleep transistors) to low-Vth logic cell blocks.
This is because the stacked high-Vth sleep transistor, connected to the bottom of the pull-down network of all logic cells in the circuit, acts as a high-resistance element during the sleep mode, which limits the leakage current from the Vdd to the ground lines. At the same time, because of the stack effect, the sub-threshold leakage of the low-Vth transistors in the logic block itself goes down. Ideally, this leakage reduction is achieved with small performance degradation because, during the active mode of the circuit, the sleep transistor is fully on (i.e., it operates in the linear mode), and thus all low-Vth logic cells in the MTCMOS logic block can switch very fast. Unfortunately, the situation is different in real designs. During the active mode, the high-Vth sleep transistor acts as a small linear resistance placed at the bottom of the transistor stack to ground, causing the propagation delay of the cells in the logic block to increase. In addition, the virtual ground network itself acts as a distributed RC network, which causes the voltage of the virtual ground node to rise even further, thereby degrading the switching speed of the logic cells even more. The former effect is a function of the size of the sleep transistor, whereas the latter is a function of the physical distance of the logic cell from the sleep transistor.
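The sensitivity of gate leakage to oxide thickness quoted earlier in this section (one order of magnitude per 2 Å of SiO2) can be reproduced with a short calculation. The sketch below assumes a purely exponential dependence, which is a simplification, and the function name is illustrative.

```python
def gate_leakage_ratio(tox_from, tox_to, angstrom_per_decade=2.0):
    """Factor by which Igate grows when Tox shrinks from tox_from to tox_to (in angstrom).

    Assumes Igate changes by 10x for every `angstrom_per_decade` of oxide
    thickness, following the rule of thumb quoted in the text.
    """
    return 10 ** ((tox_from - tox_to) / angstrom_per_decade)


if __name__ == "__main__":
    # 18 A -> 12 A is a 6 A reduction, i.e. three decades of leakage.
    print(gate_leakage_ratio(18, 12))  # 1000.0
```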

Fig 3.3: (a) MTCMOS circuit structure (b) The circuit model with the virtual ground interconnect and the sleep transistor modeled as resistors Ri and Rs, respectively

Fig 3.3(a) depicts a logic block LB, in which a group of low-Vth logic cells is first connected to the virtual ground node and then, through a high-Vth sleep transistor S, to the actual ground, GND. Fig 3.3(b) models the virtual ground interconnection and the high-Vth sleep transistor, which behaves like a linear resistor in the active mode of circuit operation, as resistors Ri and Rs, respectively. The virtual ground is at a voltage Vx above the actual ground, i.e., Vx = I · (Rs + Ri), where I is the current flowing through the virtual ground sub-network and the sleep transistor. The voltage drop across Rs + Ri reduces the gate overdrive voltage of the MTCMOS logic cells (i.e., their Vgs value) from Vdd to Vdd − Vx. An optimal algorithm for placing sleep transistors in standard-cell-based layout design therefore minimizes the performance degradation of MTCMOS circuits due to the interconnect resistance of the virtual ground network.

Technology scaling causes sub-threshold leakage currents to increase exponentially. As technology scales into the deep-submicron (DSM) regime, standby sub-threshold leakage power increases exponentially with the reduction of the supply voltage (VDD) and the threshold voltage (Vth). For many event-driven applications, such as mobile devices, where circuits spend most of their time in an idle state with no computation, standby leakage power is especially detrimental to overall power dissipation. Multi-Threshold CMOS (MTCMOS) is an effective circuit-level methodology that provides high performance in the active mode and saves leakage power during the standby mode. The basic principle of the MTCMOS technique is to use low-Vth transistors to design the logic gates, where switching speed is essential, while high-Vth transistors are used to isolate the logic gates in the standby state and limit the leakage dissipation. In the active mode, the sleep transistor works as a resistor.
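The virtual ground relation given above, Vx = I · (Rs + Ri), and the resulting drop in Vgs from Vdd to Vdd − Vx can be evaluated numerically. The resistance, current and supply values in this sketch are made-up illustrative numbers, not measurements from the report.

```python
def virtual_ground_voltage(current, r_sleep, r_interconnect):
    """Vx = I * (Rs + Ri): voltage of the virtual ground above the real ground."""
    return current * (r_sleep + r_interconnect)


def effective_vgs(vdd, vx):
    """Gate drive seen by the logic cells once the virtual ground has risen by Vx."""
    return vdd - vx


if __name__ == "__main__":
    # Illustrative values only: 1 mA through Rs = 50 ohm and Ri = 10 ohm at Vdd = 1.2 V.
    vx = virtual_ground_voltage(current=1e-3, r_sleep=50.0, r_interconnect=10.0)
    print(f"Vx  = {vx:.3f} V")
    print(f"Vgs = {effective_vgs(1.2, vx):.3f} V")
```

A cell placed farther from the sleep transistor sees a larger Ri and therefore a larger Vx, which is why placement matters as well as sleep transistor size.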


A downside of using the Multi-Threshold CMOS (MTCMOS) technique for leakage reduction is the energy consumed during transitions between sleep and active modes. A charge recycling (CR) MTCMOS architecture was previously proposed to reduce the large amount of energy consumed during mode transitions in power-gated circuits. Considering the RC parasitics of the virtual ground and VDD lines, proper sizing and placement of the charge recycling transistors is key to achieving the maximum power saving.

The power gating technique provides low-leakage, high-performance operation by using low-Vt transistors for logic cells and high-Vt devices as sleep transistors that disconnect the logic cells from the power supply and/or ground. This multi-threshold CMOS technology reduces leakage in the sleep mode. One of the key concerns in MTCMOS is the wake-up latency of the circuit, defined as the time required to turn on the circuit after the wake-up signal is received. Reducing the wake-up latency is important because it can affect the overall performance of the VLSI circuit. Another important issue in power gating is minimizing the energy wasted during mode transitions, i.e., while switching from active to sleep mode and vice versa. Both the virtual ground and virtual VDD nodes experience a voltage change during a mode transition. Since a considerable number of cells are connected to these nodes, their total switching capacitance is large, and as a result the switching power consumed during a mode transition can be significant. Sleep transistor sizing is an important issue in designing MTCMOS circuits. The charge recycling technique has recently been proposed to reduce the energy consumed during mode transitions of MTCMOS circuits; it has been shown that applying this technique can save up to 46% of the switching energy due to mode transitions.
The MTCMOS circuit scheme is a very efficient low-power, high-performance circuit technique that employs high-Vth transistors to switch the power supplies to the low-Vth logic blocks on and off.

3.3 DIMENSIONING OF MTCMOS CIRCUITS

The MTCMOS technique is a well-known way to combine high switching speed with low standby current, by using low-Vt transistors for the logic part and high-Vt transistors for the so-called sleep transistors. However, a practical analytic formula for correctly dimensioning the sleep transistor for a demanded performance has not been provided. MTCMOS circuits can be simplified by using only NMOS sleep transistors (see Fig 3.4). These transistors are in their linear mode when the circuit is active. The logic transistors, however, work in the saturation region. Since the current through the logic and the sleep transistor must be identical, the following equation describes the resulting ground shift VH due to the sleep transistor.

Fig 3.4: Modified MTCMOS design with NMOS sleep control transistor

The factor q (q>1) is used to describe the specified delay time factor of the MTCMOS circuit in comparison to a standard CMOS configuration. With the help of Equation (1) it is possible to calculate the necessary width WH of a sleep transistor, with WL as the accumulated width of all low-Vt logic transistors that are controlled by the sleep transistor.

A drawback of the common MTCMOS technique is the floating of nodes in the circuit. To prevent data from being lost, circuitry must be added to each flip-flop.

3.4 MTCMOS APPROACHES HAVE THREE SHORTCOMINGS

First, process modifications are required to support the high VTH of the sleep MOSFET. Second, when a circuit goes into the sleep state, it takes a non-negligible amount of time to wake up and re-activate, because the large sleep transistor must be switched on and must first discharge the virtual ground capacitance, which is slow. Third, gates inside the sleep region may interface with gates outside it. This means that the outputs of inactive gates (gates inside the sleep region) can float at intermediate voltages, causing large short-circuit currents in the active gates they drive.


In recent years, technology scaling has increased the role of leakage power in the overall power consumption of circuits. Supply voltage reduction is a widely accepted methodology for reducing dynamic power, but it has an adverse effect on circuit performance. To maintain high performance, the threshold voltage Vt must also be scaled down, which causes an exponential increase in sub-threshold leakage currents. This problem is more severe in deep-submicron technologies. In applications with long standby times, this high sub-threshold leakage can be detrimental to the overall power consumption of the circuit. Multi-threshold CMOS has emerged as an effective technique for reducing sub-threshold currents in the standby mode while maintaining circuit performance. MTCMOS technology essentially places a sleep transistor on gates and puts them in sleep mode when the circuit is non-operational.

State-of-the-art techniques for leakage optimization using MTCMOS essentially assign a sleep transistor to each gate and size the sleep transistors so that all gates have a fixed slowdown. This is followed by a clustering step that groups gates with mutually exclusive switching patterns, which reduces the overall area penalty of the MTCMOS transistors. There are several problems with this approach. First, the traditional approach sizes the sleep transistors so that all gates have the same slowdown; it does not investigate the possibility of slowing down non-critical gates more than critical gates to achieve better leakage improvements. Second, it has been shown that clustering MTCMOS gates has adverse effects on signal integrity due to ground bounce.

In this work we address these issues by developing a fine-grained methodology for MTCMOS-based leakage optimization. First, we assign sleep transistors selectively to gates so that the overall slack can be utilized effectively. Moreover, we do not perform clustering, so the signal integrity issues are not critical in our approach.
As shown in Fig 3.5(a), low-Vt logic modules or gates are connected to the virtual supply rails through high-Vt sleep transistors, which behave like linear resistors in the active mode, as shown in Fig 3.5(b). The high-threshold sleep transistor is controlled by the Sleep signal and limits the leakage current to a low value in the standby mode. The load-dependent delay di of a gate i in the absence of a sleep transistor can be expressed as


where CL is the load capacitance at the gate output, VtL is the low threshold voltage (350 mV), Vdd = 1.8 V and α is the velocity saturation index (α ≈ 1.3 in 0.18-µm CMOS technology). In the presence of a sleep transistor, the propagation delay of a gate can be expressed as

where Vx is the potential of the virtual rails, as shown in Fig 3.5, and K is the proportionality constant. Let Isleep,ON be the current flowing in the gate during the active mode of operation. During this mode, the sleep transistor is in the linear region of operation. Using the basic device equations for a transistor in the linear region, the drain-to-source current in the sleep transistor (which is the same as Isleep,ON) is given by

The sub-threshold leakage current Ileak in the sleep mode is determined by the sleep transistor and is given by

where µn is the electron (N) mobility, Cox is the oxide capacitance, Vth is the high threshold voltage (500 mV), VT is the thermal voltage (26 mV) and n is the sub-threshold swing parameter. Equation 2 establishes a relation between the delay of a gate, d_i^sleep, and Vx. By replacing Vx in equation 4 in terms of d_i^sleep (using equation 2), we get a dependence between (W/L)sleep and d_i^sleep (assuming the ON current is constant for each gate). Thus, a range of (W/L)sleep for the sleep transistor corresponds to a range of gate delays. Finally, (W/L)sleep in equation 5 can be replaced in terms of d_i^sleep, establishing a relationship between gate delay and gate leakage. The final relation between leakage and delay can be expressed as

This relationship exists for only those gates that have a sleep transistor assigned to them. Note that the moment a sleep transistor is assigned, some delay penalty is incurred. The range of delay that a gate can have is decided by the range of the acceptable (W/L) sleep. The objective of sleep transistor sizing is to decide the best values of (W/L) sleep for all sleep transistors such that the global delay constraint is satisfied and the total leakage is minimized.
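The equations referred to in this derivation are not reproduced in this copy of the report. As a hedged reconstruction, and assuming the standard alpha-power-law delay model and the textbook linear-region and sub-threshold MOSFET current expressions, forms consistent with the symbols defined in the text (CL, Vdd, VtL, α, K, Vx, µn, Cox, (W/L)sleep, Vth, VT, n) would be as follows. The exact constants and numbering used by the original authors may differ, and the leakage expression is given only up to a model-dependent constant factor.

```latex
\begin{align}
  d_i &\propto \frac{C_L\, V_{dd}}{\left(V_{dd} - V_{tL}\right)^{\alpha}}
      && \text{(gate delay without a sleep transistor)} \\
  d_i^{sleep} &= \frac{K\, C_L\, V_{dd}}{\left(V_{dd} - V_x - V_{tL}\right)^{\alpha}}
      && \text{(gate delay with a sleep transistor)} \\
  I_{sleep,ON} &= \mu_n C_{ox} \left(\frac{W}{L}\right)_{sleep}
      \left[\left(V_{dd} - V_{th}\right) V_x - \frac{V_x^{2}}{2}\right]
      && \text{(sleep transistor in the linear region)} \\
  I_{leak} &\propto \mu_n C_{ox} \left(\frac{W}{L}\right)_{sleep} V_T^{2}\,
      \exp\!\left(-\frac{V_{th}}{n\, V_T}\right)
      && \text{(sub-threshold leakage in sleep mode)}
\end{align}
```

Under these assumed forms, a larger (W/L)sleep lowers Vx and hence the delay penalty, but raises the leakage in proportion, which is the trade-off that sleep transistor sizing has to resolve.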


Fig 3.5 : Sleep Transistors in MTCMOS circuits
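The delay-versus-leakage trade-off in sleep transistor sizing can be explored numerically. The sketch below is not the authors' method: it assumes the alpha-power-law delay model, a linear-region current equation for the sleep transistor, and a simplified sub-threshold leakage expression. Vdd, VtL, Vth, α and VT are taken from the text; the values of µn·Cox, the swing parameter n and the ON current are assumptions chosen only for illustration.

```python
import math

# Values quoted in the text
VDD = 1.8          # supply voltage (V)
VT_LOW = 0.35      # low threshold voltage of the logic gates (V)
VT_HIGH = 0.50     # high threshold voltage of the sleep transistor (V)
ALPHA = 1.3        # velocity saturation index
V_THERMAL = 0.026  # thermal voltage (V)

# Assumed values, for illustration only
MU_N_COX = 300e-6  # mobility * oxide capacitance (A/V^2), hypothetical
N_SWING = 1.5      # sub-threshold swing parameter, hypothetical


def virtual_rail_voltage(i_on, w_over_l):
    """Solve I = k * [(Vdd - Vth) * Vx - Vx^2 / 2] for the small (linear-region) root Vx."""
    k = MU_N_COX * w_over_l
    overdrive = VDD - VT_HIGH
    discriminant = overdrive ** 2 - 2.0 * i_on / k
    if discriminant < 0:
        raise ValueError("sleep transistor too small to carry this current")
    return overdrive - math.sqrt(discriminant)


def delay_penalty(vx):
    """Ratio of gate delay with a sleep transistor to gate delay without one."""
    return ((VDD - VT_LOW) / (VDD - vx - VT_LOW)) ** ALPHA


def sleep_leakage(w_over_l):
    """Simplified sub-threshold leakage of the sleep transistor in standby (A)."""
    return (MU_N_COX * w_over_l * V_THERMAL ** 2
            * math.exp(-VT_HIGH / (N_SWING * V_THERMAL)))


if __name__ == "__main__":
    i_on = 100e-6  # assumed ON current of the gate (A)
    for w_over_l in (5, 10, 20, 40):
        vx = virtual_rail_voltage(i_on, w_over_l)
        print(f"W/L={w_over_l:3d}  Vx={vx * 1e3:6.1f} mV  "
              f"delay x{delay_penalty(vx):.3f}  leak={sleep_leakage(w_over_l):.2e} A")
```

Sweeping (W/L)sleep in this way shows the expected behaviour: wider sleep transistors reduce the delay penalty but leak proportionally more in standby.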


1. Low Vth Transistors (lvt)
The low-Vth transistor type is the fastest available flavor in the STM 90 nm general-purpose technology, and is used for applications where speed is of primary importance. The disadvantage of this type of transistor is that, due to the low threshold voltage (Vth), the static power is very high.

2. Standard Vth Transistors (svt)
The standard-Vth transistor type is an all-purpose flavor in which delay and static power have been traded off to match typical design requirements. The procedure used to characterize this technology variation is exactly the same as the one used for lvt.

3. High Vth Transistors (hvt)
The high-Vth transistor type is a flavor especially optimized for extremely low static power consumption. Typical applications for this technology variation are circuits that are idle most of the time and/or where speed and performance are not of utmost importance. The procedure used to characterize this technology variation is exactly the same as the one used for lvt.



- Performance can be improved and the leakage current minimized.
- Sub-threshold leakage current is reduced by the sleep transistor while performance loss is controlled.


MTCMOS has a serious problem: the data stored in the latches and flip-flops of the logic blocks cannot be preserved when the power supply is turned off (sleep mode). Therefore, extra circuits and complex timing design must be provided to hold the stored data. These impose significant penalties on the performance, power and area of the system.


Placing a high-Vth PMOS transistor between Vdd and the logic block results in the MTCMOS design with a PMOS sleep control transistor, as shown in Fig 3.6.

Fig 3.6 : Modified MTCMOS design with PMOS sleep control transistor



The project involves the implementation of an SRAM using the MTCMOS technique in the Cadence Virtuoso Analog Design Environment.



Fig 4.1: Design Flow of SRAM

In CMOS (complementary MOS) logic, only the two complementary MOSFET transistors, n-channel (also known as NMOS) and p-channel (also known as PMOS), are used to create the circuit. The logic symbols for the NMOS and PMOS transistors are shown in Figure (a) and Figure (b), respectively. In designing CMOS circuits, we are interested only in the three connections of the transistor: source, drain, and gate. The substrate for the NMOS is always connected to ground, while the substrate for the PMOS is always connected to VCC. Notice that the only difference between these two logic symbols is that one has a circle at the gate input, while the other does not. By the convention that a circle denotes active-low (i.e., a 0 activates the signal), the PMOS gate input is active-low and the NMOS gate input is active-high.

The operation of the NMOS transistor is as follows. When the input at the gate is a 1, the NMOS transistor is turned on or enabled, and the source input that is supplying the 0 can pass through to the drain output through the connecting n-channel. However, if the source has a 1, the 1 will not pass through to the drain at full strength even if the transistor is turned on, because the NMOS does not create a p-channel; only a weak 1 will pass through to the drain. On the other hand, when the gate is a 0 (or any value other than a 1), the transistor is turned off, and the connection between the source and the drain is broken. In this case, the drain will always have a high-impedance Z value independent of the source value. The don't-care entry in the Input Signal column means that it does not matter what the input value is; the output will be Z. The high-impedance value, denoted by Z, means no value or no output. This is like having an insulator with infinite resistance or a break in a wire: whatever the input is, it will not pass over to the output.

Fig 4.2 :NMOS symbol

Fig 4.3:Truth Table

The PMOS transistor works exactly the opposite of the NMOS transistor. The operation of the PMOS transistor is as follows. When the input at gate is a 0, the PMOS transistor is turned on or enabled, and the source input that is supplying the 1 can pass through to the drain output through the connecting p-channel. However, if the source has a 0, the 0 will not pass through to the drain even if the transistor is turned on, because the PMOS does not create an n-channel. Instead, only a weak 0 will pass through to the drain. On the other hand, when the gate is a 1 (or any value other than a 0), the transistor is turned off, and the connection between the source and the drain is disconnected. In this case, the drain will always have a high-impedance Z value independent of the source value.



Fig 4.4: PMOS Transistor a)symbol b) Truth table
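The behaviour of the two transistors described above can be summarised in a small switch-level model. The following is only an illustrative Python sketch: the function names `nmos` and `pmos` and the string labels `"Z"`, `"weak0"` and `"weak1"` are assumptions made for this example and do not come from the design itself.

```python
# Switch-level sketch of the NMOS and PMOS truth tables.
# Names and string labels are hypothetical, chosen only for illustration.

Z = "Z"  # high impedance: the drain is not driven at all


def nmos(gate, source):
    """NMOS: conducts only when the gate is 1; passes 0 well and 1 weakly."""
    if gate != 1:
        return Z  # transistor off, source value is a don't-care
    return 0 if source == 0 else "weak1"


def pmos(gate, source):
    """PMOS: conducts only when the gate is 0; passes 1 well and 0 weakly."""
    if gate != 0:
        return Z  # transistor off, source value is a don't-care
    return 1 if source == 1 else "weak0"


if __name__ == "__main__":
    for gate in (0, 1):
        for source in (0, 1):
            print(f"gate={gate} source={source} "
                  f"NMOS drain={nmos(gate, source)} "
                  f"PMOS drain={pmos(gate, source)}")
```

Running the script prints all four gate/source combinations, which can be compared line by line with the truth tables in Fig 4.3 and Fig 4.4.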


When the gate input is a 1, the bottom NMOS transistor is turned on while the top PMOS transistor is turned off. With this configuration, a 0 from ground will pass through the bottom NMOS transistor to the output while the top PMOS transistor will output a high-impedance Z value. A Z combined with a 0 is still a 0, because a high-impedance output contributes no value. Alternatively, when the gate input is a 0, the bottom NMOS transistor is turned off while the top PMOS transistor is turned on. In this case, a 1 from VCC will pass through the top PMOS transistor to the output while the bottom NMOS transistor will output a Z. The resulting output value is a 1.

Fig 4.5 INVERTER (a) circuit (b)truth table


Fig 4.6 Switch model for INVERTER (a) low input; (b) high input
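The inverter behaviour described above (one transistor drives the output while the other contributes Z) can be sketched in Python as follows. This is a simplified illustration, not the simulated circuit: the helper names `nmos`, `pmos`, `resolve` and `inverter` are hypothetical, and weak values are ignored because the PMOS source is tied to VCC and the NMOS source to ground.

```python
# Switch-level sketch of a CMOS inverter. All names are hypothetical.

Z = "Z"  # high impedance


def nmos(gate, source):
    """Pass the source value when the gate is 1, otherwise output Z."""
    return source if gate == 1 else Z


def pmos(gate, source):
    """Pass the source value when the gate is 0, otherwise output Z."""
    return source if gate == 0 else Z


def resolve(a, b):
    """Combine two drains tied to the same node; a Z contributes nothing."""
    if a == Z:
        return b
    if b == Z:
        return a
    if a != b:
        raise ValueError("contention: both transistors drive the node")
    return a


def inverter(x):
    pull_up = pmos(x, 1)    # top PMOS, source connected to VCC (logic 1)
    pull_down = nmos(x, 0)  # bottom NMOS, source connected to ground (logic 0)
    return resolve(pull_up, pull_down)


if __name__ == "__main__":
    for x in (0, 1):
        print(f"input={x} output={inverter(x)}")
```

The `resolve` helper encodes the statement that a Z combined with a 0 is still a 0, and likewise for a 1.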


If either input is LOW, the output Z has a low-impedance connection to VDD through the corresponding on p-channel transistor, and the path to ground is blocked by the corresponding off n-channel transistor. If both inputs are HIGH, the path to VDD is blocked, and Z has a low-impedance connection to ground.


Fig 4.7 NAND GATE (a)circuit (b) truth table (c) symbol


Fig 4.8 : Switch model for 2 input NAND gate (a) both inputs low; (b) one input high; (c) both inputs high
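The switch model of the 2-input NAND gate can be expressed as two conduction paths: the p-channel transistors sit in parallel between VDD and the output, and the n-channel transistors sit in series between the output and ground. The sketch below is an illustrative Python rendering of that idea; the function name `nand2` and its variable names are assumptions made for this example.

```python
# Switch-model sketch of a 2-input CMOS NAND gate (hypothetical names).


def nand2(a, b):
    """Return the NAND of two logic inputs given as 0 or 1."""
    # Parallel p-channel transistors: any LOW input connects the output to VDD.
    path_to_vdd = (a == 0) or (b == 0)
    # Series n-channel transistors: both inputs must be HIGH to reach ground.
    path_to_ground = (a == 1) and (b == 1)
    # For valid logic inputs exactly one of the two paths conducts.
    if path_to_vdd == path_to_ground:
        raise ValueError("inputs must be 0 or 1")
    return 1 if path_to_vdd else 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"a={a} b={b} output={nand2(a, b)}")
```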


When the E input is asserted, the Q output follows the D input. In this situation, the latch is said to be open and the path from D input to Q output is transparent; the circuit is often called a transparent latch for this reason. When the E input is negated, the latch closes; the Q output retains its last value and no longer changes in response to D, as long as E remains negated.

Fig 4.9: D-Latch with enable
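The open and closed behaviour of the latch can be illustrated with a minimal behavioural model. This Python sketch assumes a class name `DLatch` and an initial stored value of 0; both are choices made for the example only, since the state of a real latch before its first write is not defined by the text.

```python
# Behavioural sketch of a D latch with enable (hypothetical class name).


class DLatch:
    def __init__(self, initial=0):
        # Assumption: the latch starts at a known value for demonstration.
        self.q = initial

    def update(self, d, e):
        """Apply inputs D and E and return the resulting Q output."""
        if e == 1:
            self.q = d  # latch open: Q follows D (transparent)
        # latch closed (E = 0): Q keeps its last value
        return self.q


if __name__ == "__main__":
    latch = DLatch()
    for d, e in [(1, 1), (0, 0), (1, 0), (0, 1)]:
        print(f"D={d} E={e} Q={latch.update(d, e)}")
```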

4.6 TRI-STATE BUFFER

A tri-state buffer, as the name suggests, has three states: 0, 1, and a third state denoted by Z. The value Z represents a high-impedance state, which acts like a switch that is opened or a wire that is cut. Tri-state buffers are used to connect several devices to the same bus. A bus is one or more wires for transferring signals. If two or more devices are connected directly to a bus without using tri-state buffers, signals will get corrupted on the bus because the devices are always outputting either a 0 or a 1. However, with a tri-state buffer in between, devices that are not using the bus can disable the tri-state buffer so that it acts as if those devices are physically disconnected from the bus. At any one time, only one active device will have its tri-state buffers enabled and thus use the bus.

The active-high enable line E turns the buffer on or off. When E is de-asserted with a 0, the tri-state buffer is disabled, and the output y is in its high-impedance Z state. When E is asserted with a 1, the buffer is enabled, and the output y follows the input d.

The truth table is derived as follows. When E = 0, it does not matter what the input d is; we want both transistors to be disabled so that the output y has the Z value. The PMOS transistor is disabled when the input A = 1, whereas the NMOS transistor is disabled when the input B = 0. When E = 1 and d = 0, the output y is 0. To get a 0 on y, we need to enable the bottom NMOS transistor and disable the top PMOS transistor so that a 0 will pass through the NMOS transistor to y. When E = 1 and d = 1, the output y is 1. Here we need to do the reverse by enabling the top PMOS transistor and disabling the bottom NMOS transistor.

Fig 4.10: Tri-state buffer: (a) truth table; (b) logic symbol; (c) circuit; (d) truth table for the control portion of the tri-state buffer circuit
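The tri-state buffer and its control portion can be sketched in Python as shown below. This is a behavioural illustration under stated assumptions: the function names `tri_state`, `control` and `bus` are hypothetical, and the bus helper simply rejects the case where more than one device drives the bus rather than modelling the electrical conflict.

```python
# Behavioural sketch of a tri-state buffer and a shared bus.
# Function names are hypothetical and chosen only for illustration.

Z = "Z"  # high impedance


def tri_state(e, d):
    """Output y: follows d when E = 1, high impedance when E = 0."""
    return d if e == 1 else Z


def control(e, d):
    """Gate inputs (A for the PMOS, B for the NMOS) of the output stage."""
    if e == 0:
        return (1, 0)  # A = 1 disables the PMOS, B = 0 disables the NMOS
    if d == 0:
        return (1, 1)  # PMOS off, NMOS on: a 0 reaches y
    return (0, 0)      # PMOS on, NMOS off: a 1 reaches y


def bus(outputs):
    """Resolve the bus value from several tri-state buffer outputs."""
    driven = [value for value in outputs if value != Z]
    if len(driven) > 1:
        raise ValueError("more than one device is driving the bus")
    return driven[0] if driven else Z


if __name__ == "__main__":
    for e in (0, 1):
        for d in (0, 1):
            print(f"E={e} d={d} y={tri_state(e, d)} (A, B)={control(e, d)}")
    # Three devices share the bus; only the second one is enabled.
    print(bus([tri_state(0, 1), tri_state(1, 0), tri_state(0, 1)]))
```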



Each bit in a static RAM chip is stored in a memory cell similar to the circuit shown in Fig 4.11 (a). The main component in the cell is a D latch with enable. A tri-state buffer is connected to the output of the D latch so that it can be selectively read from. The Cell enable signal is used to enable the memory cell for both reading and writing. For reading, the Cell enable signal is used to enable the tri-state buffer. For writing, the Cell enable together with the Write enable signals are used to enable the D latch so that the data on the Input line is latched into the cell.

Fig 4.11: Memory cell (a) circuit; (b) logic symbol.
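A single memory cell, built from a D latch and a tri-state buffer as described above, can be modelled behaviourally as follows. The class name `MemoryCell`, the method name `step` and the initial stored value of 0 are assumptions made for this sketch.

```python
# Behavioural sketch of one SRAM memory cell: a D latch with enable
# followed by a tri-state buffer. All names are hypothetical.

Z = "Z"  # high impedance


class MemoryCell:
    def __init__(self):
        self.q = 0  # assumption: a known starting value for demonstration

    def step(self, data_in, cell_enable, write_enable):
        """Apply the inputs once and return the value on the Output line."""
        # Writing: Cell enable AND Write enable open the D latch.
        if cell_enable == 1 and write_enable == 1:
            self.q = data_in
        # Reading: Cell enable alone turns on the tri-state buffer.
        return self.q if cell_enable == 1 else Z


if __name__ == "__main__":
    cell = MemoryCell()
    print(cell.step(data_in=1, cell_enable=1, write_enable=1))  # write a 1
    print(cell.step(data_in=0, cell_enable=0, write_enable=1))  # not selected
    print(cell.step(data_in=0, cell_enable=1, write_enable=0))  # read back
```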


The write operation begins with a valid address on the address lines and valid data on the data lines, followed immediately by the CE line being asserted. As soon as the WR line is asserted, the data present on the data lines is written into the memory location that is addressed by the address lines. A memory read operation also begins with setting a valid address on the address lines, followed by CE going high. The WR line is then pulled low, and shortly after, valid data from the addressed memory location is available on the data lines.

Fig 4.12: Memory Timing Diagram (a) read operation (b) write operation.




In SRAM, there is a set of data lines, Di, and a set of address lines, Ai. The data lines serve for both input and output of the data to the location that is specified by the address lines. The number of data lines depends on how many bits are used for storing data in each memory location. The number of address lines depends on how many locations are in the memory chip. In addition to the data and address lines, there are usually two control lines: chip enable (CE) and write enable (WR). In order for a microprocessor to access memory, with either the read operation or the write operation, the active-high CE line must first be asserted. Asserting the CE line enables the entire memory chip. The active-high WR line selects which of the two memory operations is to be performed. Setting WR to a 0 selects the read operation, and data from the memory is retrieved. Setting WR to a 1 selects the write operation, and data from the microprocessor is written into the memory. Instead of having just the WR line for selecting the two operations, some memory chips have both a read enable and a write enable line. In this case, only one line can be asserted at any one time. The memory location in which the read and write operations take place is selected by the value of the address lines. The operation of the memory chip is shown in Fig 4.13(b).

Fig 4.13: A 2^n x m RAM chip: (a) logic symbol; (b) operation table.

Notice in Fig 4.13(a) that the RAM chip does not require a clock signal. Neither the read nor the write memory operation is synchronized to the global system clock. Instead, the data operations are synchronized to the two control lines, CE and WR. To create an 8X8 static RAM chip, we need 64 memory cells forming an 8X8 grid, as shown in Fig 4.14. Each row forms a single storage location, and the number of memory cells in a row determines the bit width of each location. So all of the memory cells in a row are enabled with the same address. A decoder is used to decode the address lines A0, A1, A2. In this example, a 3-to-8 decoder is used to decode the eight address locations. The CE signal is for enabling the chip, specifically to enable the read and write functions through the two AND gates. The data comes in from the external data bus, Di, through the input buffer and to the Input line of each memory cell. The purpose of using an input buffer for each data line is so that the external signal coming in needs to drive only one device (the buffer) rather than several devices (i.e., all of the memory cells in the same column). Which row of memory cells actually gets written to depends on the given address. The read operation requires CE to be asserted and WR to be de-asserted. This asserts the internal RE signal, which in turn enables the eight output tri-state buffers at the bottom of the circuit diagram. Again, the location that is read from is selected by the address lines.

Fig 4.14: An 8X8 SRAM chip circuit.
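The organisation of the 8X8 SRAM chip (a 3-to-8 decoder selecting one row, with CE and WR gated together for reading and writing) can be sketched behaviourally in Python. This is an illustration under stated assumptions, not the simulated circuit: the names `decode_3to8`, `SRAM8x8` and `access` are hypothetical, A0 is taken as the least significant address bit, and all cells are assumed to start at 0.

```python
# Behavioural sketch of an 8X8 static RAM chip (hypothetical names).

Z = "Z"  # high impedance: output tri-state buffers disabled


def decode_3to8(a2, a1, a0):
    """3-to-8 decoder: eight row-select lines with exactly one set to 1."""
    index = (a2 << 2) | (a1 << 1) | a0  # assumption: A0 is the LSB
    return [1 if row == index else 0 for row in range(8)]


class SRAM8x8:
    def __init__(self):
        # 8 rows (locations) of 8 cells (bits); assumed to start at 0.
        self.cells = [[0] * 8 for _ in range(8)]

    def access(self, ce, wr, address_bits, data_in=None):
        """Perform one memory operation and return the eight output lines."""
        select = decode_3to8(*address_bits)
        write_enable = ce & wr       # AND gate for the write function
        read_enable = ce & (1 - wr)  # AND gate for the internal RE signal
        row = select.index(1)
        if write_enable:
            if data_in is None or len(data_in) != 8:
                raise ValueError("a write needs eight data bits")
            self.cells[row] = list(data_in)
        if read_enable:
            return list(self.cells[row])
        return [Z] * 8  # chip disabled or writing: outputs not driven


if __name__ == "__main__":
    ram = SRAM8x8()
    ram.access(ce=1, wr=1, address_bits=(1, 0, 1),
               data_in=[1, 0, 1, 1, 0, 0, 1, 0])
    print(ram.access(ce=1, wr=0, address_bits=(1, 0, 1)))
    print(ram.access(ce=0, wr=0, address_bits=(1, 0, 1)))
```

The second call reads back the word written to location 5, and the third shows that the outputs are not driven when CE is de-asserted.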





Fig 5.1: Output waveforms of SRAM without MTCMOS






Fig 5.2: Output waveforms of SRAM with MTCMOS






1          2          3          4          5          6          7
117.5 nW   248.9 nW   1.507 uW   1.216 uW   9.47 uW    1.559 uW   97.92 uW
71.53 nW   98.12 nW   0.6268 uW  0.801 uW   2.113 uW   0.368 uW   13.11 uW


Table 5.1 Comparison of power consumed with and without MTCMOS
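The relative power reduction for each of the seven entries in Table 5.1 can be worked out with a short script. The sketch below assumes that the first row of power values is the design without MTCMOS and the second row is the design with MTCMOS; the table as extracted carries no row labels, so this ordering is an assumption. The helper names `to_watts` and `percent_reduction` are hypothetical.

```python
# Sketch: percentage power reduction per entry of Table 5.1.
# Assumption: first row = without MTCMOS, second row = with MTCMOS.

UNITS = {"nW": 1e-9, "uW": 1e-6}

first_row = ["117.5nW", "248.9nW", "1.507uW", "1.216uW",
             "9.47uW", "1.559uW", "97.92uW"]
second_row = ["71.53nW", "98.12nW", "0.6268uW", "0.801uW",
              "2.113uW", "0.368uW", "13.11uW"]


def to_watts(text):
    """Convert a string such as '1.507uW' to a value in watts."""
    value, unit = text[:-2], text[-2:]
    return float(value) * UNITS[unit]


def percent_reduction(before, after):
    """Reduction of 'after' relative to 'before', in percent."""
    return 100.0 * (before - after) / before


reductions = [percent_reduction(to_watts(b), to_watts(a))
              for b, a in zip(first_row, second_row)]

if __name__ == "__main__":
    for number, value in enumerate(reductions, start=1):
        print(f"entry {number}: {value:.1f} % lower")
```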




A.1 Cadence: Virtuoso Analog Design Environment

Cadence is an Electronic Design Automation (EDA) environment that integrates different applications and tools into a single framework, so that all stages of IC design and verification can be carried out from a single environment. These tools are completely general and support different fabrication technologies.

Fig A.1 :Cadence design flow

A.2 Various Design steps

First, a schematic view of the circuit is created using the Cadence Composer Schematic Editor. Alternatively, a text netlist input can be employed. Then, the circuit is simulated using the Cadence Affirma analog simulation environment. Different simulators can be employed, some sold with the Cadence software (e.g., Spectre) and some from other vendors (e.g., HSPICE), if they are installed and licensed.

1. Invoking Cadence tool
The Command Interpreter Window (CIW) can be invoked by typing icfb90. The tool is available on vlsi34, vlsi35, vlsi36 and vlsi27. The following window appears on the screen when the command is invoked.

Fig A.2 Log Window

2. Create Library
To create the library, go to Tools >Library Manager on the Tools menu of the CIW.

Fig A.3 Library window

To create a new library, go to File >New >Library from the File menu of the Library Manager.


Fig A.4 Library Creation window

3. Create Schematic
Start by clicking on the library you created in the Library Manager window, then go to File >New >Cell View and fill in Inverter (in this case) as the cell name, schematic as the view name, and Composer Schematic as the tool, then press OK.

Fig A.5 File Creation window


An empty schematic window appears, as shown in Fig A.6.

Fig A.6 Schematic Window

Now place the instances, add the I/O pins and add the wires. Then Check and Save the design (either with the top left button or with Design >Check and Save). Look at the CIW and make sure there are no errors or warnings; if there are any, go back and fix them. Assuming there are no errors, the design is ready for simulation.


4. Simulation
In the Virtuoso Schematic window go to Tools >Analog Environment. The design should be set to the right Library, Cell and View.

Fig A.7 Simulation window

5. Choosing the Analyses

In the Affirma Analog Circuit Design Environment window, click Analyses >Choose in the pull-down menu to open the analyses window. Several analysis modes can be set up.


6. Transient Analysis
In the Analysis section, select the transient analysis and set the Stop Time. Click the APPLY button before clicking the OK button.

Fig A.8 Analysis Window

7. Saving and Plotting Simulation Data
Select Outputs >To Be Plotted >Select On Schematic to choose the nodes to be plotted. Click on a wire in the schematic window to select a voltage node, and click on a terminal to select a current. Select the input and output wires in the circuit. Observe the simulation window as the wires get added.

8. Run the Simulation
Click on the Run Simulation icon. When the simulation completes, the plots are shown automatically.



Fig B.1 : Inverter without MTCMOS


Fig B.2 :Inverter with MTCMOS



Fig B.3 :SRAM without MTCMOS


Fig B.4 :SRAM with MTCMOS





Fig B.5 : Power Calculation window without MTCMOS



Fig B.6 :Power Calculation window with MTCMOS