**Closing the Power Gap between ASIC and Custom**

David Chinnery, Kurt Keutzer

**Outline**

- Motivation for focusing on reducing ASIC power
- The power gap between ASIC and custom
- Where does the power go?
- What can we do about it?
- Conclusions on automating low power techniques

**Why power?**

- Battery life is limited by power (e.g. laptop, mobile phone)
- Costs for packaging and cooling increase rapidly with power dissipation (e.g. plastic vs. ceramic package, heatsink, fan)
- Higher temperatures degrade performance and reliability
  - Circuits are slower, with more leakage, at higher temperature
  - Less reliable due to increased rate of electromigration
- Increasing integration increases power demand in portable applications (e.g. mp3 player/PDA/mobile phone combined)
- Performance is now limited by power even for high end microprocessors

**Power of high performance chips has increased**

- As device dimensions (W, L, Tox) are scaled down by a factor k for high performance:
  - If supply Vdd and threshold voltage Vth are fixed, then power/unit area ∝ k^3
  - If Vdd and Vth are scaled down linearly, then power/unit area ∝ k^0.7
- Further voltage scaling may be limited

[Figure: power/unit area (W/cm2) vs. scaling factor k (1/um) for microprocessors and digital signal processors; data from ISSCC chips 1982-2002. Kuroda, OYO BUTURI 2004]

**Impact of voltage scaling on power**

- Major components of power: Ptotal = Pdynamic + Pleakage
- Dynamic power is due to switching of capacitances
  - Reducing Vdd gives quadratic reduction in Pdynamic
  - But transistor drive current depends on Vdd − Vth [Chen in Trans. on Electron Devices 1997]
  - Must reduce Vth to maintain drive current
- But reducing Vth increases subthreshold leakage current, which is the major contributor to Pleakage
- Must look for other ways to reduce power

[Figure: CMOS inverter driving Cload, annotated with Vdd, Vth,p, Vth,n, dynamic power, and subthreshold leakage]
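As a rough illustration of the quadratic dependence noted on this slide, the sketch below uses the standard switching-power expression P = a·C·Vdd²·f. The function name, the symbol names, and every numeric value (activity 0.1, 1 nF of switched capacitance, 500 MHz, 1.2 V vs. 0.9 V) are assumptions chosen for illustration, not figures from the slides. Leakage and the loss of drive current at lower Vdd are deliberately not modeled.

```python
def dynamic_power_w(activity, cap_farads, vdd_volts, freq_hz):
    """Switching power in watts: activity * C * Vdd^2 * f."""
    return activity * cap_farads * vdd_volts ** 2 * freq_hz


# Hypothetical operating points; only Vdd differs between them.
p_high = dynamic_power_w(0.1, 1e-9, 1.2, 500e6)
p_low = dynamic_power_w(0.1, 1e-9, 0.9, 500e6)

# The ratio depends only on the voltages: (1.2 / 0.9) ** 2.
reduction = p_high / p_low
print(f"dynamic power reduction: x{reduction:.2f}")
```

In practice the lower Vdd also slows the gates, which is why the slide pairs Vdd reduction with a lower Vth and the leakage penalty that follows.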

**Automate low power techniques**

- Custom designers can try to optimize the design at all levels
- Electronic design automation (EDA) tools for ASICs
  - Most of the design optimization is high level
  - Fast time-to-market and lower design cost
  - Increasingly important to reduce design cost for larger chips
- What is the power gap between (automated) ASIC design and custom design?
  - We need to characterize the contributing factors
  - Can we close the power gap?
  - Identify custom techniques that can be used in an EDA flow


**What is our metric for power?**

- Power
  - Fixed performance constraint (clock frequency or throughput, e.g. 30 frames/s for MPEG2)
  - Reduce the power and meet the performance constraint
- Energy efficiency
  - No performance constraint
  - Throughput/unit power (1/(P×T×CPI)), e.g. MIPS/mW
  - Cycles per instruction (CPI) accounts for impact of architectural choices (e.g. stalled pipeline stages)
  - Energy/operation is the inverse of throughput/unit power
  - Maximize throughput/unit power or minimize energy/operation
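The relation between the two energy-efficiency metrics can be sketched as follows, taking P as power, T as the clock period, and CPI as cycles per instruction. The function names, the choice of units, and the sample operating point (100 mW, 5 ns clock period, CPI of 1.25) are hypothetical and only serve to show that MIPS/mW is the reciprocal of energy per instruction, up to a unit factor.

```python
def energy_per_instruction_pj(power_mw, period_ns, cpi):
    """Energy per instruction in picojoules: P * T * CPI (mW * ns = pJ)."""
    return power_mw * period_ns * cpi


def mips_per_mw(power_mw, period_ns, cpi):
    """Throughput per unit power: 1 / (P * T * CPI), expressed in MIPS/mW."""
    mips = 1e3 / (period_ns * cpi)  # 1e9 / (T_ns * CPI) instr/s, scaled to millions
    return mips / power_mw


# Hypothetical design point: 100 mW at 200 MHz with CPI = 1.25.
energy = energy_per_instruction_pj(100.0, 5.0, 1.25)
efficiency = mips_per_mw(100.0, 5.0, 1.25)
print(energy, efficiency, 1e3 / energy)
```

With these units, MIPS/mW equals 1000 divided by the energy per instruction in pJ, so maximizing one is the same as minimizing the other.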

**What is the power gap? ARM cores**

- ×2 to ×3 gap between custom and hard macro ARMs
- ×1.3 to ×1.4 gap between ARM7TDMI-S and ARM7TDMI
- ×3 to ×4 overall from synthesizable to custom ARMs

[Figure: Comparison of Custom and Hard Macro ARM Implementations; Dhrystone 2.1 MIPS/mW vs. process technology (um); custom designs shown include XScale, StrongARM, and Burd]

**What is the power gap? DCT/IDCT blocks**

- ×4 to ×7 gap for discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) blocks, after scaling linearly for technology [Fanucci ICECS 2002]
  - We assumed power reduces linearly with technology
- To get 30 frame/s MPEG2 with a general purpose processor would require two ARM9 cores and would consume 15× power [Fanucci ICECS 2002]
  - Application-specific hardware substantially reduces power


**Breakdown of power by functionality**

Typical breakdown of on-chip power consumption for an embedded microprocessor:

- Clock: 20% to 40%
- Memory: 20% to 40%
- Control + datapath: 40% to 60%
- Input/output to off-chip: ~5%

Most of the power is in datapath, control, clock tree and memory.

- Techniques focus on reducing this power
- Several companies provide custom memory for ASIC processes, so we won't discuss memory here

**Summary of factors' effect on active power**

Automated designs are higher power than custom because of:

| Factor | Typical ASIC design quality | Excellent ASIC design quality |
|---|---|---|
| Microarchitecture (pipelining, parallelism) | ×2.6 | ×1.3 |
| Clock gating and power gating | ×1.6 | ×1.0 |
| Logic design | ×1.2 | ×1.0 |
| High speed logic styles (DCVSL, PTL, domino) | ×1.3 | ×1.3 |
| Technology mapping | ×1.4 | ×1.0 |
| Cell sizing and wire sizing | ×1.6 | ×1.1 |
| Voltage scaling, multi-Vth, multi-Vdd | ×4.0 | ×1.0 |
| Floorplanning and placement | ×1.5 | ×1.1 |
| Process variation and process technology | ×2.6 | ×1.2 |


**Microarchitecture: leverage for voltage scaling and sizing**

- Increase throughput/cycle to allow Vdd reduction
- Pipelining inserts registers, increasing throughput. Limited by:
  - Reduction in instructions/cycle (1/CPI) due to branch misprediction, waiting to read or write memory, etc.
  - Power and delay for registers, data forwarding logic, and branch prediction
- Parallelism increases throughput in exchange for increased area. Limited by:
  - Routing, multiplexing, control overheads

[Figure: instruction fetch, instruction decode, ALU, memory access, and write back stages, before and after inserting pipeline registers]

**Microarchitecture: pipelining model (leverage for voltage scaling and sizing)**

- Pipeline power model [Hartstein 2003]:
  - n stages, same tcombinational of 175 FO4
  - CPI penalty 0.025/stage for custom, and 0.05/stage for ASICs
  - L=1.1 latch growth vs. n, F=0.05 for register power
- Minimum stage delay:
  - ASIC tpipelining overhead of 10 FO4 (register delay) + 10 FO4 (imbalance)
  - Custom tpipelining overhead of 2.6 FO4 total
- Add fits for dynamic and leakage power with voltage scaling and sizing
- At a 40 FO4 delay constraint (500MHz for Leff=0.1um), ASIC is ×2.6 worse: 1/(energy/operation) of 0.050 for custom vs. 0.019 for ASIC
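To make the effect of pipelining overhead concrete, the sketch below computes how many stages each style needs to meet the 40 FO4 constraint, using the 175 FO4 combinational delay and the per-stage overheads quoted on this slide (20 FO4 for ASIC, 2.6 FO4 for custom). It is not the cited pipeline power model: the function names are hypothetical, the stage delay is assumed to be an even split of the logic plus overhead, and the linear form CPI = 1 + penalty × n is an assumption made only for illustration.

```python
import math

T_COMBINATIONAL_FO4 = 175.0  # total combinational delay, from the slide


def stage_delay_fo4(n_stages, overhead_fo4):
    """Delay of one stage, assuming the logic is split evenly across stages."""
    return T_COMBINATIONAL_FO4 / n_stages + overhead_fo4


def min_stages(target_fo4, overhead_fo4):
    """Fewest stages whose stage delay meets the target clock period."""
    return math.ceil(T_COMBINATIONAL_FO4 / (target_fo4 - overhead_fo4))


def time_per_instruction_fo4(n_stages, overhead_fo4, cpi_penalty_per_stage):
    """Average time per instruction under an ASSUMED linear CPI model."""
    cpi = 1.0 + cpi_penalty_per_stage * n_stages
    return cpi * stage_delay_fo4(n_stages, overhead_fo4)


asic_stages = min_stages(40.0, 20.0)    # 10 FO4 register + 10 FO4 imbalance
custom_stages = min_stages(40.0, 2.6)   # 2.6 FO4 total overhead
print(asic_stages, custom_stages)  # 9 5
print(time_per_instruction_fo4(asic_stages, 20.0, 0.05))
print(time_per_instruction_fo4(custom_stages, 2.6, 0.025))
```

Under these assumptions the ASIC needs nearly twice as many stages to reach the same clock period, so it pays more register power and a larger CPI penalty before any voltage scaling is considered.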

**Microarchitecture: leverage for voltage scaling and sizing (examples)**

- Custom IDCT pipelining to reduce Vdd [Xanthopoulos JSSC 99]
  - With pipeline: Vdd=1.32V, 20% power overhead
  - Without pipeline: Vdd=2.2V to meet throughput
  - ×2 power reduction
- Parallel datapaths [Bhavnagarwala IEEE Trans. VLSI 00]
  - ×2 to ×4 reduction in power by reducing Vdd, made possible by increasing throughput with parallel datapaths
- Microarchitecture speed gap is ×1.8 (typical) to ×1.3 (excellent)
- At a tight delay constraint, this corresponds to about ×2.6 to ×1.3 worse power due to higher Vdd, lower Vth, and wider gates to compensate
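The IDCT numbers on this slide can be sanity-checked with a back-of-envelope calculation. The sketch assumes that power at equal throughput scales with Vdd squared and that the 20% pipelining overhead simply multiplies the pipelined power. Both are simplifying assumptions, and the function name is hypothetical.

```python
def pipelined_power_ratio(vdd_unpipelined, vdd_pipelined, overhead_fraction):
    """Unpipelined power divided by pipelined power, assuming power ~ Vdd^2."""
    voltage_gain = (vdd_unpipelined / vdd_pipelined) ** 2
    return voltage_gain / (1.0 + overhead_fraction)


# Values quoted on the slide: 2.2 V without pipelining, 1.32 V with, 20% overhead.
ratio = pipelined_power_ratio(2.2, 1.32, 0.20)
print(f"estimated power reduction: x{ratio:.2f}")
```

The estimate comes out a little above ×2, in line with the roughly ×2 reduction quoted for the pipelined design.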


**Clock gating: ×1.6 to ×1.0**

- Clock signal has high activity; logic has lower activity, ~0.1
- Turn off clocks to inactive modules
- Some DCT/IDCT registers are active < 3% of the time; clock gating and avoiding computation reduces power by ×10 [August SOC 01]
- Typical savings are up to ×1.6 power reduction
- Power minimization tools automatically insert gated clocks
- Designer can make microarchitectural/algorithm decisions
  - E.g. reduce precision for DCT/IDCT coefficients
  - Precomputation of control signals reduces power by ×1.4 to ×3.3 [Hsu ISLPED 02]
- ASICs can do this

[Figure: add and shift units sharing a clock, before and after inserting clock gating controlled by select_add and select_shift]
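A simple way to see why clock gating savings saturate is an Amdahl-style estimate: only the gateable share of power shrinks, in proportion to how often the gated logic is actually active. This is an illustrative model, not the authors' analysis. The function name and the 40% gateable share (borrowed from the upper end of the clock share in the power breakdown) are assumptions; the 3% active fraction echoes the DCT/IDCT register figure above.

```python
def gating_reduction(gated_share, active_fraction):
    """Total power reduction factor when `gated_share` of the power is clock
    gated and the gated logic is only clocked `active_fraction` of the time."""
    remaining = (1.0 - gated_share) + gated_share * active_fraction
    return 1.0 / remaining


# ASSUMED: 40% of power is gateable, gated registers active 3% of the time.
typical = gating_reduction(0.40, 0.03)
print(f"power reduction from gating: x{typical:.2f}")
```

With these assumed inputs the estimate lands near ×1.6; larger savings such as the ×10 case require avoiding computation as well, not just stopping the clock.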

**Power gating reduces leakage in standby**

- Turn off leakage path in inactive modules
  - May need to preserve the state registers
- Can reduce standby leakage by 3 orders of magnitude [Mutoh JSSC 95]
- Other approaches
  - Reverse biasing the substrate
  - Setting input vectors to low leakage states gives ×1.4 leakage reduction [Lee DAC 03]
- Just now getting ASIC methodology support
  - Need large sleep transistors to turn off power
  - Sleep transistors reduce available supply voltage

[Figure: add and shift units with select_add, select_shift, and clock signals]


**High speed logic styles: leverage for voltage scaling and sizing**

- Low power designs use mostly static CMOS logic
  - Static CMOS logic is low leakage, robust
  - PMOS pullup series transistors are slow
- Faster custom logic styles speed up critical paths
- Custom can use slack from higher speed (×1.4) to reduce power by lowering Vdd
  - ASIC power ×1.3 worse than custom at a tight delay constraint due to logic style
- 32-bit adder [Tiwari DAC '98]: domino has 22% higher power and 25% lower delay than static

[Figure: static CMOS, DCVSL, PTL, and domino gates; the static CMOS gate is annotated "slow, larger capacitance PMOS transistors in series"]


**Technology mapping: ×1.4 to ×1.0**

- Technology mapping tools don't target low power
- We found that targeting minimum area for multipliers can result in ×1.3 power; delay is a poor choice
- Technology mapping techniques to reduce active power: ×1.0, ASICs can do as well as custom, if tools improve

[Figure: two mappings of equivalent logic with switching activities annotated on each net (1/2 on the inputs, 3/8 and 7/32 on internal nets); one mapping has lower activity]
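The activities 1/2, 3/8, and 7/32 in the figure are what a simple probabilistic model gives for AND gates. The sketch below assumes independent inputs that are 1 with probability 1/2, no correlation between successive cycles, and zero-delay gates (no glitches); under those assumptions a net that is 1 with probability p toggles with probability 2p(1 − p). The chain and tree structures compared at the end are an assumed reading of the figure, and all function names are hypothetical.

```python
from fractions import Fraction


def and_prob(*input_probs):
    """Probability that an AND gate outputs 1, for independent inputs."""
    p = Fraction(1)
    for q in input_probs:
        p *= q
    return p


def activity(p_one):
    """Probability that a net differs between two independent cycles."""
    return 2 * p_one * (1 - p_one)


half = Fraction(1, 2)
p_and2 = and_prob(half, half)    # output of a 2-input AND: 1/4
p_and3 = and_prob(p_and2, half)  # AND of that net with a third input: 1/8
print(activity(half), activity(p_and2), activity(p_and3))  # 1/2 3/8 7/32

# Internal-net activity of a 4-input AND built two ways (assumed structures).
chain_internal = activity(p_and2) + activity(p_and3)  # ((a.b).c).d
tree_internal = activity(p_and2) + activity(p_and2)   # (a.b).(c.d)
print(chain_internal, tree_internal)  # 19/32 3/4
```

Under these assumptions the chain has lower internal switching activity than the balanced tree, although it is slower. That is the kind of trade-off a power-aware technology mapper would weigh.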


**Cell sizing and wire sizing: ×1.6 to ×1.1**

- ×1.35 power reduction on Xtensa processor at 325MHz by (mostly sizing) power minimization with Design Compiler and 0.13um library [internship at Tensilica]
- Can do better than Design Compiler (DC) with cell sizing via linear program (LP) (global optimization vs. greedy pin-hole optimization), about ×1.1 to ×1.2 power reduction [Chinnery, Keutzer, to appear at ISLPED 05]

| ISCAS'85 netlist | # logic levels | # cells | Minimum delay (ns) |
|---|---|---|---|
| c17 | 4 | 10 | 0.094 |
| c432 | 24 | 259 | 0.733 |
| c499 | 25 | 644 | 0.701 |
| c880 | 23 | 484 | 0.700 |
| c1355 | 27 | 764 | 0.778 |
| c1908 | 33 | 635 | 0.999 |
| c2670 | 23 | 1164 | 0.649 |
| c3540 | 36 | 1283 | 1.054 |
| c5315 | 34 | 1956 | 0.946 |
| c6288 | 113 | 3544 | 3.305 |
| c7552 | 31 | 2779 | 0.847 |

[Table columns omitted: per-circuit power (mW) for DC and LP at delay constraints of 1.1×Tmin and 1.2×Tmin]

Average savings vs. Design Compiler: 10% and 16% for the two delay constraints compared, 1.1×Tmin and 1.2×Tmin.

**Cell sizing and wire sizing: ×1.6 to ×1.1 (continued)**

- Cell libraries lack fine-grained sizes and skewed P:N drives
  - [Hurat SNUG 01] Generate new cells: ×1.1 to ×1.2 power reduction and ×1.15 faster for bus controller, ×1.4 MHz/mW
- Simultaneous buffer and wire sizing reduced clock tree power by ×2.7 [Gong ISLPED 96]
  - ×1.2 reduction in total power
  - Not available for ASIC interconnect yet
- Up to ×1.6 gap due to cell sizing and wire sizing; can reduce to ×1.1 using a library with finely-grained sizes, a good sizing tool, and design-specific cells

[Figure: gate between Vdd and GND before and after optimizing transistor sizes]


**Dynamic supply and substrate biasing: ×4.0 to ×1.0**

- Change Vdd based on processor load
  - ×10 more energy efficient at low performance [Burd ISSCC 00]
  - Adaptive voltage scaling with the ARM11 gives ×1.7 power reduction for voice, SMS, web applications [National Semiconductor, ARM 02]
- Reduce Vdd and bias substrate to lower Vth
  - ×1.7 reduction in power, same speed [Hamada CICC 98]
  - Increase Vth in standby to reduce leakage
- These are complicated to automate for ASICs
  - Dynamic voltage requires accurate knowledge of path delays

[Figure: energy (mW/MIPS) vs. MIPS, from Burd ISSCC 2000]

**Multiple supply and threshold voltages: ×4.0 to ×1.0**

- Basic idea: high speed where critical, low power elsewhere
- Dual Vdd reduces power by ×1.7 after substrate biasing/lower Vdd [Usami JSSC 98]
  - ×2 reduction in clock tree power by using low Vdd
- Separate voltage islands [Lackey ICCAD 02]: different speeds and Vdd
  - Turn off Vdd to modules not in use, reduces leakage by ×500
  - ×1.25 to ×3 average power reduction, depending on activities
- Dual Vth can give ×3 to ×6 reduction in leakage [Sirichotiyakul DAC 99]
- ASICs are limited to Vdd and Vth offered by library and foundry
  - Can't change Vth to design-specific optimal point
  - Standard cell libraries characterized at only two or three Vdd
  - Dual Vdd requires level converters and dual Vdd layout


**Floorplanning and placement: ×1.5 to ×1.1**

- Poor floorplanning and cell placement, inaccurate wire loads: ×1.5 worse power than custom
- We compared partitioning a design into 50K vs. 200K gate modules from 0.25um to 0.13um
  - 42% longer wires for 200K partitions
  - Interconnect is 20% to 40% of total power [Sylvester ICCAD 98]
  - ×1.1 to ×1.2 increase in total power due to wiring, and gates will be upsized to drive the longer wires

[Figure: automatic place and route vs. block partitioned layout, from Hauck Micro. Report 01]

**Floorplanning and placement: ×1.5 to ×1.1 (continued)**

- Bit slices can reduce wire length by 70% or more vs. automated place-and-route
  - Up to ×1.4 energy reduction as faster and lower wiring capacitance [Chang SM Thesis MIT 98]
  - ×1.5 energy reduction from bit slicing and some logic optimization [Stok, Puri, Bhattacharya, Cohn]
- Manual place-and-route achieves 10% shorter wires and ×1.1 faster, about ×1.1 energy reduction [Chang SM Thesis MIT 98]
- ASICs still ×1.1 higher power than custom due to layout

[Figure: automatic place-and-route, tiled bit-slices, and custom layouts]


**Process variation impact on power: ×2.6 to ×1.2**

- ASICs are designed to work at the worst case delay and worst case power corners for the process; typical delay and power are less
  - Simulated power was ×1.7 actual power for custom DCT/IDCT
- Up to a factor of ×1.75 between worst and best (average power of 80 chip samples in 0.3um) [Takahashi JSSC 98]

[Figure: chip power distribution annotated with ×1.75 and ×1.5]

**Process variation impact on power: ×2.6 to ×1.2 (continued)**

- Binning would leave a gap of ×1.4 between low and high bins
- We found a gap of ×1.2 between low speed (high power) and high speed (low power, after derating for Vdd and frequency) bins of 0.18 and 0.13um Intel and AMD PC chips
  - ASICs don't speed bin (they scan test, no speed test)

[Figure: low power bin vs. higher power bin, annotated ×1.4]

**Process technology: ×2.6 to ×1.2**

- Low power libraries are more expensive
  - 5% to 10% transistor width shrinks to reduce capacitances
  - Copper is 40% lower resistivity than aluminum
  - Low-k dielectric reduces wire capacitances; we estimate about a ×1.1 reduction in total power with a low-k dielectric
  - Silicon-on-insulator is ×1.1 to ×1.3 faster, ×1.4 power reduction [Narendra Symp. VLSI 2001]
- We compared cell libraries in UMC 0.13um vs. IBM 0.13um process
  - IBM cells about ×1.05 faster, ×1.6 higher active power, UMC had ×17 leakage
- Overall impact of process variation and technology
  - ×2.6 ASIC power relative to custom for worst case conditions and a cheap process
  - ×1.2 in a low power process, typical conditions, no speed binning


**Low power design conclusions**

- Typical ASIC is ×3 to ×7 less energy efficient than custom
  - We assumed ASIC and custom designs can use the same microarchitectural and logic design techniques. These are the biggest levers for reducing power.
  - Can get ×10 or more going from general purpose hardware to application-specific hardware.
  - E.g. Fast Fourier transform implementations as discussed in Andrew Chang's paper.
- The largest factor for the power gap is voltage scaling, responsible for up to ×4
- Process and microarchitecture can be large factors, about ×2.6 each

**Low power design conclusions (continued)**

- By incorporating custom techniques, ASICs can get within ×3 at a high performance target
  - Can't use custom logic styles
  - ASIC speed penalty drags down efficiency, as higher Vdd, lower Vth, and upsized gates are needed to meet the performance target
- ×1.5 at a lower performance target (~2× slower)
  - Make full use of scaling down Vdd and Vth

**Low power ASIC design example**

- 0.13um DSP example [Stok, Puri, Bhattacharya, Cohn]
  - 240,000 gates implementing Hilbert transform, FIR filter, and fast Fourier transform, with 42KB register array
- Technology mapping, logic design (carry save adders), bitslicing, and physical synthesis gave ×1.86 increase in efficiency
- A fine grained standard cell library gave another ×1.16
- Voltage scaling gave another factor of ×1.46
- ×3.1 increase in MHz/mW overall

The third speaker, Ruchir Puri, will discuss some of their recent low power work at IBM.
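The overall figure follows from treating the three improvements in this example as independent multiplicative factors, which is an assumption of this short check rather than something the slide states. The variable names are hypothetical.

```python
# Efficiency gains quoted for the DSP example, applied one after another.
gains = {
    "mapping, logic design, bitslicing, physical synthesis": 1.86,
    "fine grained standard cell library": 1.16,
    "voltage scaling": 1.46,
}

overall = 1.0
for factor in gains.values():
    overall *= factor  # assumed to compound multiplicatively

print(f"overall MHz/mW improvement: x{overall:.2f}")
```

The product is about 3.15, in line with the roughly ×3.1 overall improvement quoted.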

**Extra slides**

**Impact of voltage scaling on power**

- Ptotal = Pdynamic + Pshort circuit + Pstatic
- Short circuit power when switching is 10% or less of Ptotal
- Dynamic power is due to switching of capacitances
  - Reducing Vdd gives quadratic reduction in Pdynamic
  - But transistor drive current depends on Vdd − Vth [Chen in Trans. on Electron Devices 1997]
  - Must reduce Vth to maintain drive current
- But reducing Vth increases subthreshold leakage current, which is the major contributor to Pstatic

(Clock frequency f, gate switching activity E, capacitance C, temperature T, transistor gate oxide thickness Tox, transistor length L, constants F, X, Io, and m.)

[Figure: CMOS inverter driving Cload, annotated with Vdd, Vth,p, Vth,n, dynamic power, short circuit current, and subthreshold leakage]

**ITRS leakage power trends**

- Can't scale down Vth much further due to large subthreshold leakage currents
- Gate tunneling leakage through thin gate oxide Tox is also becoming a significant cause of leakage
- Further Vdd voltage scaling will be limited
- Must also look to other low power techniques

[Figure: power/die area (W/cm2, log scale from 0.001 to 1000) vs. technology (0.13, 0.09, 0.065, 0.045, 0.022 um); curves for total power and leakage of high speed (fast, low Vth) and low power (slow, high Vth) processes, with leakage increasing. From International Technology Roadmap for Semiconductors data for 2001-2016, assuming activity of 0.1 and ignoring interconnect.]

**Summary of factors affecting (active) power**

Automated designs are higher power than custom because of:

| Factor | Typical ASIC design quality | Excellent ASIC design quality |
|---|---|---|
| Microarchitecture (pipelining, parallelism) | ×2.6 | ×1.3 |
| Memory | ×1.4 | ×1.0 |
| Clock gating and power gating | ×1.6 | ×1.0 |
| Logic design | ×1.2 | ×1.0 |
| High speed logic styles (DCVSL, PTL, domino) | ×1.3 | ×1.3 |
| Technology mapping | ×1.4 | ×1.0 |
| Cell sizing and wire sizing | ×1.6 | ×1.1 |
| Voltage scaling, multi-Vth, multi-Vdd | ×4.0 | ×1.0 |
| Floorplanning and placement | ×1.5 | ×1.1 |
| Process variation and process technology | ×2.6 | ×1.2 |

**Memory (reduce cache misses): ×1.4 to ×1.0**

- Larger caches consume more power, but reduce cache misses
  - On a miss the pipeline stalls and waits many cycles for read/write to off-chip memory
- Caches with higher associativity (e.g. 8-way vs. direct mapped) consume more power; associativity also affects likelihood of a cache miss [Duarte ASIC/SOC 2001]
  - Sub-banking: only precharge the needed section of the cache bank, ×1.32 energy savings
  - Software optimizations to reduce cache misses gave on average a ×1.6 reduction in power
- 90% of the StrongARM area was caches; increasing the transistor length in the caches by 12% reduced leakage by ×20 [Montanaro JSSC 96]
- ASICs can do this; custom memory is available for ASICs

[Figure: processor, on-chip cache, write buffer, and slower off-chip memory]


**Logic design: ×1.2 to ×1.0**

- Logic design refers to the topology and logic structure to implement functional units
- Logic switching activity of a carry select adder was ×1.8 worse than a 32-bit carry lookahead [Callaway VLSI Signal Proc. 92]
- 0.13um 64-bit radix-2 compound domino adder was slower and about ×1.3 energy compared to radix-4 [Zlatanovici ESSC 03]
- We implemented an algorithm to reduce switching activity in multipliers, reduced energy by ×1.1 for 64-bit [Ito ICCD 03]
- Given similar design constraints, ASIC designers can choose the same logic design as custom, ×1.0

[Figure: carry save adder summing x, y, and z bitwise, followed by a ripple carry adder producing (x+y+z)0 through (x+y+z)4]
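The carry save structure in the figure can be sketched in software. A carry save adder reduces three operands to a sum word and a carry word with no carry propagation, and only the final addition (the ripple carry adder in the figure) propagates carries. The function names are hypothetical and Python integers stand in for fixed-width buses.

```python
def carry_save(x, y, z):
    """Reduce three operands to (sum_word, carry_word) without propagating carries."""
    sum_word = x ^ y ^ z                             # per-bit sum of the three inputs
    carry_word = ((x & y) | (x & z) | (y & z)) << 1  # per-bit majority, shifted up
    return sum_word, carry_word


def add3(x, y, z):
    """Three-operand addition: carry save stage, then one carry-propagate add."""
    sum_word, carry_word = carry_save(x, y, z)
    return sum_word + carry_word  # stands in for the final ripple carry adder


print(add3(5, 6, 7))  # 18
```

Because each carry save stage has the delay of a single full adder regardless of word width, structures such as multipliers can sum many partial products with only one slow carry-propagate addition at the end.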
