
VLSI Digital Signal Processing Systems

Low-Power CMOS VLSI Design


Lan-Da Van, Ph.D., Department of Computer Science, National Chiao Tung University, Taiwan, R.O.C. Fall 2010
ldvan@cs.nctu.edu.tw http://www.cs.nctu.tw/~ldvan/


Outline
Introduction
Low-Power Process-Level Design (skipped here)
Low-Power Logic/Circuit-Level Design
Low-Power Algorithm/Architecture-Level Design
Low-Power System-Level Design
Conclusion
References

Lan-Da Van

VLSI-DSP-14-2


Low Power Design: An Ongoing and Important Discipline


Historical figures of merit for VLSI design:
  Performance (circuit speed and system quality)
  Chip area (circuit cost)

Power dissipation is now an important metric in VLSI design as well.

There is no single major source of power savings; savings must be found across all design levels.
This requires a new way of THINKING!
Companies lack a basic power-conscious culture, and designers need to be educated in this respect.

Overall goal: reduce power dissipation while maintaining an adequate throughput rate.


Motivation - Microprocessor


Low Power Competitive Reasons


Battery-Powered Systems
  Extend battery life
  Reduce weight and size

High-Performance Systems
  Cost
    Package (chip carrier, heat sink, card slots, ...)
    Power systems (supplies, distribution, regulators, ...)
    Fans (noise, power, reliability, area, ...)
    Operating cost to customer
    Re-start issue
  Reliability
    Failure rate increases by 4X for T at 110 °C vs. 70 °C
  Size and weight



The Power Crisis: Portability

PDA, cellular phone, notebook computer, etc.

Expected battery lifetime increase over the next 5 years: 30-40%



A Multimedia Terminal: The Infopad

Present-day battery technology (year 1990): 20 lbs for 10 hrs



VLSI Signal Processing System Design Space

[Figure: VLSI signal processing system design space. Design metrics: cost, performance, test, area, power. Design levels: system, algorithm, architecture, logic, circuit, process.]


Low Power System Design Space


System: power budgeting, S/H partitioning, power management, core selection
Algorithm: algorithmic reduction, data transformation, CSE, low-complexity operation
Architecture: parallelism, pipelining, re-timing, unfolding, signal ordering, glitch minimization, data representation, resource allocation, multi-clock
Logic/Circuit: logic style, arithmetic, glitch/noise minimization, re-sizing, adaptive voltage scaling, multi-Vdd, multi-Vth, multi-clock, layout, power-driven P&R
Process: low-power device, alternative technology, multi-Vth


Where Does Power Go in CMOS?


Sources of power dissipation:
$P = P_{\text{dynamic}} + P_{\text{short-circuit}} + P_{\text{leakage}} + P_{\text{static}}$

Definitions:

Dynamic/switching power: $P = \alpha C V^2 f$
  Charging and discharging of parasitic capacitors
  $\alpha$: switching activity factor

Short-circuit power: $P = I_{\text{sc}} V$
  Direct path between the supply rails during switching

Leakage power: $P = I_{\text{leakage}} V$
  Reverse-bias diode leakage
  Sub-threshold conduction

Static power: $P = I_{\text{static}} V$
  Each input node is connected to a fixed, stable voltage


Dynamic Power Consumption (1/2)

Power = Energy/transition × transition rate
$= C_L V_{dd}^2 f_{0\to1}$
$= C_L V_{dd}^2 P_{0\to1} f$
$= C_{EFF} V_{dd}^2 f$

$C_{EFF}$ = effective capacitance $= C_L P_{0\to1}$
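The expression above can be evaluated directly. The following is a minimal sketch, not part of the original slides; the function name and the operating point (1 pF load, 3.3 V supply, 100 MHz clock, transition probability 1/4) are illustrative assumptions.

```python
def dynamic_power(c_load, vdd, p01, freq):
    """Average dynamic power in watts: P = P01 * C_L * Vdd^2 * f."""
    c_eff = c_load * p01              # effective capacitance C_EFF = C_L * P01
    return c_eff * vdd ** 2 * freq

# Hypothetical operating point (assumed values, for illustration only).
p_full = dynamic_power(c_load=1e-12, vdd=3.3, p01=0.25, freq=100e6)

# Halving the supply voltage at the same frequency scales power by 1/4.
p_half_vdd = dynamic_power(c_load=1e-12, vdd=1.65, p01=0.25, freq=100e6)

print(f"full Vdd: {p_full:.3e} W, half Vdd: {p_half_vdd:.3e} W")
```

The quadratic dependence on the supply voltage is why supply scaling usually gives the largest single saving among the four factors.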



Dynamic Power Consumption (2/2)


For low-power design, reduce $P_{0\to1}$, $C_L$, $V_{dd}$, and $f$:
  $P_{0\to1}$: reduce the transition probability
  $C_L$: minimize the geometry and remove redundancy
  $V_{dd}$: reduce the power supply level
  $f$: use the lowest clock frequency

Power dissipation is a data-dependent function of switching activity. => Pattern dependent!


Choice of Logic Style



Power-delay product improves as voltage decreases.
The best logic style minimizes power-delay (i.e., energy) for a given delay constraint.


Type of Logic Function: NOR


Example: static-style 2-input NOR gate

A  B  Out
0  0  1
0  1  0
1  0  0
1  1  0

Truth Table of 2-Input NOR Gate

Assume: P(A=1) = 1/2, P(B=1) = 1/2
Then: P(Out=1) = 1/4
$P_{0\to1}$ = P(Out=0) × P(Out=1) = 3/4 × 1/4 = 3/16
$\alpha_{0\to1}$ = 3/16


2-Input NOR Gate Transition Probability

$P_1 = (1-P_A)(1-P_B)$
$P_{0\to1} = P_0 P_1 = \left(1-(1-P_A)(1-P_B)\right)(1-P_A)(1-P_B)$
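The NOR expression can be checked numerically. This is a small sketch rather than anything from the slides: the function name `nor_p01` is hypothetical, and the inputs are assumed to be statistically independent, as the formula itself requires.

```python
from fractions import Fraction

def nor_p01(pa, pb):
    """0->1 transition probability of a 2-input NOR gate.

    pa, pb: probabilities that inputs A and B are 1 (assumed independent).
    """
    p1 = (1 - pa) * (1 - pb)      # output is 1 only when both inputs are 0
    return (1 - p1) * p1          # P0 * P1

half = Fraction(1, 2)
print(nor_p01(half, half))        # 3/16, matching the worked example

# Skewed inputs lower the activity: with PA = PB = 9/10 the output is almost always 0.
print(nor_p01(Fraction(9, 10), Fraction(9, 10)))
```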



Type of Logic Function: XOR


Example: static-style 2-input XOR gate

A  B  Out
0  0  0
0  1  1
1  0  1
1  1  0

Truth Table of 2-Input XOR Gate

Assume: P(A=1) = 1/2, P(B=1) = 1/2
Then: P(Out=1) = 1/2
$P_{0\to1}$ = P(Out=0) × P(Out=1) = 1/2 × 1/2 = 1/4
$\alpha_{0\to1}$ = 1/4


2-Input XOR Gate Transition Probability

$P_1 = P_A(1-P_B) + P_B(1-P_A) = P_A + P_B - 2P_AP_B$
$P_{0\to1} = P_0 P_1 = \left(1-(P_A+P_B-2P_AP_B)\right)(P_A+P_B-2P_AP_B)$
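The same $P_0 P_1$ rule applies to any static gate once $P_1$ is known from its truth table. The sketch below enumerates the truth table instead of using a closed form; the helper names are hypothetical, and it assumes independent inputs and no glitching.

```python
from fractions import Fraction
from itertools import product

def output_one_probability(gate, input_probs):
    """P(Out=1): sum the probability of every input row for which the gate outputs 1."""
    p1 = 0
    for bits in product((0, 1), repeat=len(input_probs)):
        if gate(*bits):
            weight = 1
            for bit, p in zip(bits, input_probs):
                weight *= p if bit else 1 - p
            p1 += weight
    return p1

def transition_probability(gate, input_probs):
    """P(0->1) = P(Out=0) * P(Out=1) for a static gate."""
    p1 = output_one_probability(gate, input_probs)
    return (1 - p1) * p1

half = Fraction(1, 2)
xor_p01 = transition_probability(lambda a, b: a ^ b, [half, half])
nor_p01_value = transition_probability(lambda a, b: int(not (a or b)), [half, half])
print(xor_p01, nor_p01_value)     # 1/4 3/16
```

For equiprobable inputs the XOR output switches more often than the NOR output, which is the comparison the two examples set up.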



Which One is Your Choice?

XOR

NOR

Which one is for Low-Power design?



Glitching Activity in CMOS Network

[Figure: waveforms for the input cases labeled (x,c=0,0) and (x,c=1,0)]

$\alpha_{0\to1}$ can be greater than 1 due to glitching!



Glitching in a Carry Ripple Adder


Chain vs Tree Datapath (1/2)


[Figure: chain and tree implementations of F from inputs A, B, C, D, with intermediate nodes O1 and O2]

                     O1      O2      F
P1 (Chain)           1/4     1/8     1/16
P0 = 1-P1 (Chain)    3/4     7/8     15/16
P0->1 (Chain)        3/16    7/64    15/256
P1 (Tree)            1/4     1/4     1/16
P0 = 1-P1 (Tree)     3/4     3/4     15/16
P0->1 (Tree)         3/16    3/16    15/256
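The table entries can be reproduced with a few lines of code. This sketch assumes the gates are 2-input ANDs and that every primary input is 1 with probability 1/2. Neither assumption is stated in the extracted text, but both are consistent with every value in the table. Glitching is ignored.

```python
from fractions import Fraction

def and_p1(pa, pb):
    """P(Out=1) of a 2-input AND gate with independent inputs (assumed gate type)."""
    return pa * pb

def p01(p1):
    """0->1 transition probability of a static gate output: P0 * P1."""
    return (1 - p1) * p1

half = Fraction(1, 2)   # assumed probability that each of A, B, C, D is 1

# Chain: O1 = A.B, O2 = O1.C, F = O2.D
chain_o1 = and_p1(half, half)
chain_o2 = and_p1(chain_o1, half)
chain_f = and_p1(chain_o2, half)

# Tree: O1 = A.B, O2 = C.D, F = O1.O2
tree_o1 = and_p1(half, half)
tree_o2 = and_p1(half, half)
tree_f = and_p1(tree_o1, tree_o2)

chain = [p01(p) for p in (chain_o1, chain_o2, chain_f)]
tree = [p01(p) for p in (tree_o1, tree_o2, tree_f)]
print("chain P0->1:", *chain)
print("tree  P0->1:", *tree)
print("chain/tree :", *(float(c / t) for c, t in zip(chain, tree)))
```

Without glitching, the two structures differ only at node O2, where the chain ratio is 7/12 (about 0.58).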



Chain vs Tree Datapath (2/2)


[Figure: chain and tree implementations, as on the previous slide]

                                            O1     O2      F
$P_{0\to1}$ (Chain) / $P_{0\to1}$ (Tree)              1      0.58    1
$\alpha_{0\to1}$ (Chain) / $\alpha_{0\to1}$ (Tree)    1      0.83    1.47

Which one is for Low-Power design?



Glitching at the Datapath Level


Irregular Regular

Two Glitches!


How to Minimize Glitching?

Equalize Length of Timing Paths through Design!



Data Representation (1/2)

[Figure: two plots with bit position on the horizontal axis]


Data Representation (2/2)


(Binary vs. Gray Encoding)
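To make the binary-versus-Gray comparison concrete, the sketch below counts bit flips when a value steps through a counting sequence under both encodings. The 4-bit width and the sequential access pattern are assumptions chosen for illustration. Gray code only helps when consecutive values are adjacent like this.

```python
def to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def bit_flips(a, b):
    """Number of bit positions that differ between two words."""
    return bin(a ^ b).count("1")

def total_flips(sequence):
    """Total bit transitions when the words are driven one after another."""
    return sum(bit_flips(x, y) for x, y in zip(sequence, sequence[1:]))

BITS = 4                                  # assumed word width
counts = list(range(2 ** BITS))           # 0, 1, ..., 15 in order

binary_flips = total_flips(counts)
gray_flips = total_flips([to_gray(n) for n in counts])
print(f"binary: {binary_flips} flips, Gray: {gray_flips} flips")
```

Each Gray-coded step changes exactly one bit, so a sequential count costs one transition per step regardless of word width.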


Signal Reordering Operation


Ex1. Y = AB + AC = A(B + C)
Ex2. Y = 3X = X + (X << 1)

[Figure: datapath block diagrams for Ex1 and Ex2, built from multipliers, adders, and a <<1 shifter]
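Both rewrites are plain algebraic identities, so they can be checked exhaustively over a small range. The sketch below uses hypothetical function names and does only that. It does not model power, just the functional equivalence that allows a multiplier to be removed or replaced by a shift and an add.

```python
from itertools import product

def y_two_multiplies(a, b, c):
    return a * b + a * c          # Ex1, original form: 2 multiplies, 1 add

def y_one_multiply(a, b, c):
    return a * (b + c)            # Ex1, reordered form: 1 add, 1 multiply

def triple_by_multiply(x):
    return 3 * x                  # Ex2, original form: constant multiply

def triple_by_shift_add(x):
    return x + (x << 1)           # Ex2, reordered form: shift left by 1, then add

values = range(-8, 9)
ex1_ok = all(y_two_multiplies(a, b, c) == y_one_multiply(a, b, c)
             for a, b, c in product(values, repeat=3))
ex2_ok = all(triple_by_multiply(x) == triple_by_shift_add(x) for x in values)
print(ex1_ok, ex2_ok)
```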


Resource Sharing Can Increase Activity (1/2)


Separate Bus Structure

Number of bus transitions per cycle = 2(1 + 1/2 + 1/4 + ...) = 4,

where the factor 2 accounts for the 2 separate buses, 1 is the transition probability of the LSB, 1/2 is the transition probability of the 2nd LSB, and so on.
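The per-bit probabilities 1, 1/2, 1/4, ... are what a binary up-counter produces, so the sketch below assumes each bus carries the output of a free-running counter. That is an interpretation, not something the extracted text states, and the 16-bit width is also assumed. It compares a direct count of bit flips with the geometric series.

```python
def counter_transitions_per_cycle(bits):
    """Average number of bit flips per increment of a wrapping binary counter."""
    period = 1 << bits
    mask = period - 1
    flips = sum(bin(n ^ ((n + 1) & mask)).count("1") for n in range(period))
    return flips / period

def geometric_estimate(bits):
    """1 + 1/2 + 1/4 + ... truncated to the given number of bit positions."""
    return sum(0.5 ** k for k in range(bits))

BITS = 16                                   # assumed bus width
per_bus = counter_transitions_per_cycle(BITS)
two_separate_buses = 2 * per_bus
print(f"per bus: {per_bus:.4f}, two separate buses: {two_separate_buses:.4f}")
print(f"series estimate per bus: {geometric_estimate(BITS):.4f}")
```

Each bus approaches 2 transitions per cycle as the width grows, giving the total of 4 for two separate buses.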

Bus Sharing


Resource Sharing Can Increase Activity (2/2)

[Figure: plot with bit position on the horizontal axis]

Lowering Vdd Increases Delay


Reducing Vdd


Architecture Trade-offs: Reference Datapath


Parallel Datapath


Pipelined Datapath


Summary: A Low-Power Data Path


Architecture type                  Voltage   Area   Power
Reference datapath (no pip/par)    5 V       1      1
Pipelined datapath                 2.9 V     1.3    0.37
Parallel datapath                  2.9 V     3.4    0.34
Pipeline-parallel datapath         2.0 V     3.7    0.18

Desire to operate at the lowest possible speeds (using low supply voltages).
Use architecture optimization to compensate for slower operation.
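Since dynamic power scales as $C V_{dd}^2 f$, most of the saving in the table should come from the $V_{dd}^2$ term. The sketch below separates that term from the rest using only the voltages and normalized powers listed above. The "residual" factor is simply whatever remains after dividing out $(V_{dd}/5\,\text{V})^2$, so it lumps together capacitance and frequency changes and is not a figure from the slides.

```python
V_REF = 5.0   # supply voltage of the reference datapath, from the table

# (supply voltage in volts, normalized power) taken from the summary table
table = {
    "Pipelined": (2.9, 0.37),
    "Parallel": (2.9, 0.34),
    "Pipeline-parallel": (2.0, 0.18),
}

results = {}
for name, (vdd, power) in table.items():
    voltage_factor = (vdd / V_REF) ** 2      # saving from Vdd^2 alone
    residual = power / voltage_factor        # combined C*f change (derived here)
    results[name] = (voltage_factor, residual)
    print(f"{name:18s} Vdd^2 factor {voltage_factor:.3f}  residual {residual:.2f}")
```

In every row the residual stays close to 1, which suggests that the extra hardware costs area but only slightly offsets the quadratic voltage saving.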

Computational Complexity of DCT Algorithms


Low-Power Cache and Register Configuration


Application profiling

Trade-off between performance, power and size
Access and store the most frequently used instructions
Avoid accessing larger cache/register
Partition cache and register
Be aware of partitioning

Rule of thumb

[Figure: memory hierarchy from the CPU through partitioned registers (Reg, Reg), L1 cache, L2 cache, and memory, with two "Partition!" annotations]


Outline
Introduction
Low-Power Process-Level Design (skipped here)
Low-Power Logic/Circuit-Level Design
Low-Power Algorithm/Architecture-Level Design
Low-Power System-Level Design
  Low Power System Perspective
  Low Power Applications
Conclusion
References


Power Down Techniques


Software versus Hardware


            Advantage                 Disadvantage
Software    Free but not always       High power consumption
            High flexibility          Slow in execution
            Ease of compatibility     Inefficient
            Larger staff
Hardware    High speed                High die cost
            Low power                 Low flexibility
            High efficiency           Low compatibility
            Less staff


Energy-Efficient Software Coding


Potential for power reduction via software modification is relatively unexploited.
Code size and algorithmic efficiency can significantly affect energy dissipation.
Pipelining at the software level: VLIW coding style

References:
V. Tiwari et al., "Power analysis of embedded software: a first step towards software power minimization," IEEE Trans. on VLSI, vol. 2, no. 4, Dec. 1994.
J. Snyder et al., "Low-power software for low-power people," 1994 IEEE Symp. on Low Power Electronics.


Power-Hungry Clock Network

H-tree: design deficiencies based on the Elmore delay model.
PLL: every designer (digital or analog) should have a working knowledge of PLLs.
  Multiple frequencies in chips/systems by PLL
  Low main frequency
  But: jitter and noise, gain and bandwidth, pull-in and lock time, stability
Asynchronous
=> Use gated clocks, sleep mode


Power Analysis in the Design Flow


Applications I: Wireless Computing/Communication


Applications II: A Portable Multimedia Terminal


Applications III: System on Chip (SOC)


Entire system function
  Logic + Memory
  More than two types of devices
Allows more freedom in architecture
Hardware and software partition


Conclusions
Trading off low power against high speed is an essential design requirement for many applications.
Low power affects cost, size, weight, performance, and reliability.
Reduce $P_{0\to1}$, $C_L$, $V_{dd}$, and $f$ for low-power design at every level!


Reference
[1] A. Chandrakasan and R. W. Brodersen, "Minimizing power consumption in digital CMOS circuits," Proceedings of the IEEE, vol. 83, no. 4, pp. 498-523, Apr. 1995.
[2] A. Chandrakasan, "Architectures for Ultra Low-Power Design," in tutorial B3 of ASP-DAC, 1995.
[3] A. Chandrakasan, "Low-Voltage/Low-Power Digital Design," in tutorial of Workshop on Low-Power Low-Voltage and RF IC for Wireless Communication System, 1996, Taiwan.
[4] T. Sakurai, "Low Power Circuit Design Methodology," in tutorial B2 of ASP-DAC, 1995.
[5] Chapter 17 of Textbook.


Self-Test Exercises
STE1: Derive the switching-activity expression for a 2-input AND gate, and simulate the histogram of transition probability ($P_{0\to1}$) vs. $P_A$ and $P_B$.
STE2: Derive the switching-activity expression for a 3-input NAND gate.
