You are on page 1of 52

VLSI Digital Signal Processing Systems

Low-Power CMOS VLSI Design

Lan-Da Van (), Ph. D.


Department of Computer Science,
National Chiao Tung University,
Taiwan, R.O.C.
Fall, 2010

ldvan@cs.nctu.edu.tw

http://www.cs.nctu.tw/~ldvan/
VLSI Digital Signal Processing Systems

Outline
Introduction
Low-Power Process-Level Design (Ignore here)
Low-Power Logic/Circuit-Level Design
Low-Power Algorithm/Architecture-Level Design
Low-Power System-Level Design
Conclusion
References

Lan-Da Van VLSI-DSP-14-2


VLSI Digital Signal Processing Systems

Low Power Design An Ongoing and


Important Discipline
Historical figure of merit for VLSI design
Performance (circuit speed and system quality)
Chip area (circuit cost). But now,
Power dissipation is now an important metric in VLSI
design.
No single major source for power savings across all design
levels Required a new way of THINKING!!!
Companies lack the basic power-conscious culture and
designers need to be educated in this respect.
Overall Goal - To reduce power dissipations but
maintaining adequate throughput rate.

Lan-Da Van VLSI-DSP-14-3


VLSI Digital Signal Processing Systems

Motivation - Microprocessor

Lan-Da Van VLSI-DSP-14-4


VLSI Digital Signal Processing Systems

Low Power Competitive Reasons


Battery Powered Systems
Extend battery life
Reduce weight and size
High-Performance Systems
Cost
Package (chip carrier, heat sink, card slots, )
Power Systems (supplies, distribution, regulators, )
Fans (noise, power, reliability, area, )
Operating cost to customer Re-start issue.
Reliability
Failure rate increases by 4X for T @ 110C vs 70C
Size and Weight

Lan-Da Van VLSI-DSP-14-5


VLSI Digital Signal Processing Systems

The Power Crisis: Portability

PDA, Cellular Phone, Expected Battery Lifetime increase


Notebook Computer,etc. Over next 5 years: 30-40%

Lan-Da Van VLSI-DSP-14-6


VLSI Digital Signal Processing Systems

A Multimedia Terminal: The Infopad

Present day battery technology (year 1990)


20 lbs for 10hrs
Lan-Da Van VLSI-DSP-14-7
VLSI Digital Signal Processing Systems

VLSI Signal Processing System


Design Space

Cost Performance

System Level

Test Area Algorithm Level


Architecture Level
Logic Level

Power Circuit Level


Process Level

Lan-Da Van VLSI-DSP-14-8


VLSI Digital Signal Processing Systems

Low Power System Design Space

Power budgeting, S/H partitioning, power


System management, core selection

Algorithmic reduction, data transformation, CSE,


Algorithm low-complexity operation

Parallelism, pipelining, re-timing, unfolding, signal


Architecture ordering, glitch minimization, data representation,
resource allocation, multi-clock
Logic style, arithmetic, glitch/noise minimization,
Logic/Circuit re-sizing, adaptive voltage scaling, multi-Vdd,
multi-Vth, multi-clock, layout, power-driven P&R

Process Low-power device, alternative technology, multi-


Vth

Lan-Da Van VLSI-DSP-14-9


VLSI Digital Signal Processing Systems

Outline
Introduction
Low-Power Process-Level Design (Ignore here)
Low-Power Logic/Circuit-Level Design
Low-Power Algorithm/Architecture-Level Design
Low-Power System-Level Design
Conclusion
References

Lan-Da Van VLSI-DSP-14-10


VLSI Digital Signal Processing Systems

Where Does Power Go in CMOS?


Source of power dissipation
P = Pdynamic + Pshort-circuit + Pleakage + Pstatic
Definitions:
Dynamic/switching power: P = CV2f
Charging and discharging parasitic capacitors
: switching activity factor
Short circuit power P = IscV
Direct path between supply rail during switching
Leakage power P = IleakageV
Reverse bias diode leakage
Sub-threshold conduction
Static power P = IstaticV
Each input node is connected to fixed stable voltage

Lan-Da Van VLSI-DSP-14-11


VLSI Digital Signal Processing Systems

Dynamic Power Consumption (1/2)

Power = Energy/transition * transition rate


= CL * Vdd2 * f0->1
= CL * Vdd2 * Pb0->1 * f
= CEFF * Vdd2 * f
= Pb0->1 *CL*Vdd2 * f
CEFF = Effective Capacitance = CL * Pb0->1

Lan-Da Van VLSI-DSP-14-12


VLSI Digital Signal Processing Systems

Dynamic Power Consumption (2/2)


Need to reduce Pb0->1 , CL, Vdd, and f for low
power design
Reduce the probability, P0 -> 1
Minimize the geometry and remove the
redundancy
Reduce the power supply level
Use lowest clock frequency
Power dissipation is data dependent function
of switching activity. => Pattern Dependent!

Lan-Da Van VLSI-DSP-14-13


VLSI Digital Signal Processing Systems

Choice of Logic Style

Lan-Da Van VLSI-DSP-14-14


VLSI Digital Signal Processing Systems

Choice of Logic Style


Power-delay product improves as voltage decreases
The best logic style minimizes power-delay (i.e,
energy) for a given delay constraint.

Lan-Da Van VLSI-DSP-14-15


VLSI Digital Signal Processing Systems

Type of Logic Function: NOR


Example : Static-style 2-input NOR gate

A B Out Assume :
0 0 1 P(A=1) =
P(B=1) =
0 1 0
Then :
1 0 0 P(Out=1) =
1 1 0 P(01)
= P(Out=0)*P(Out=1)
Truth Table of 2-Input NOR Gate =3/4 * 1/4 = 3/16
0->1 = 3/16

Lan-Da Van VLSI-DSP-14-16


VLSI Digital Signal Processing Systems

2-Input NOR Gate Transition Probability

P1=(1-PA)(1-PB)
P0->1 =P0P1=(1-(1-PA)(1-PB))(1-PA)(1-PB)
Lan-Da Van VLSI-DSP-14-17
VLSI Digital Signal Processing Systems

Type of Logic Function: XOR


Example : Static-style 2-input XOR gate

Assume :
A B Out
P(A=1) = 1/2
0 0 0 P(B=1) = 1/2
0 1 1 Then :
P(Out=1) = 1/2
1 0 1 P(01)
1 1 0 = P(Out=0)*P(Out=1)
=1/2 * 1/2 = 1/4
Truth Table of 2-Input XOR Gate 0->1 = 1/4

Lan-Da Van VLSI-DSP-14-18


VLSI Digital Signal Processing Systems

2-Input XOR Gate Transition Probability

P1=PA(1-PB)+PB(1-PA)=PA+PB-2PAPB
P0->1 =P0P1=(1-(PA+PB-2PAPB))(PA+PB-2PAPB)
Lan-Da Van VLSI-DSP-14-19
VLSI Digital Signal Processing Systems

Which One is Your Choice?

XOR NOR

Which one is for Low-Power design?


Lan-Da Van VLSI-DSP-14-20
VLSI Digital Signal Processing Systems

Glitching Activity in CMOS Network

(x,c=0,0) (x,c=1,0)

0->1 can be greater than 1 due to glitching!

Lan-Da Van VLSI-DSP-14-21


VLSI Digital Signal Processing Systems

Glitching in a Carry Ripple Adder

Lan-Da Van VLSI-DSP-14-22


VLSI Digital Signal Processing Systems

Chain vs Tree Datapath (1/2)


O1
A O1 A
O2 B
B
F F
C B
D
C
O2
Chain Tree

O1 O2 F
P1 (Chain) 1/4 1/8 1/16
P0=1-P1 (Chain) 3/4 7/8 15/16
P0->1 (Chain) 3/16 7/64 15/256
P1 (Tree) 1/4 1/4 1/16
P0=1-P1 (Tree) 3/4 3/4 15/16
P0->1 (Tree) 3/16 3/16 15/256

Lan-Da Van VLSI-DSP-14-23


VLSI Digital Signal Processing Systems

Chain vs Tree Datapath (2/2)


O1
A O1 A
O2 B
B
F F
C B
D
C
O2
Chain Tree

O1 O2 F
P0->1 (Chain)/P0->1 (Tree) 1 0.58 1
0->1 (Chain)/0->1 (Tree) 1 0.83 1.47

Which one is for Low-Power design?

Lan-Da Van VLSI-DSP-14-24


VLSI Digital Signal Processing Systems

Glitching at the Datapath Level


Irregular Regular

Two Glitches!

Lan-Da Van VLSI-DSP-14-25


VLSI Digital Signal Processing Systems

How to Minimize Glitching?

Equalize Length of Timing Paths through Design!

Lan-Da Van VLSI-DSP-14-26


VLSI Digital Signal Processing Systems

Data Representation (1/2)

Bit Position Bit Position

Lan-Da Van VLSI-DSP-14-27


VLSI Digital Signal Processing Systems

Data Representation (2/2)

(Binary v.s. Gray Encoding)

Lan-Da Van VLSI-DSP-14-28


VLSI Digital Signal Processing Systems

Outline
Introduction
Low-Power Process-Level Design (Ignore here)
Low-Power Logic/Circuit-Level Design
Low-Power Algorithm/Architecture-Level Design
Low-Power System-Level Design
Conclusion
References

Lan-Da Van VLSI-DSP-14-29


VLSI Digital Signal Processing Systems

Signal Reordering Operation


Ex1. Y=AB+AC= A(B+C)
Ex2. Y=3X=X+(X<<1)
B B
X +
A + Y C X Y
X A

C
X <<1
X Y
+ Y
3 X
Lan-Da Van VLSI-DSP-14-30
VLSI Digital Signal Processing Systems

Resource Sharing Can Increase


Activity (1/2)
Separate Bus Structure

# of Bus Transitions Per Cycle


=2(1+1/2+1/4+.)=4,
Where 2 means 2 separate buses, Bus Sharing
1 denotes the transition probability
of LSB, denotes the transition
probability of 2nd LSB, and etc.

Lan-Da Van VLSI-DSP-14-31


VLSI Digital Signal Processing Systems

Resource Sharing Can Increase


Activity (2/2)

Bit Position
Lan-Da Van VLSI-DSP-14-32
VLSI Digital Signal Processing Systems

Lowering Vdd Increases Delay

Lan-Da Van VLSI-DSP-14-33


VLSI Digital Signal Processing Systems

Reducing Vdd

Lan-Da Van VLSI-DSP-14-34


VLSI Digital Signal Processing Systems

Architecture Trade-offs: Reference


Datapath

Lan-Da Van VLSI-DSP-14-35


VLSI Digital Signal Processing Systems

Parallel Datapath

Lan-Da Van VLSI-DSP-14-36


VLSI Digital Signal Processing Systems

Pipelined Datapath

Lan-Da Van VLSI-DSP-14-37


VLSI Digital Signal Processing Systems

Summary: A Low-Power Data Path

Architecture type Voltage Area Power

Reference Datapath (no 5V 1 1


pip/par)

Pipelined datapath 2.9V 1.3 0.37

Parallel datapath 2.9V 3.4 0.34

Pipeline-parallel datapath 2.0V 3.7 0.18

Desire to operate at lowest possible speeds (using


low supply voltages)
Use architecture optimization to compensate for
slower operation
Lan-Da Van VLSI-DSP-14-38
VLSI Digital Signal Processing Systems

Computational Complexity of DCT


Algorithms

Lan-Da Van VLSI-DSP-14-39


VLSI Digital Signal Processing Systems

Low-Power Cache and Register


Configuration
Application profiling
Trade-off between performance, power and size
Rule of thumb
Access and storage the most frequently used instructions
Avoid accessing larger cache/register
Partition cache and register
Aware of partitioning

Partition! Partition!

CPU Reg L1 Cache L2 Cache Memory


Reg

Lan-Da Van VLSI-DSP-14-40


VLSI Digital Signal Processing Systems

Outline
Introduction
Low-Power Process-Level Design (Ignore here)
Low-Power Logic/Circuit-Level Design
Low-Power Algorithm/Architecture-Level Design
Low-Power System-Level Design
Low Power System Perspective
Low Power Applications
Conclusion
References

Lan-Da Van VLSI-DSP-14-41


VLSI Digital Signal Processing Systems

Power Down Techniques

Lan-Da Van VLSI-DSP-14-42


VLSI Digital Signal Processing Systems

Software versus Hardware


Advantage Disadvantage

Software Free but not always High power


High flexibility consumption
Ease of compatibility Slow in execution
Inefficient
Larger staff
Hardware High speed High die cost
Low power Low flexibility
High efficiency Low compatibility
Less staff

Lan-Da Van VLSI-DSP-14-43


VLSI Digital Signal Processing Systems

Energy-Efficient Software Coding


Potential for power reduction via software
modification is relatively unexploited.
Code size and algorithmic efficiency can significantly
affect energy dissipation.
Pipelining at software level- VLIW coding style
References:
V. Tiwari et al., Power analysis of embedded software: a first
step towards software power minimization, IEEE Trans. on
VLSI, vol. 2, no. 4, Dec. 1994.
J. Synder et al., Low-power software for low-power people,
1994 IEEE Symp. On Low Power Electronics.

Lan-Da Van VLSI-DSP-14-44


VLSI Digital Signal Processing Systems

Power Hunger Clock Network


H-Tree design deficiencies based on Elmore delay
model.
PLL every designer (digital or analog) should have
the knowledge of PLL.
Multiple frequencies in chips/systems by PLL
Low main frequency, But
Jitter and noise, gain and bandwidth, pull-in and lock time,
stability
Asynchronous => Use gated clocks, sleep mode

Lan-Da Van VLSI-DSP-14-45


VLSI Digital Signal Processing Systems

Power Analysis in the Design Flow

Lan-Da Van VLSI-DSP-14-46


VLSI Digital Signal Processing Systems

Applications I: Wireless
Computing/Communication

Lan-Da Van VLSI-DSP-14-47


VLSI Digital Signal Processing Systems

Applications II: A Portable


Multimedia Terminal

Lan-Da Van VLSI-DSP-14-48


VLSI Digital Signal Processing Systems

Applications III: System on Chip


(SOC)

Entire system function


Logic + Memory
More than two types of
devices
Allow more freedoms in
architecture
Hardware and software
partition

Lan-Da Van VLSI-DSP-14-49


VLSI Digital Signal Processing Systems

Conclusions
Low-Power and high-speed tradeoff design is an essential
requirement for many applications.
Low power impacts on the cost, size, weight, performance,
and reliability.
Reduce P0->1 , CL, Vdd, and f for low power design
across each level!!

Lan-Da Van VLSI-DSP-14-50


VLSI Digital Signal Processing Systems

Reference
[1] A. Chandrakasan and R. W. Brodersen, Minimizing power
consumption in digital CMOS circuits, Proceedings of the
IEEE, vol. 83, no. 4, pp. 498-523, Apr. 1995.
[2] A. Chandrakasan, Architectures for Ultra Low-Power
Design, in tutorial B3 of ASP-DAC, 1995.
[3] A. Chandrakasan, Low-Voltage/Low-Power Digital Design,
in tutorial of Workshop on Low-Power Low-Volgate and RF IC
for Wireless Communication System, 1996, Taiwan.
[4] T. Sakurai, Low Power Circuit Design Methodology, in
tutorial B2 of ASP-DAC, 1995.
[5] Chapter 17 of Textbook.

Lan-Da Van VLSI-DSP-14-51


VLSI Digital Signal Processing Systems

Self-Test Exercises
STE1: Calculate the switching activity EQUATION
EXPRESSION of 2-input AND gate and simulate the
histogram of transition probability (P0->1) vs PA and PB.
STE2: Calculate the switching activity EQUATION
EXPRESSION of 3-input NAND gate.

Lan-Da Van VLSI-DSP-14-52

You might also like