Professional Documents
Culture Documents
Traditional Pipelining
Goal: Maximum performance
Vdd
Clk-Q Propagation Delay Clk Setup
Clk
Clk
Time Slack
Clk
Time Slack
Clk
Time Slack
Clk
Time Slack
Clk
Pipelining
Time slack
Delay
Power Saving
Delay
Power-Optimal Pipelining
Power reduction from pipelining limited by power overhead of increased number of flip-flops PowerOptimal Pipelining
Power-Optimal Pipelining
Power reduction from pipelining limited by power overhead of increased number of flip-flops PowerOptimal Pipelining Power
Delay
Power-Optimal Pipelining
Power reduction from pipelining limited by power overhead of increased number of flip-flops PowerOptimal Pipelining Power Too deep pipelining
Delay
Power-Optimal Pipelining
Power reduction from pipelining limited by power overhead of increased number of flip-flops PowerOptimal Pipelining Power Too deep pipelining
Contribution
Pipelining is an old idea. Research focus has been on performance impact of pipelining. Idea of using pipelining [Chandrakasan 92] to lower power has not been fully explored in deep submicron technology. Analysis and circuit-level simulation of Power-Optimal Pipelining for different regimes of Vth, activity factor, clock gating
Bottom-to-Top Approach
1. Impact of pipelining on power component 2. Impact of pipelining on total power (with/without clock-gating) Power Total Power (clock-gated)
active inactive active
Time
Bottom-to-Top Approach
1. Impact of pipelining on power component 2. Impact of pipelining on total power (with/without clock-gating) Power Total Power (not clock-gated)
active inactive active
Time
*Idle power = power consumed when circuit is idle and not clock-gated
Methodology
Target digital system: Fixed throughput, Highly parallel computation, Logic-dominant Test bench
o
BPTM (Berkeley Predictive Technology Model) 70nm process: o LVT(0.17/-0.2), MVT(0.19/-0.22), HVT(0.21/-0.24) o Hspice simulation at 100C, Clock = 2 GHz Baseline N FO4 inverters (N = 2 ~ 24) TG flip-flops
TG flip-flops
Optimal Saving O(1/N) Flip-flop overhead Leakage Power Optimal FO4 O(N ) (1< < 2) Superlinear reduction of logic leakage power Vdd * e( Vdd) N DIBL effect
Increased control complexity insufficient setup time of clock enable signal Between leakage power scaling and flip-flop switching power scaling depending on leakage level
Idle Power
Optimal FO4
ative Power
O(1/N)
1/N * Vdd2
70(LVT)~ 75(HVT)% 6
55(HVT)~ 70(LVT)% 8
N = Number of N* = Optimal N FO4 inverters per stage (Not including flip-flop delay)
relative power
relative power
activity factor
activity factor
relative power
relative power
Switching Power
Switching Power
activity factor
activity factor
relative power
relative power
LVT
activity factor activity factor
Discussion
LVT can be fast and power-efficient
o
Flip-flop delay more important than flip-flop power for power-optimal pipelining
Conclusion
Pipelining is an effective low-power tool when used to support voltage scaling in digital system implementing highly parallel computation. Optimal Logic Depth: 6-8 FO4
o
~ 8-10 FO4 including flip-flop delay It depends on Vth, AF, Clock-Gating Pipelining is more effective with High AF
Pipelining is most effective at saving switching power
Insights:
o o o
Acknowledgments
Thanks to SCALE group members and anonymous reviewers Funded by NSF CAREER award CCR0093354, NSF ITR award CCR-0219545, and a donation from Intel Corporation.
BACKUP SLIDES