Outline
MOTIVATION
OVERVIEW OF ERROR-RESILIENT PARADIGMS
APPROXIMATE ARITHMETIC CIRCUITS
Approximate adders
Approximate multipliers
Metrics for approximate computing
Quality-energy optimal designs
Approximate logic synthesis
Embedded Tutorial, ETS’13, Avignon, France. Please use with attribution to the authors. © 2013 Han and Orshansky
2014/6/25
Motivation
Energy efficiency is of paramount concern in digital system design
Computing is increasingly dominated by media processing (audio, video, graphics, and image), recognition, and data mining
A common characteristic: a perfect result is not necessary, and an approximate or less-than-optimal result is sufficient
Signal processing: image, video, speech
o Human perception is not sensitive to high-frequency changes
o Natural noise floor due to quantization noise
Search, machine learning, data mining
Optimization
Error-Resilient Paradigms
Probabilistic Computing
A proposal for using physically unreliable devices
Probabilistic switches and probabilistic Boolean logic, developed from a thermodynamic perspective
Probabilistic CMOS (PCMOS) family of circuits
A recent philosophical introduction in [Palem12]
Approximate Computing
Employs deterministic designs that produce imprecise results
[Figure: transistor-level full adder schematic (Vdd, GND; inputs A, B, Cin; outputs Sum, Cout)]
[Figure: (a) Inaccurate 2 × 2 multiplier. Truth table for the approximate 2 × 2 multiplier.]
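The truth table itself is not legible in this copy of the slides. As a minimal sketch, assuming the inaccurate 2 × 2 multiplier is the commonly cited design that is exact for every input pair except 3 × 3, which it reports as 7 (binary 111) so that the product fits in three bits, the table can be regenerated as follows. The function names are illustrative, and the design on the slide may differ.

```python
def approx_mul2x2(a, b):
    """Assumed inaccurate 2x2 multiplier: exact except 3 * 3, which returns 7."""
    if not (0 <= a <= 3 and 0 <= b <= 3):
        raise ValueError("operands must be 2-bit values (0..3)")
    return 7 if (a == 3 and b == 3) else a * b


def truth_table():
    """Rows of (a, b, exact product, approximate product) for all 16 inputs."""
    return [(a, b, a * b, approx_mul2x2(a, b))
            for a in range(4) for b in range(4)]


if __name__ == "__main__":
    print(" a  b  exact approx")
    for a, b, exact, approx in truth_table():
        flag = "  <- differs" if exact != approx else ""
        print(f"{a:02b} {b:02b} {exact:04b}  {approx:03b}{flag}")
```

Under this assumption only one of the 16 input pairs is wrong, and every approximate product fits in three output bits instead of four.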
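[Figure: error distance example. Output 101 vs. 100 (1 bit differs, error distance 1) and vs. 001 (1 bit differs, error distance 4)]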
Normalized error distance (NED) vs. the number of approximate bits in an adder.
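To make the trend in this plot concrete, the following sketch computes the error rate, mean error distance (MED), and NED exhaustively for a small adder. Several things here are assumptions rather than content from the slides: the adder (it ORs the k low bits and adds the remaining bits exactly, with no carry between the two parts), the function names, and the normalization constant, which is taken to be the largest value an (n+1)-bit sum can represent.

```python
from itertools import product


def approx_add(a, b, k):
    """Hypothetical approximate adder: OR the k low bits, add the rest exactly."""
    mask = (1 << k) - 1
    low = (a | b) & mask                   # approximate part, no carries
    high = ((a >> k) + (b >> k)) << k      # exact part, no carry-in from below
    return high | low


def error_metrics(n, k):
    """Return (error rate, MED, NED) over all pairs of n-bit operands."""
    total_ed = 0
    wrong = 0
    pairs = 0
    for a, b in product(range(1 << n), repeat=2):
        ed = abs(approx_add(a, b, k) - (a + b))   # error distance
        total_ed += ed
        wrong += ed != 0
        pairs += 1
    med = total_ed / pairs
    d_max = (1 << (n + 1)) - 1             # assumed normalization constant
    return wrong / pairs, med, med / d_max


if __name__ == "__main__":
    for k in range(5):
        rate, med, ned = error_metrics(8, k)
        print(f"k={k}: error rate={rate:.3f}  MED={med:.3f}  NED={ned:.5f}")
```

For this particular adder, NED grows with the number of approximate bits k, which is the qualitative behavior the figure describes.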
Power and precision tradeoffs: the product of power per bit and NED is shown by a dashed curve.
The arrow points in the direction of better designs, i.e., a more efficient power-accuracy tradeoff.
Timing budget = k
Under a reduced budget, some bits do not have access to the primary carry-in
In a timing-starved adder (TSA), the carry actually accessible depends on the previous cycle and is unknown
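The effect of a limited timing budget can be sketched with a simple behavioral model. This is an illustration under stated assumptions, not the formal model from the tutorial: the budget k is interpreted as the number of lower bit positions through which a carry can ripple before the sum bit is sampled, the primary carry-in is 0, and the unknown carry entering a truncated chain is a parameter (`unknown_carry`, a hypothetical name) instead of being derived from the previous cycle.

```python
def timing_starved_add(a, b, width, k, unknown_carry=0):
    """Behavioral sketch: each sum bit sees a carry chain of at most k positions."""
    result = 0
    for i in range(width):
        start = max(0, i - k)
        # Bits within k positions of bit 0 still see the primary carry-in (0);
        # higher bits start from a carry whose true value is unknown.
        carry = 0 if start == 0 else unknown_carry
        for j in range(start, i):
            aj, bj = (a >> j) & 1, (b >> j) & 1
            carry = (aj & bj) | ((aj ^ bj) & carry)   # generate or propagate
        s = ((a >> i) & 1) ^ ((b >> i) & 1) ^ carry
        result |= s << i
    return result


if __name__ == "__main__":
    a, b, width = 0b0111, 0b0001, 8
    print("exact           :", (a + b) & ((1 << width) - 1))
    print("k=2, carry as 0 :", timing_starved_add(a, b, width, 2))
    print("k=2, carry as 1 :", timing_starved_add(a, b, width, 2, unknown_carry=1))
    print("k=8 (full chain):", timing_starved_add(a, b, width, 8))
```

With k = 2, the carry generated at bit 0 cannot reach bit 3, so the result depends entirely on what value the unknown carry happens to hold.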
[Figure: FIC and AFIC]
[Figure: quality results in dB (values between 28.9 dB and 48.4 dB); plot layout not recoverable]
Alignment of segments means that effective carry chains are shortened for the sum bits below the MSB
This increases the probability of individual errors and thus of overall error
For quadratic quality metrics, minimizing error magnitude is more important than minimizing error probability
As long as the timing budget exceeds 4 bits, AFIC is better than FIC
[Figure: two plots comparing adder designs; legend labels: RCA, CLA, KS, CUB, Exact, Opt., Min., Opt. h=11; axis names not recoverable]
Prior work
Approximate logic synthesis (ALS) with error frequency constraint only [Shin10, Palem10]
o High runtime for large error frequencies
o Does not consider error magnitude
ALS with error magnitude constraint only [Miao12, Roy12]
o Based on unmodified conventional logic synthesis flow
o Does not consider error frequency
Our contributions:
Identify the isomorphism between ALS with unconstrained error frequency and the Boolean relation (BR) minimization problem
Develop an efficient algorithm that refines the error-magnitude-constrained solution to satisfy the error frequency constraint
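The two constraint types measure different properties of an approximate function, which the following sketch illustrates by exhaustive enumeration. It is only a checker for the two metrics, not a synthesis algorithm, and the example functions (`exact`, `approx_small`, `approx_rare`) are hypothetical.

```python
def error_profile(exact, approx, n_inputs):
    """Return (error frequency, maximum error magnitude) over all input vectors."""
    diffs = [abs(exact(x) - approx(x)) for x in range(1 << n_inputs)]
    frequency = sum(d != 0 for d in diffs) / len(diffs)
    return frequency, max(diffs)


def satisfies(exact, approx, n_inputs, max_frequency, max_magnitude):
    """Check both constraints at once."""
    frequency, magnitude = error_profile(exact, approx, n_inputs)
    return frequency <= max_frequency and magnitude <= max_magnitude


def exact(x):
    """Reference function: sum of two 2-bit fields packed into a 4-bit input."""
    return (x >> 2) + (x & 3)


def approx_small(x):
    """Frequent but small errors: the least significant output bit is dropped."""
    return exact(x) & ~1


def approx_rare(x):
    """Rare but large error: wrong on a single input vector only."""
    return 0 if x == 15 else exact(x)


if __name__ == "__main__":
    print("small errors:", error_profile(exact, approx_small, 4))
    print("rare error  :", error_profile(exact, approx_rare, 4))
```

A frequency-only constraint would prefer `approx_rare` despite its error of magnitude 6, while a magnitude-only constraint would accept `approx_small` even though it is wrong on half of the inputs. This is why both constraints are needed.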
[Figure: row coefficients processed through Level 1, Level 2, and Level 3]
Adaptive Voltage-Overscaling
Conventional design aims to ensure timing correctness
Timing-error acceptance philosophy
Worst-case timing is not guaranteed
Prevent severe quality loss
Principles
(1) Controlling large-magnitude errors in operations by exploiting knowledge of the statistics of operands
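One way to see why operand statistics matter is to look at carry-chain lengths, on the assumption (not stated on this slide) that timing errors under voltage overscaling come from the longest carry chains and therefore hit the most significant bits. The sketch below measures how far a generated carry travels in a two's-complement addition; the function name and the 16-bit width are illustrative.

```python
def longest_carry_chain(a, b, width):
    """Longest run of positions a generated carry propagates through."""
    longest = 0
    run = None                      # None: no carry is currently travelling
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if ai & bi:                 # generate: a new carry starts here
            run = 0
        elif ai ^ bi:               # propagate: a live carry moves one bit up
            if run is not None:
                run += 1
                longest = max(longest, run)
        else:                       # kill: any live carry stops here
            run = None
    return longest


if __name__ == "__main__":
    width = 16
    mask = (1 << width) - 1
    print("5 + 3    :", longest_carry_chain(5, 3, width))
    print("5 + (-3) :", longest_carry_chain(5, -3 & mask, width))
```

Adding a small positive and a small negative operand sign-extends into a carry chain that reaches the MSB, so operands that are small in magnitude but mixed in sign are exactly the case where a timing error would be large.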
[Figure: histograms of frequency vs. magnitude, including Coeff(3, 3) and Coeff(5, 5); other panel labels ("width adder", "components distribution") are incomplete]
[Figure: datapath with truncate and sign-extension logic (24 bits); PSNR (dB) vs. Energy (J), unoptimized vs. variable-bitwidth]
[Figure: datapath; PSNR (dB) vs. Energy (J), unoptimized vs. reordering]
[Figure: PSNR (dB) vs. Energy (J) for unoptimized, (1) remapping, (2) reordering, (3) rescheduling, and combined]
Summary
Review of error-permissive paradigms: stochastic, probabilistic, and approximate computing
Approximate arithmetic circuits (adders, multipliers)
Metrics for approximate computing
Introduced a formal model for timing-starved addition
Conclusions
Design considerations
Identification of algorithm phases that allow errors
o E.g. control vs. data, error tolerant vs. critical computations
Identification of relevant error metric
o Error frequency vs. magnitude
Open question:
When will approximate computing principles be used in practice?