is efficient for optimizing the required number of additions. While multiplications with a constant value are commonly performed using shifts and additions, research on the joint optimization of multiple constants is relatively new. Potkonjak [4] presents a generalized system named multiple constant multipliers (MCMs). This approach can be applied to any problem where an unknown is multiplied by more than one constant value, and can optimize for both the number of additions and the number of required shifts. Feher [6] concentrates on generating vector product circuits for FPGAs, and applies the techniques to the two-dimensional DCT. Feher's work implements only bit-serial designs, which simplifies the problem of high-fanout nodes but, because of the necessary parallel-serial conversion, is of interest only when the application can afford a low throughput. Furthermore, Feher's algorithm utilizes a pairwise greedy approach, which provides poor results for many problem cases. Additional work includes recoding for multiple coefficients [5] and optimizations for DSP architectures [3]. These techniques, however, do not exploit the full potential of FPGAs, especially with regard to run-time reconfiguration (RTR).
3 Constant Multiplier Trees
For this paper we will apply the MCMT technique to problems involving the vector product

$$Y = \sum_{k=0}^{K-1} A_k X_k, \tag{1}$$

where $X$ is an unknown vector, but the vector $A$ is known at compile time. Although we develop a methodology based on this simple vector product, the same techniques are applicable to any multiple constant multiplication problem.

We begin by decomposing the vector product in a manner that takes advantage of our a priori knowledge of the constant vector $A$. Specifically, we write
$$A_k = \sum_{l=0}^{L-1} a_{lk}\, 2^l,$$
where $a_{lk}$ is a single bit and $L$ is the precision of the coefficients. We assume for simplicity that $A_k$ is a positive integer, but the derivation can be applied to signed fixed-point systems as well. Substituting this into (1) gives
$$Y = \sum_{k=0}^{K-1} X_k \left( \sum_{l=0}^{L-1} a_{lk}\, 2^l \right),$$
and after exchanging the order of summations, we have
$$Y = \sum_{l=0}^{L-1} \left( \sum_{k=0}^{K-1} X_k\, a_{lk} \right) 2^l. \tag{2}$$

Note that $a_{L-1,k}$ denotes the most significant bit of the coefficient $A_k$.

There are some important features of (2) that should be noted. First, no hardware multipliers are needed, since $a_{lk}$ takes on values 0 or 1, leaving only shifts to implement the $2^l$ multiplication. Second, the decomposition of $A$ into bit-planes by equation (2) allows for simple exploitation of the symmetries and common subexpressions contained in the coefficient data. Moreover, since $a_{lk}$ is known at compile time, we perform only the additions that are necessary during run time.

In order to generate trees from a vector product, it is useful to represent (2) in the following manner:
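$$\begin{bmatrix} Y_0 \\ Y_1 \\ \vdots \\ Y_{L-1} \end{bmatrix} = \begin{bmatrix} a_{0,0} & a_{0,1} & \cdots & a_{0,K-1} \\ a_{1,0} & \cdots & \cdots & \cdots \\ \vdots & & & \vdots \\ a_{L-1,0} & \cdots & \cdots & a_{L-1,K-1} \end{bmatrix} \begin{bmatrix} X_0 \\ X_1 \\ \vdots \\ X_{K-1} \end{bmatrix},$$

with the desired result

$$Y = Y_0 + Y_1 2^1 + \cdots + Y_{L-1} 2^{L-1}.$$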
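The bit-plane form of (2) can be illustrated in software. The following is a minimal sketch, assuming Python integers stand in for hardware words; the function names `bit_planes` and `vector_product_shift_add` are illustrative choices and do not come from the paper. It forms each partial sum $Y_l$ by adding only the inputs whose coefficient has bit $l$ set, then combines the partial sums with shifts.

```python
def bit_planes(coeffs, precision):
    """Return the binary coefficient matrix as a list of rows.

    Row l holds bit l of every coefficient, so row 0 contains the
    least significant bits and row precision-1 the most significant.
    """
    return [[(a >> l) & 1 for a in coeffs] for l in range(precision)]


def vector_product_shift_add(coeffs, xs, precision):
    """Evaluate Y = sum_l (sum_k a_lk * X_k) * 2**l with adds and shifts only."""
    total = 0
    for l, row in enumerate(bit_planes(coeffs, precision)):
        # Partial sum Y_l: add only the inputs selected by a 1 bit.
        y_l = sum(x for bit, x in zip(row, xs) if bit)
        # The multiplication by 2**l is a left shift.
        total += y_l << l
    return total


# Illustrative check against the direct vector product (values are arbitrary).
coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
xs = [3, -1, 4, 1, -5, 9, 2, 6]
direct = sum(a * x for a, x in zip(coeffs, xs))
assert vector_product_shift_add(coeffs, xs, precision=4) == direct
```

The sketch assumes non-negative integer coefficients, matching the simplifying assumption made for $A_k$ above; the inputs $X_k$ may be negative.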
Denote the coefficient matrix of $a_{lk}$ elements for output $Y$ as $a_Y$. This matrix will be the basis of all optimizations and transformations in this paper. Note that $a_Y$ consists of binary data only, with the top row containing the least significant bits of the $K$ elements of $A$, and the bottom row containing their most significant bits.

The product of each row of $a_Y$ with $X$ is computed using a binary tree; if a row of $a_Y$ contains one or more 0's, then the corresponding tree will be a pruned tree. Regardless, the $L$ trees $Y_0, Y_1, \ldots, Y_{L-1}$ are summed together using one final adder tree combined with shifts to produce the value of $Y$. For simplicity, our figures show a linearized version of this final tree; in practice a binary tree will often be used [3].

As an example used throughout this section, suppose we wish to compute
$$Y = X_0 + 2X_1 + 3X_2 + 4X_3 + 5X_4 + 6X_5 + 7X_6 + 8X_7.$$
In this case $A = [1\ 2\ 3\ 4\ 5\ 6\ 7\ 8]$, and $a_Y$ is

$$\begin{bmatrix} 1&0&1&0&1&0&1&0 \\ 0&1&1&0&0&1&1&0 \\ 0&0&0&1&1&1&1&0 \\ 0&0&0&0&0&0&0&1 \end{bmatrix}. \tag{3}$$
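The example matrix can be reproduced with a short sketch. The helper names `coefficient_matrix` and `adder_count` are hypothetical and not from the paper. The adder count rests on a stated assumption: a tree over $n$ selected inputs uses $n-1$ two-input adders, the final tree joins the nonzero rows, and nothing is shared between trees. It is a baseline for illustration only, not a result reported in the text.

```python
A = [1, 2, 3, 4, 5, 6, 7, 8]  # example coefficients from the text
L = 4                         # bits needed for the largest coefficient (8)


def coefficient_matrix(coeffs, precision):
    """Row l holds bit l of each coefficient (row 0 = least significant bits)."""
    return [[(a >> l) & 1 for a in coeffs] for l in range(precision)]


def adder_count(matrix):
    """Count two-input additions for pruned row trees plus the final tree.

    Assumption (ours): no subexpression sharing between rows.
    """
    nonzero_rows = [row for row in matrix if any(row)]
    # A row selecting n inputs needs n - 1 additions.
    row_adds = sum(sum(row) - 1 for row in nonzero_rows)
    # Combining r nonzero partial sums needs r - 1 additions.
    final_adds = max(len(nonzero_rows) - 1, 0)
    return row_adds + final_adds


a_Y = coefficient_matrix(A, L)
rows = ["".join(str(bit) for bit in row) for row in a_Y]
for r in rows:
    print(r)
print("additions without sharing:", adder_count(a_Y))
```

Under these assumptions the three rows with four 1's each need three additions, the last row needs none, and the final tree adds three more.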