This reproduction was made from a copy of a document sent to us for microfilming.
While the most advanced technology has been used to photograph and reproduce
this document, the quality of the reproduction is heavily dependent upon the
quality of the material submitted.
1. The sign or "target" for pages apparently lacking from the document
photographed is "Missing Page(s)". If it was possible to obtain the missing
page(s) or section, they are spliced into the film along with adjacent pages. This
may have necessitated cutting through an image and duplicating adjacent pages
to assure complete continuity.
3. When a map, drawing or chart, etc., is part of the material being photographed,
a definite method of "sectioning" the material has been followed. It is
customary to begin filming at the upper left hand corner of a large sheet and to
continue from left to right in equal sections with small overlaps. If necessary,
sectioning is continued again, beginning below the first row and continuing on
until complete.
5. Some pages in any document may have indistinct print. In all cases the best
available copy has been filmed.
University
Microfilms
International
300 N. Zeeb Road
Ann Arbor, MI 48106
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
8220419
Copyright 1982
by
Ahmed, Hassan Masud
All Rights Reserved
PLEASE NOTE:
In all cases this material has been filmed in the best possible way from the available copy.
Problems encountered with this document have been identified here with a check mark.
11. Page(s) ____________ lacking when material received, and not available from school or
author.
15. Other ____________
SIGNAL PROCESSING ALGORITHMS
AND
ARCHITECTURES
A DISSERTATION
OF STANFORD UNIVERSITY
DOCTOR OF PHILOSOPHY
Hassan M. Ahmed
June 1982
(c) Copyright 1982
by
Hassan M. Ahmed
I certify that I have read this thesis and that in my
opinion it is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
I certify that I have read this thesis and that in my
opinion it is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
I certify that I have read this thesis and that in my
opinion it is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
ABSTRACT
factorization.
Further speed enhancements through the use of a newly developed method
the algorithms.
mentioned above.
ACKNOWLEDGEMENTS
interactions on my research, I am very grateful.
a friend.
for those, as well as for his speedy review, which allowed me to "run off to
of stimulating discussions are integral to those chapters. Valuable
The lab secretaries, especially Barbara, Rachel, Kathy, Charlotte, Mieko
roommates) Rich Baker and Peter Glynn.
Corporation, particularly Dr. G.D. Forney and Dr. S.U.H. Qureshi have
experience in a lifetime.
TABLE OF CONTENTS
Chapter
1. INTRODUCTION
BIBLIOGRAPHY
BIBLIOGRAPHY
APPENDICES
BIBLIOGRAPHY
4. NUMERICAL ALGORITHMS
5. PARALLEL PROCESSORS FOR LINEAR ALGEBRA
6.5 THE MICRO-CONTROLLER
LIST OF FIGURES
Figure
4.1 Rotation in Generalized Coordinate Systems
5.17 Traversal Ordering
CHAPTER ONE
INTRODUCTION
computing machine on a single chip. The currently emerging Very Large
than multiplication.
access memories (RAM's)). The first and third items have a profound impact
literature (see [Ku77] for a good survey), however the VLSI technology has
integrated circuit is much sought after.
algorithms which may be readily implemented. The study of algorithms
out to be a much larger set than the usual multiply and accumulate
primitives common in today's signal processors. Chapter Four is devoted to
BIBLIOGRAPHY
Manual, 1978.
Reference Manual, 1979.
[Bo80] J.R. Boddie, G.T. Daryanani, I.I. Eldumiati, R.N. Gadenz, J.S.
[Ch75] S.C. Chen, "Speedup of Iterative Programs in Multiprocessor
[De82] J.M. Delosme, "Algorithms for Finite Shift-Rank Processes," Ph.D
Algebra," SIAM Review, Vol. 20, No. 4, October 1978, pp. 740-776
[Ku79] H.T. Kung, "Let's Design Algorithms for VLSI," Proc. of the First
[LM80] D.T.L. Lee and M. Morf, "Recursive Square-Root Ladder Estimation
Wesley, 1980
Engineering, 1974.
1979.
[SK75a] A.H. Sameh, D. Kuck, "Linear System Solvers for Parallel
CHAPTER TWO
common of all frequency domain algorithms. Its wide applicability in areas
whiteners [VT68]. These algorithms enjoy as wide applicability as the DFT.
$$X(k) = \begin{cases} \sum_{n=0}^{N-1} x(n)\, W_N^{nk} & 0 \le k \le N-1 \\ 0 & \text{otherwise} \end{cases}$$
where
$$W_N = e^{-j2\pi/N}$$
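As an illustration only, the DFT definition can be evaluated directly from the sum. The sketch below assumes the standard convention $W_N = e^{-j2\pi/N}$; the function name `dft` is hypothetical and this direct form is an $O(N^2)$ reference, not one of the fast algorithms discussed in this chapter.

```python
import cmath

def dft(x):
    """Direct O(N^2) evaluation of X(k) = sum_n x(n) * W_N**(n*k)."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)  # W_N, assumed sign convention
    return [sum(x[n] * W ** (n * k) for n in range(N)) for k in range(N)]

if __name__ == "__main__":
    # A constant sequence concentrates all of its energy in the k = 0 bin.
    print([round(abs(v), 6) for v in dft([1, 1, 1, 1])])
```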
determining the process of new (or unpredictable) information contained in
delay elements, gains and summing junctions. The filter coefficients are
adjusted to yield the forward and backward prediction errors defined as:
$$r_{n,T} = y_{T-n} - \mathrm{llse}\left(y_{T-n} \mid \{y_k\}_{k=T-n+1}^{T}\right)$$
data.
$$\varepsilon_{0,T} = r_{0,T} = y_T$$
a _ %a . sn.T*1rn.T
“ n+1.7+1 _ * &n+l .T + “
7r.-l.T
_ Ar+1.7
^ " J#.T
IV - _ An+1.7*
-Kn.T-1
DC
Xn.T+l —
~ X f?z +
Kn.T 4-
_
1
7 » -l .T
Kk.T+1 = x Kk.T + T
4 '7+1
7 n - i.r
^n+l.r
7 i» « .r = 7 n .r -
f i.r
where
$\gamma_{n,T}$ is a likelihood variable of nth order
Order Updates:
$$\nu_{n+1,T} = \frac{\nu_{n,T} - \rho_{n+1,T}\,\eta_{n,T-1}}{\sqrt{1-\rho_{n+1,T}^2}\,\sqrt{1-\eta_{n,T-1}^2}}$$
$$\eta_{n+1,T} = \frac{\eta_{n,T-1} - \rho_{n+1,T}\,\nu_{n,T}}{\sqrt{1-\rho_{n+1,T}^2}\,\sqrt{1-\nu_{n,T}^2}}$$
Time Updates:
R t - R t - i [ X + h$ ]“*
Vt
V o .T - vo. t = —j -m-
"v R t - i
the last three equations.
BIBLIOGRAPHY
[CT65] J.W. Cooley, J.W. Tukey, "An Algorithm for the Machine Calculation
297-301
[De74] A.M. Despain, "Fourier Transform Computers Using CORDIC
Iterations," IEEE Trans. Comput., Vol. C-23, Oct. 1974, pp. 993-1001.
Hardware Implementation," IEEE Trans. Computers, Vol C-28,
[LM80] D.T.L. Lee and M. Morf, "Recursive Square-Root Ladder Estimation
Hall, 1975
CHAPTER THREE
spectral representation.
modelled as time varying linear filters (Figure 3.1), the aggregation of which
pitch period.
considerable savings over direct storage of PCM coded samples as will be
[Figure 3.1: speech synthesis model; labels: impulse train, white noise, SYNTHESIS FILTER, speech, Reflection Coefficients.]
8 KHz utilizing 8 bit PCM would require 1280 bits! Clearly, the use of LPC in
not required in many applications.
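The PCM figure quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the 20 ms frame duration is an assumption (it is the duration for which 8 kHz, 8 bit PCM gives exactly 1280 bits; the surviving text does not state it), and `pcm_bits` is a hypothetical helper name.

```python
def pcm_bits(sample_rate_hz, bits_per_sample, frame_seconds):
    """Bits needed to store one frame of PCM-coded samples."""
    return round(sample_rate_hz * bits_per_sample * frame_seconds)

FRAME_SECONDS = 0.020  # assumed frame duration, not stated in the surviving text

if __name__ == "__main__":
    print(pcm_bits(8000, 8, FRAME_SECONDS))
```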
3.1.2 Speech Analysis
[Figures: speech synthesizer block diagram (labels: Memory, Synthesizer, Controller, speech) and analysis filter diagram (labels: ANALYSIS FILTER, speech, impulse train (pitch), white noise (power), Reflection Coefficients).]
This thesis will primarily address the first issue. Item (3) is of course a
Remark:
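$$\rho_{n+1,T} = \sqrt{1-\nu_{n,T}^2}\,\sqrt{1-\eta_{n,T-1}^2}\;\rho_{n+1,T-1} + \nu_{n,T}\,\eta_{n,T-1}$$
$$\nu_{n+1,T} = \frac{\nu_{n,T} - \rho_{n+1,T}\,\eta_{n,T-1}}{\sqrt{1-\rho_{n+1,T}^2}\,\sqrt{1-\eta_{n,T-1}^2}}$$
$$\eta_{n+1,T} = \frac{\eta_{n,T-1} - \rho_{n+1,T}\,\nu_{n,T}}{\sqrt{1-\rho_{n+1,T}^2}\,\sqrt{1-\nu_{n,T}^2}}$$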
hardware (e.g. array multipliers) or requiring considerable execution time
implementation of the ladder filter.
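To make the cost of one ladder stage concrete, the three normalized recursions (for $\rho_{n+1,T}$, $\nu_{n+1,T}$ and $\eta_{n+1,T}$) can be written out in floating point. This is a minimal sketch under the assumption that the recursions have the standard normalized-ladder form; the names `comp` and `order_update` are hypothetical, and a direct evaluation like this uses exactly the multiplications, divisions and square roots that the text describes as costly.

```python
import math

def comp(x):
    """Complement sqrt(1 - x^2) of a normalized variable with |x| < 1."""
    return math.sqrt(1.0 - x * x)

def order_update(rho_prev, nu, eta_prev):
    """One stage: (rho_{n+1,T-1}, nu_{n,T}, eta_{n,T-1}) -> (rho_{n+1,T}, nu_{n+1,T}, eta_{n+1,T})."""
    rho = comp(nu) * comp(eta_prev) * rho_prev + nu * eta_prev
    nu_next = (nu - rho * eta_prev) / (comp(rho) * comp(eta_prev))
    eta_next = (eta_prev - rho * nu) / (comp(rho) * comp(nu))
    return rho, nu_next, eta_next

if __name__ == "__main__":
    print(order_update(0.5, 0.3, 0.4))
```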
For notational convenience, let
$$\rho = \rho_{n+1,T-1}, \qquad \rho_+ = \rho_{n+1,T} \qquad (3.1a)$$
$$\nu = \nu_{n,T}, \qquad \nu_+ = \nu_{n+1,T} \qquad (3.1b)$$
$$\nu_+ = \frac{\nu - \rho_+\,\eta}{\sqrt{1-\rho_+^2}\,\sqrt{1-\eta^2}} \qquad (3.2b)$$
Now, observe th a t:
(2) Let
if v
V = (3.3)
—v i f *- [ 58 * - [ S 7
where V and N are orthogonal or 2x2 rotations. Then the matrix
product VAN is
if V 7f -7 7 i f r f p + vr\ i n f - p r\if
VAN = (3.4)
—V i f Ikd °1
iJ 77 77c i f 77 —p v r f i f r f + pvrj
p+ ' 1/ .
d o n 't (3.5)
V* care
I/. T}.
an d [1/+ 77+] = [1 0 ] /? (3.6a)
0 0
p r
w here R (3.6b)
0
-1 0
0 1,
through and It is interesting to note the fundamental nature of
3.2 ADAPTIVE EQUALIZATION
[LSW65]. A compromise equalizer is basically a fixed transfer function, high
Z* = E c*Tk -n
n
cA+1 = - Aen
w here
[Figure: channel model; labels: Modulator, Channel, Disturbances, Demod, Equalizer.]
and Pack [SP80] are presented here.
and let
with
$= 0 \quad \forall\, i > n.$
[Figure: convergence curves for ALCE, LSALE and the gradient algorithm against NUMBER OF ITERATIONS, with the OPTIMUM level marked; two panels, for channel-correlation matrix eigenvalue ratios of 11 and 21.]
are quite simple when the additive noise is white and has a circularly
with a detector which is optimal for white noise. This latter technique, losing
clocks at equal frequencies, however at different phases. Transitions in
presented here.
and will be presented elsewhere. However, the simple example of timing
of the input process, $y_n$ is:
P y (s ) = (27r|2 |) n / 3 exP
Proposition 3.1: The statistic, $T(y) = y^T \Sigma^{-1} y$ is sufficient for the family
»«<nr» = (-K W i
* (« -
[Le59].
[Figure: binary waveform V(t) with level +V and bit pattern 1 1 1 1 / 0 0 0 0; block labels: Modulator, channel, Gaussian Noise, Demodulator.]
P r o p o s itio n 3.2: y n = y TZ n ly = r ^ E ^ r ^ , (y n = 1 - y n )
w here:
7 n = 1 “ 7n
Proof: This was shown by Lee and Morf [LM80], assuming that the sample
samples.
Proof: Under the conditions stated,
r k .T - N(QM)
and
E [ r k .T Tl.T \ = 0 V k *1
-i
0 fl
R l
0
' T rn
_ y r i.T
i=i
but
~ 1) -» 7n.T ~ Xn
transition.
Under the conditions stated,
Tt.T N[0,Rif)
an d
E [rk j t i t ] = 0 Vk
ri.T
= 2
i=l Ri.T
but
n.r 0 , 1)
N { 7n .T ~ Xn
V r ~7
transition.
At = i = t - tg
Proof:
b u t a t t = t- + k
n.t ~ N { 0.1) V i *k
7 n .i ~ E Xi + X?(Ar)
n —1
= Xn(Aj)
probabilities are:
(3.9)
t=i
Po = i - n Pui (3.10)
i=i
$$\sqrt{2\gamma_n(0)} \sim N\left(\sqrt{2n-1},\ 1\right)$$
and
$$\sqrt{2\gamma_n(\lambda)} \sim N\left(\mu,\ \sigma^2\right)$$
with
— & V 2n - 1 (3.11)
n + X ' '
(3.12)
(3.13)
given by:
$$P_F = \mathrm{Prob}[\text{reject } H_0 \mid \lambda = 0] = \mathrm{erfc}\left(T - \sqrt{2n-1}\right) \qquad (3.14)$$
and
$$P_D = 1 - \mathrm{Prob}[\text{accept } H_0 \mid \lambda = \lambda_0 \neq 0] = 1 - \int_{-\infty}^{T} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\, dx = \mathrm{erfc}\left(\frac{T-\mu}{\sigma}\right) \qquad (3.15)$$
$$\mathrm{erfc}\, x = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$$
is the complementary error function.
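The threshold probabilities can be evaluated numerically. The sketch below assumes that the "complementary error function" used here is the unit-variance Gaussian tail probability (which differs from the `math.erfc` convention by a change of variable), and that the test statistic is approximately Gaussian with mean $\sqrt{2n-1}$ and unit variance under $H_0$. The function names, and the parameters `mu` and `sigma` for the mean and standard deviation under the alternative, are hypothetical.

```python
import math

def gaussian_tail(x):
    """(1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def false_alarm_probability(threshold, n):
    """Equation (3.14), assuming mean sqrt(2n - 1) and unit variance under H0."""
    return gaussian_tail(threshold - math.sqrt(2 * n - 1))

def detection_probability(threshold, mu, sigma):
    """Equation (3.15), with mu and sigma supplied by the caller."""
    return gaussian_tail((threshold - mu) / sigma)

if __name__ == "__main__":
    n = 10
    for T in (4.0, 5.0, 6.0, 7.0):
        print(T, false_alarm_probability(T, n))
```

Raising the threshold lowers the false-alarm probability monotonically, which is the trade-off the operating curves in this section explore.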
Remarks:
[Figures: detection performance curves; axes labelled "Probability of False Alarm ($P_F$)" and "Lambda (dB)".]
Simulations were also performed for PSK and FSK signals for various
transition, which provides a convenient way to apply Equations (3.9) and
even for low signal to noise ratios are quite profound. The effect of reducing
the order of the ladder is to increase the band of noise around (e.g
the noise spectrum.
[Figures 3.15 and 3.16 (continued over several panels) and following plots: "Signal + Noise" and ladder-filter variables plotted against time.]
tenth order filter.
APPENDIX A
are biased by the noise spectrum.
$$y_n = k\, y_{n-1} \qquad (A.1)$$
where
k = -1 at a transition
  = +1 everywhere else
[Pa65]:
_ E iVT Vt- i)
*l.T (A.2)
•T - E&/..
{ y T- i y T-i)
.. K y = Jfc = 1
K'i .t = 1.
i=l
b u t E{Enm
xyr-i) - 0 i = l,2 n w hich leads to
cu = 1
Oj = 0 z = 2, 3 n .
APPENDIX B
Let
$$u_T = \begin{cases} 1 & T < t_0 \\ -1 & T \ge t_0 \end{cases}$$
6? ~
Proposition 3.4 asserted that
1 i = l
KZ.T = K* t = ' 0 i = 2. 3 7i
BIBLIOGRAPHY
1960.
[Le59] E. Lehmann, Testing Statistical Hypotheses, J. Wiley and Sons,
1959.
[LM80] D. Lee, M. Morf, "A Novel Innovations Based Time Domain Pitch
Verlag, 1976.
856-859.
Engineering, 1974.
57.6.1 - 57.6.5
September, 1980.
CHAPTER FOUR
NUMERICAL ALGORITHMS
presently based on fast multiply and add circuits [AMI79] [Bell81]
plane rotations and hyperbolic (or J-) rotations on two dimensional vectors.
by
$$R = \sqrt{x^2 + m y^2} = \|(x, y)\|_S, \qquad S = \begin{bmatrix} 1 & 0 \\ 0 & m \end{bmatrix} \qquad (4.1)$$
$$\phi = \frac{1}{\sqrt{m}} \tan^{-1}\left(\frac{y\sqrt{m}}{x}\right) \qquad (4.2)$$
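The generalized radius and angle of (4.1) and (4.2) reduce to familiar functions in each of the three coordinate systems. The sketch below is illustrative, with a hypothetical function name, and assumes $x > 0$ (and $|y| < x$ for $m = -1$): for $m = 1$ the angle is the ordinary arctangent, for $m = 0$ the limit of (4.2) is $y/x$, and for $m = -1$ the identity $\tan^{-1}(jt) = j \tanh^{-1}(t)$ gives the hyperbolic arctangent.

```python
import math

def generalized_polar(x, y, m):
    """Return (R, phi) per (4.1)-(4.2) for m in {1, 0, -1}; assumes x > 0."""
    if m == 1:    # circular coordinate system
        return math.sqrt(x * x + y * y), math.atan(y / x)
    if m == 0:    # linear coordinate system (limit of (4.2) as m -> 0)
        return abs(x), y / x
    if m == -1:   # hyperbolic coordinate system, requires |y| < x
        return math.sqrt(x * x - y * y), math.atanh(y / x)
    raise ValueError("m must be 1, 0 or -1")

if __name__ == "__main__":
    for m in (1, 0, -1):
        print(m, generalized_polar(5.0, 3.0, m))
```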
4.1.
(4.3)
rotation. After 'n' iterations, the new radial and angular components of the
vector are
$$\phi_n = \phi_0 - \alpha \qquad (4.4a)$$
[Figure 4.1: Rotation in Generalized Coordinate Systems; panels m = 1, m = 0, m = -1; S = shaded area.]
$$R_n = R_0 K_m \qquad (4.4b)$$
where
$$\alpha = \sum_{i=0}^{n-1} \mu_i \alpha_i = \sum_{i=0}^{n-1} \mu_i\, m^{-1/2} \tan^{-1}\left(\delta_i \sqrt{m}\right) \qquad (4.5a)$$
$$K_m = \prod_{i=0}^{n-1} K_i = \prod_{i=0}^{n-1} \sqrt{1 + m\,\delta_i^2} \qquad (4.5b)$$
$$z_{i+1} = z_i - \mu_i \alpha_i \qquad (4.6)$$
(notice that $\alpha_i > 0$ with the sign of the rotation being chosen at each step
through $\mu_i$)
accumulates the net rotation,
$$z_n = z_0 - \sum_{i=0}^{n-1} \mu_i \alpha_i \qquad (4.7)$$
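A short floating point model may help fix ideas about how the angle accumulator steers the rotation. The sketch below covers only the circular case ($m = 1$) with the conventional shift sequence $\delta_i = 2^{-i}$; the sign convention of the micro-rotation and the name `cordic_rotate` are assumptions, and the scale factor $K_1$ is removed by a final division rather than by any of the scaling schemes discussed later in this chapter.

```python
import math

def cordic_rotate(x, y, z, n=24):
    """Rotate (x, y) by angle z (radians) using n CORDIC micro-rotations, m = 1."""
    K = 1.0
    for i in range(n):
        delta = 2.0 ** -i                  # shift-and-add step size
        alpha = math.atan(delta)           # elementary rotation angle alpha_i
        mu = 1.0 if z >= 0 else -1.0       # sign chosen to drive z toward 0
        x, y = x - mu * delta * y, y + mu * delta * x
        z -= mu * alpha                    # z_{i+1} = z_i - mu_i * alpha_i
        K *= math.sqrt(1.0 + delta * delta)
    return x / K, y / K, z                 # scale factor divided out at the end

if __name__ == "__main__":
    print(cordic_rotate(1.0, 0.0, 0.5))
    print(math.cos(0.5), math.sin(0.5))
```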
coordinate system of course) and this has reduced the number of
[Figure 4.2: The CORDIC Functions]
$$z_{i+1} = z_i - \mu_i \alpha_i$$
in the appendix.
[Figure: CORDIC function panels, including LINEAR (m = 0) and HYPERBOLIC (m = -1) modes with z -> 0 or y -> 0.]
$$|\phi_{i+1}| = \big|\,|\phi_i| - \alpha_i\,\big| \qquad (4.8)$$
The sum of the remaining rotations at each step must be large enough
'n' steps (This must be true even when $\phi_i = 0$ and $|\phi_{i+1}| = \alpha_i$). This
condition implies:
rotation, i.e.,
n-1
I* . I — 2 ^ 1
J=0
$$\rightarrow \max|\phi_0| = \alpha_{n-1} + \sum_{j=0}^{n-1} \alpha_j \qquad (4.10)$$
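The convergence requirement and the domain bound (4.10) are easy to check numerically for any candidate shift sequence. The sketch below assumes the condition takes Walther's form, $\alpha_i \le \alpha_{n-1} + \sum_{j=i+1}^{n-1} \alpha_j$ for every $i$, and the circular case $\alpha_i = \tan^{-1} 2^{-F_i}$; the function names are hypothetical.

```python
import math

def rotation_angles(shifts):
    """alpha_i = atan(2^-F_i) for a sequence of shift amounts F_i (m = 1)."""
    return [math.atan(2.0 ** -s) for s in shifts]

def satisfies_convergence(alphas):
    """Check alpha_i <= alpha_{n-1} + sum_{j>i} alpha_j for every i."""
    last = alphas[-1]
    return all(a <= last + sum(alphas[i + 1:]) + 1e-15
               for i, a in enumerate(alphas))

def max_input_angle(alphas):
    """Domain of convergence per (4.10): alpha_{n-1} + sum of all alpha_j."""
    return alphas[-1] + sum(alphas)

if __name__ == "__main__":
    standard = rotation_angles(range(16))
    print(satisfies_convergence(standard))
    print(math.degrees(max_input_angle(standard)))
```

A sequence with a gap, such as shifts 0, 3, 4, fails the check because the later rotations cannot make up for an overshoot by the first one.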
proceed as follows:
Lemma 4.1 [Walther]:
Proof: By induction
$$|\phi_i| < \alpha_{n-1} + \sum_{j=i}^{n-1} \alpha_j$$
$$|\phi_i| - \alpha_i < \alpha_{n-1} + \sum_{j=i+1}^{n-1} \alpha_j$$
$$-\alpha_i < |\phi_i| - \alpha_i < \alpha_{n-1} + \sum_{j=i+1}^{n-1} \alpha_j$$
i.e.,
$$\big|\,|\phi_i| - \alpha_i\,\big| < \alpha_{n-1} + \sum_{j=i+1}^{n-1} \alpha_j$$
$$|\phi_{i+1}| < \alpha_{n-1} + \sum_{j=i+1}^{n-1} \alpha_j$$
Remarks
convergence as $\phi$, i.e.,
$$\max|z_0| = \max|\phi_0| = \alpha_{n-1} + \sum_{j=0}^{n-1} \alpha_j$$
$\phi = \tan^{-1} y_0/x_0$ is to be computed, then the result will be if
readily performed with the same CORDIC block however, clearly a large
application.
-v -’ _ 1 "*■ 7 t 2 i 0 „ fz.ii')
1+1 0 l + jiZ ~ Fi (4 1 1 )
cycle to move $\|X_{i+1}\|$ towards unity, while $\gamma_i = 0$ during a cycle in which no
scaling is to be performed.
(4.12)
at each iteration.
required. This nonuniformity of execution speeds is quite a nuisance in a
additional hardware. By substituting (4.3) for $X_{i+1}$ in (4.11) and dropping
Expanding out for the vector components yields:
- 84 -
b e e n em ployed).
of th e trig o n o m e tric functions (i.e., m = 1) since a d esirab le, finite dom ain,
co n vergence. Now
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
- 85 -
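$$\sin\phi = \sin\left(Q\frac{\pi}{2} + R\right) = \begin{cases} \sin R & \text{if } Q \bmod 4 = 0 \\ \cos R & \text{if } Q \bmod 4 = 1 \\ -\sin R & \text{if } Q \bmod 4 = 2 \\ -\cos R & \text{if } Q \bmod 4 = 3 \end{cases}$$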
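The quadrant selection can be sketched in a few lines. In the example below the argument is split as $\phi = Q\pi/2 + R$ with $|R| \le \pi/4$, and the library `math.sin` and `math.cos` stand in for the CORDIC evaluation of the reduced angle $R$; the function name and the choice of rounding $Q$ to the nearest integer are assumptions made for illustration.

```python
import math

def sin_by_quadrant(phi):
    """Evaluate sin(phi) from the reduced angle R and the quadrant index Q."""
    half_pi = math.pi / 2
    Q = math.floor(phi / half_pi + 0.5)   # nearest multiple of pi/2
    R = phi - Q * half_pi                 # reduced angle, |R| <= pi/4
    # math.sin / math.cos stand in for a CORDIC rotation through R.
    table = (math.sin(R), math.cos(R), -math.sin(R), -math.cos(R))
    return table[Q % 4]                   # Python's % is non-negative here

if __name__ == "__main__":
    for phi in (-7.0, 0.3, 2.0, 10.0):
        print(phi, sin_by_quadrant(phi), math.sin(phi))
```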
These prerotations are relatively simple, requiring only a magnitude
again incurred.
For example, Walther's CORDIC processor, which combines his two
and normalization!
only modest speed overhead.
ROUTINE        CORDIC       PRESCALE,      DATA TRANSFERS   TOTAL
               EXECUTION    NORMALIZE,     FROM COMPUTER
                            MISC.
               (μsec)       (μsec)         (μsec)           (μsec)
LOAD             0            5             25               30
STORE            0            0             15               15
ADD              0           15             25               40
SUBTRACT         0           25             25               50
MULTIPLY        60           15             25              100
DIVIDE          60           15             25              100
SIN             70           85              5              160
COS             70           85              5              160
TAN            130           85              5              220
ATAN            70           15              5               90
SINH            70           55              5              130
COSH            70           55              5              130
TANH           130           55              5              190
ATANH           70           45              5              120
EXPONENTIAL     70           55              5              130
LOGARITHM       70           45              5              120
SQUARE-ROOT     70           25              5              100
Claim 4.1:
Proof:
m _= .1.:
Ki = n VTTs?
<=0
7*= 1
^ 2j ^ (Jen sen ’s Inequality)
i=0
< lim ^ 4 ^ = 4 /3
Claim 4.2:
Proof:
$$\max|\phi_0| = \alpha_{n-1} + \sum_{j=0}^{n-1} \alpha_j \qquad \text{(from Equation 4.10)}$$
= 2.785 < 77
L em m a 4.2:
$Q . is c o n s tru c te d by re p e a tin g th e l tk e le m e n t of su ch th a t
Proof:
It is necessary to prove that:
Notice that:
$$\alpha'_i = \begin{cases} \alpha_i & i \le l \\ \alpha_{i-1} & i > l \end{cases} \qquad (4.15)$$
approximation $\alpha_{n-2} \approx 2\alpha_{n-1}$.
i >1:
n=l n=l
a 'i ~ 2 a '} = ai- 1“ 2
j'=i+l j= i+ l
n-2
= CXi- j - 2 a3
}= i
T t-l i n-1
a'i ~ 2 a 'j = ai - 2 ai + 2 «J- 1
3=1+1 j"=i+l 2=i+l
t n-2
= - 2 aj + 2 ai
j=i+l j=£
= at - eXj — at
3=i+1
T h eo rem 4.2:
Proof:
This theorem is very powerful because it guarantees that the multitude
of sequences generated in the particular manner suggested, do in fact
L em m a 4.3:
Proof:
It is necessary to prove:
CLi 2j j ^ » (4.15)
j=i+l
i > I:
j=i+l i= i+ i
^ &jj-1 — & 71
i<Z:
n I n
£Xi — ^ &j ~ 2 — 2 -1
j'= i+ l j = i+ l i= i+ l
£ 7 1 -1
= ai - £ *i ~ H ai
j=i+1 J=£
= « i ~ 1 2 a ; “ «£
j=i+i
T h e o rem 4.3:
P roof:
more efficient method.
$$\{F_i\} = \{\,0, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 8, 9\,\}$$
This yields $K_1 = 1.99$ and $\max|\phi_0| = 172.2°$ compared with $K_1 = 1.67$ and
l(kl = u . 1. 1. 1. 2. 2, 2, 3, 3, 4, 4, 6, 7, 8, 9, 10 I
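The quoted figures for a sequence with repeated shifts can be reproduced numerically. The sketch below takes the sixteen-element sequence with shifts 0, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 8, 9 and evaluates the scale factor $\prod_i \sqrt{1 + 2^{-2F_i}}$ and the sum of the elementary angles $\sum_i \tan^{-1} 2^{-F_i}$ for the circular case; the function names are hypothetical.

```python
import math

# Shift sequence with repeated elements (circular case, m = 1).
F = [0, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 8, 9]

def scale_factor(shifts):
    """K_1 = product over i of sqrt(1 + 2^(-2 F_i))."""
    return math.prod(math.sqrt(1.0 + 4.0 ** -s) for s in shifts)

def angle_sum_degrees(shifts):
    """Sum over i of atan(2^-F_i), in degrees."""
    return math.degrees(sum(math.atan(2.0 ** -s) for s in shifts))

if __name__ == "__main__":
    print(round(scale_factor(F), 2), round(angle_sum_degrees(F), 1))
```

Repeating the coarse rotations nearly doubles the angular domain relative to a sequence without repeats, at the price of a larger scale factor.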
theorem 4.2 to construct
$$= \{\,-1, 0, 1, 2, 2, 2, 3, 3, 3, 3, 4, 5, 6, 7, 8, 9\,\}$$
the accuracy of the scale factor.
$\{F_i\}$ must be met, since $\alpha'_n = \alpha_{n-1}$. In example 4.2, such a construction
iterations. This compares favourably with the some eleven scaling iterations
the algorithm even converges for large angles. The sequences of Figure 4.4
significant savings.
V ecto rin g -.2 5 0 0 .250 0 3.101 .250 3.1399 .0017
Rotation .20 -.1 .61 .221 .0326 0 .222 .0320 .0013 .0013
more efficient.
[AMLA81].
Remark:
determined solely by these few iterations, and its removal involves a division
iterative CORDIC algorithms. This section will explore the combination of
resolution
CORDIC iteration requires time $T_C$. The execution time, $E$, and storage
CORDIC only:
$E_C = n T_C$, $S_C = n$ locations
M ultiplier only:
Hybrid:
Then:
+1 TC Sn fc Ec TL
- 1 + 4 TU wM e W = ^ n - k + l + 4 T M/ T c
Example 4.3:
Let $n = 24$, $k = 15$ and $T_M/T_C = 2$. Then:
—— = 51277 and
Eu
Remarks:
1) The choice of 'k' can be optimized for the desired combination of
applied to other coordinate systems.
only the plane rotation case will be discussed in detail, however the
$n - k$).
Consider the vector rotation:
$$X_n = \begin{bmatrix} \cos z & -\sin z \\ \sin z & \cos z \end{bmatrix} X_0$$
xn (—l) fc (p2*
c o s <" = s , < M ~ -
significant, i.e.,
Step 1: Compute 'm' CORDIC iterations to obtain $X_m$ with a small
Step 2:
$$X_n = \begin{bmatrix} 1 & -\phi_m \\ \phi_m & 1 \end{bmatrix} X_m$$
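The two steps can be sketched in a few lines: a handful of CORDIC iterations followed by a single first-order (small angle) correction that would be carried out by the multiplier. This is only an illustrative sketch. The function name `hybrid_rotate`, the choice `m = 16`, the floating point arithmetic and the sign convention (the residual angle is taken as the leftover $z$, so that $\phi_m = -z$) are assumptions, not taken from the text.

```python
import math

def hybrid_rotate(x, y, z, m=16):
    """Rotate (x, y) through the angle z: m CORDIC steps, then one multiply step.

    Hypothetical helper for illustration; the sign convention is assumed.
    """
    gain = 1.0
    # Step 1: m circular CORDIC iterations leave a small residual angle in z.
    for i in range(m):
        sigma = 1.0 if z >= 0 else -1.0
        delta = sigma * 2.0 ** -i
        x, y = x + delta * y, y - delta * x
        z -= sigma * math.atan(2.0 ** -i)
        gain *= math.sqrt(1.0 + delta * delta)
    # Step 2: small angle approximation (cos ~ 1, sin ~ angle) on the residual.
    x, y = x + z * y, y - z * x
    return x / gain, y / gain
```

With `m = 16` the residual angle is at most `atan(2**-15)`, so dropping the cosine term costs a relative error of roughly half the square of that angle.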
Eh _ m Tc 1_
0 < m. < n
Eh 4 Tm 2
71
0 < m <7i
Eh 772. + 2 T ji/ T c
associative. Indeed, failure to recognize this fact would seriously compromise
that
i.e.,
some of the newer schemes, e.g., [Fa81], which employ truncated power
Example 4.4
storage is required.
Remark:
Elif 77L 4 2
—— = — ------ —— -which r e p re s e n ts a n ad ditional p e rfo rm a n c e
Hu 4 la
improvement.
multiplier or simply employ Theorem 4.3 to improve the resolution
will depend on the application.
bit parallel versus bit serial realizations are quite large (some details appear
in chapter six).
$$X = M_x\, b^{\,c_x + e} \qquad (4.17)$$
where
$X$ is a floating point number
$c_x$ is an integer characteristic
$b$ is the characteristic base
$$x_{i+1} = x_i + m\,\delta_i\, y_i \qquad (4.18a)$$
$$y_{i+1} = y_i - \delta_i\, x_i \qquad (4.18b)$$
$$z_{i+1} = z_i - \alpha_i \qquad (4.18c)$$
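As a concrete illustration of the iteration (4.18), the following sketch runs the circular case ($m = 1$) in rotation mode with ordinary floating point numbers. It assumes $\delta_i = \pm 2^{-i}$ and $\alpha_i = \pm\tan^{-1} 2^{-i}$, with the sign chosen to drive $z$ toward zero; the function name `cordic_rotate`, the iteration count and the explicit removal of the scale factor are illustrative choices, not the dissertation's implementation.

```python
import math

def cordic_rotate(x, y, z, n=32):
    """Circular (m = 1) CORDIC in rotation mode, following the form of (4.18).

    Hypothetical helper: delta_i = +/- 2**-i and alpha_i = +/- atan(2**-i).
    """
    gain = 1.0
    for i in range(n):
        sigma = 1.0 if z >= 0 else -1.0      # drive z toward zero
        delta = sigma * 2.0 ** -i
        alpha = sigma * math.atan(2.0 ** -i)
        x, y = x + delta * y, y - delta * x  # (4.18a), (4.18b) with m = 1
        z -= alpha                           # (4.18c)
        gain *= math.sqrt(1.0 + delta * delta)
    # Dividing by the accumulated gain removes the CORDIC scale factor.
    return x / gain, y / gain, z
```

Under these assumptions the result is $x_0\cos z_0 + y_0\sin z_0$ and $y_0\cos z_0 - x_0\sin z_0$, provided $|z_0|$ is within the convergence range (about 1.74 radians for this shift sequence).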
$\delta_i = \rho^{-F_i}$ for some $\rho$. Then, the CORDIC equations may be written:
of the generic form:
b =+ e p -F
Case 1: $\rho = 2$, $b = 2^k$
$$x_{i+1} = x_i + m M_y\, b^{(c_y - F_i/k) + e} \qquad (4.20a)$$
$$y_{i+1} = y_i - M_x\, b^{(c_x - F_i/k) + e} \qquad (4.20b)$$
Subtracting a constant from the characteristic is an easy task, so
point adder. Well... not quite. Notice that $F_i/k$ is not always an integer. Let
$q_i$ = integer = $F_i \bmod k$. Then Equation 4 may be written:
__ (c_ - ?t ) + e
Vi+i = Vi ~ (2 % ) b 1 (4.21b)
Remarks:
shifter for a parallel realization (quite a formidable task), while the
Case 2: $\rho = b$
(4.20) to get:
$$x_{i+1} = x_i + m M_y\, b^{(c_y - F_i) + e} \qquad (4.22a)$$
$$y_{i+1} = y_i - M_x\, b^{(c_x - F_i) + e} \qquad (4.22b)$$
This case results in a truly simple implementation since $y_{i+1}$ (or $x_{i+1}$) is
for
encountered.
details.
not of major concern.
$Q = N/D$
denominator so after K-iterations:
The sequence $\{R_k\}$ is chosen so that the denominator converges to unity.
result.
$$F(x_0, y_0) = z_0$$
to $y = y_\omega = F(x_\omega, y_\omega) = z_0$.
dimensional cube which has $P_0 = (x_0, y_0, z_0)$ as one vertex. The invariant
pair of functions, $\phi$ and $\psi$ such that
$$x_{k+1} = \phi(x_k, y_k)$$
$$y_{k+1} = \psi(x_k, y_k)$$
Then clearly
is invariant.
algorithm).
Logarithm:
$F(x, y) = y + \ln x$
Transformation:
$x_{k+1} = x_k a_k$
$y_{k+1} = y_k - \ln a_k$
Ratio Algorithm:
$f(x) = w/x$ for $1/2 \le x < 1$
$F(x, y) = y/x$
Transformation:
$x_{k+1} = x_k a_k$
$y_{k+1} = y_k a_k$
$f(x) = w/\sqrt{x}$ for $1/4 \le x < 1$
$F(x, y) = y/\sqrt{x}$
Transformation:
$x_{k+1} = x_k a_k^2$
$y_{k+1} = y_k a_k$
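The logarithm entry can be checked numerically. The sketch below keeps $F(x, y) = y + \ln x$ invariant while multiplying $x$ by factors $a_k = 1 + 2^{-m}$ until $x$ reaches unity, at which point $y$ holds $y_0 + \ln x_0$. The greedy selection rule, the name `log_by_invariance` and the use of `math.log` in place of a stored table of $\ln(1 + 2^{-m})$ are assumptions made for illustration only.

```python
import math

def log_by_invariance(x0, y0=0.0, bits=40):
    """Return y0 + ln(x0) for 0 < x0 <= 1 by keeping y + ln(x) invariant.

    Hypothetical helper: each factor a = 1 + 2**-m costs one shift and add on x,
    and ln(a) would come from a small stored table in hardware.
    """
    x, y = x0, y0
    for m in range(1, bits + 1):
        a = 1.0 + 2.0 ** -m
        # Apply the factor while x stays at or below unity.
        while x * a <= 1.0:
            x *= a             # x_{k+1} = x_k a_k
            y -= math.log(a)   # y_{k+1} = y_k - ln a_k
    return y
```

After the loop $x$ differs from 1 by less than $2^{-\text{bits}}$, so the returned value approximates $y_0 + \ln x_0$ to about that accuracy.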
example.
referred to as the stored table approach). However, the amount of storage
point multiplier, with multiply time $T_M$, is available for scaling by $a_k$ (since
lookup is:
$1/x_0$.
Step 2: Calculate
$$x_1 = x_0 a_0$$
$$y_1 = y_0 a_0$$
achieved.
attainable in step 4.
$N$ = number of bits in number representation
Then
^ r - = 2 Q~N (4.23)
Of
NTr
Ec - ~ z — (since N /4 ite ra tio n s ) (4.24)
which are:
h = x 0 - 2-((?+2>
J 2 = x 0 + 2 - « t8>
a,,1 = ------------------
^ x 0 - z-w+Q
2_ 1
a° ~ x 0 +
Hence:
1_ _____
x ° a° ~ x 0 - 2 _w+2)
1
- ^ 2-C9«)
1 --------------
*0
< i _ 2~(G+2) ^o r x ° e 2, 1)
n i + 2~(9+2)
and
x 0 a 2 < 1 - 2_(E?+2)
Therefore, step 4 must account for the remaining $N - Q$ bits of
precision, requiring an expected $(N - Q)/4$ CCM iterations.
table approaches are compared through the ratio:
Ej Tu
= 4 + {N (4.25)
two schemes. Memory access time has been ignored in this simple analysis.
JSp S ff
access. N otice t h a t while depends lin e a rly on N — Q, —— exhibits an
hr Ex
CORDIC algorithms.
presented in Section 4.3.2.
to store the values of $\ln a_k$.
$$x_{k+1} = \phi(x_k, y_k) = x_k - \ln a_k$$
$$y_{k+1} = \psi(x_k, y_k) = y_k a_k$$
ALGORITHMS
[Figure: datapath built from a scratchpad memory, a shifter and an adder, with an exit to the termination algorithm. Captions: (X) denotes the contents of register X; C(m) denotes the contents of memory location m, which holds the tabulated logarithm of $1 + 2^{-m}$.]
given by:
$f(x) = w e^x$
$F(x, y) = y e^x$
$x_{k+1} = x_k - \ln a_k$
$y_{k+1} = y_k a_k$
x =
and
$$a_k = |a_k|\, e^{j\alpha_k}$$
Then
$$F(x, y) = y e^{jx}$$
$$y_{k+1} = y_k a_k = y_k |a_k|\, e^{j\alpha_k}$$
so the transformation pair is therefore
$$x_{k+1} = x_k - \alpha_k + j \ln|a_k|$$
$$y_{k+1} = y_k |a_k|\, e^{j\alpha_k} = y_k a_k \qquad (4.27)$$
since:
$= F(x_k, y_k)$
iterations. Let
arbitrary, let
$$a_k = 1 + j\delta_k \qquad (4.29)$$
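$$X_{k+1} = \begin{bmatrix} 1 & -\delta_k \\ \delta_k & 1 \end{bmatrix} X_k \qquad (4.30)$$
where
$\delta_k = \tan \alpha_k$
$z_k = \mathrm{Re}(x_k)$
so that
$$z_{k+1} = \mathrm{Re}(x_k) - \alpha_k = z_k - \alpha_k \qquad (4.31)$$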
_ Elnlai.1
Vw = e *
n i° * i w e
*=0
becomes:
$$\alpha_k = \tan^{-1} \delta_k \qquad (4.33)$$
CORDIC algorithm.
Remarks:
abscissa.
the angular conversion
At = (4.34)
1 - t a n h fj.k
Nk+i (4.36)
ta n h /.i k 1
1 S k
V: + l - (4.37)
<5* 1
Jfc=0
o-l
= V l + ta n zak by definition
k=C
- fjb=C
i (Xfg
w hen (ik = j a k
a—i 1
= n — -—
it=o co sh (ijf
O-l
= n V l - tan h 2//*
*=o
= f i 61 = K . x (4.3B)
ik=0
Vk = X i
Re(tfjt) = z k
I m ^ ) = -InAi
corresponds to a vector rotation.
transformations but not vice versa. Note that the CORDIC algorithm yields
relax the invariance requirement on $F(x, y)$.
$$z_0 = f(x)\big|_{x = x_0}$$
th at:
For example
general manner, simple variations of F, like the examples above, will prove
quite useful in practice. Simple variations will not impose undue hardship
will now be shown that variation (b) in items (2) and (3) allows an extremely
f (x ) = A (x) Q
$$A(x) = \begin{bmatrix} \mathrm{cs}\,x & -m\,\mathrm{si}\,x \\ \mathrm{si}\,x & \mathrm{cs}\,x \end{bmatrix}$$
in which:
4.1
co o rd in ate system .
Define:
F (x ,Y ) = A (x )Y
x k+\ - Xk ~ a k
1 —721
Yjt
6k 1
FCxjb+j.Yjt+i) = ck F(xjfc,Yfc)
with
Cjfc = l / c s CCjfc
since
1 —m.-5k
= A (xk - a k )
<51 1
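$$= \frac{1}{\mathrm{cs}\,\alpha_k}\, A(x_k - \alpha_k) \begin{bmatrix} \mathrm{cs}\,\alpha_k & -m\,\mathrm{si}\,\alpha_k \\ \mathrm{si}\,\alpha_k & \mathrm{cs}\,\alpha_k \end{bmatrix} Y_k$$
$$= \frac{1}{\mathrm{cs}\,\alpha_k}\, F(x_k, Y_k)$$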
F (2 d'Y u) = IY U = Y a = Tf
fc*C
Remark:
method which:
invariant.
restriction on F is relaxed. Further examples will now be given, in which F
previous section.
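Define
$$Y = (\xi, \eta) \qquad (4.39)$$
and
$$F(x_k, Y_k) = \frac{x_k}{\xi_k \eta_k} \qquad (4.40)$$
cumbersome notation is ignored here.)
$$x_{k+1} = x_k a_k b_k \qquad (4.41a)$$
$$\xi_{k+1} = \xi_k b_k \qquad (4.41b)$$
$$\eta_{k+1} = \eta_k a_k \qquad (4.41c)$$
(4.41c) that
$$\prod_{k=0}^{\omega-1} a_k = 1/\eta_0$$
$$\prod_{k=0}^{\omega-1} b_k = 1/\xi_0$$
so
$$x_\omega = x_0 \prod_{k=0}^{\omega-1} a_k \prod_{k=0}^{\omega-1} b_k = \frac{x_0}{\xi_0 \eta_0}$$
is the desired result.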
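A small numerical sketch of this scheme follows. Both $\xi$ and $\eta$ are driven to unity by factors of the form $1 + 2^{-m}$ while $x$ absorbs every factor, so that $x$ converges to $x_0/(\xi_0\eta_0)$. The greedy factor selection, the name `divide_by_product` and the floating point arithmetic are assumptions for illustration; the text does not prescribe them.

```python
def divide_by_product(x0, xi0, eta0, bits=40):
    """Approximate x0 / (xi0 * eta0) for 0 < xi0, eta0 <= 1 using only
    multiplications by 1 + 2**-m (shift and add in hardware).

    Hypothetical helper illustrating transformations of the form (4.41).
    """
    x, xi, eta = x0, xi0, eta0
    for m in range(1, bits + 1):
        f = 1.0 + 2.0 ** -m
        while xi * f <= 1.0:    # factor b_k acts on xi and on x
            xi *= f
            x *= f
        while eta * f <= 1.0:   # factor a_k acts on eta and on x
            eta *= f
            x *= f
    return x
```

The quantity $x/(\xi\eta)$ is unchanged by every step, which is why the final $x$ approximates the quotient once $\xi$ and $\eta$ are both close to 1.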
and adds.
evaluating
$$x_0 + \ln \xi_0 \eta_0$$
by defining the function
$$F(x_k, Y_k) = x_k + \ln \xi_k \eta_k$$
leave F invariant.
so
$$x_\omega = x_0 - \sum_{k=0}^{\omega-1} \ln a_k - \sum_{k=0}^{\omega-1} \ln b_k$$
$$= x_0 + \ln \xi_0 \eta_0$$
a small table of $\ln(1 + 2^{-m})$ is maintained.
$$F(x_k, Y_k) = x_k + \ln(\xi_k/\eta_k)$$
$$\xi_{k+1} = \xi_k a_k$$
$$x_{k+1} = x_k - \ln a_k$$
$$\eta_{k+1} = \eta_k = \eta_0$$
$$\prod_{k=0}^{\omega-1} a_k = \eta_0/\xi_0$$
and
Example 4.7, allows for simpler transformation rules. However the penalty
sequence.
$$F(X_k, Y_k) = Y_k^{-1} X_k$$
transformations:
$$X_{k+1} = C_k X_k \qquad (4.43a)$$
$$\prod_{k=0}^{\omega-1} C_k = Y_0^{-1}$$
convergence division.
simple form in order to make this scheme practical. A simple 2x2 matrix
not the best).
Let $Y_0 = \begin{bmatrix} y_{11} & y_{12} \\ y_{21} & y_{22} \end{bmatrix}$ and let $C_k = I + \tilde{C}_k$ with $\tilde{C}_k = \begin{bmatrix} 0 & 0 \\ 0 & 2^{-m} \end{bmatrix}$
Apply the sequence of such $C_k$ transformations, say 'n' in number, to
obtain
Y = 3/n V iz
n y'zi y'zz
fi n
N ext apply C„ = ^ to o b tain
_ y 11 v i z
y - [y u y 12
Y" " 1 " 0 y-22
a n d apply
rU"-2 -~ f1 _1'
[0 1 .
to diagonalize $Y_\omega$.
constants.
and the transformations
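$$\prod_{k=0}^{\omega-1} C_k = A_0^{-1}$$
$$b_\omega = \prod_{k=0}^{\omega-1} C_k\, b_0$$
$$= A_0^{-1} b_0$$
$$= x$$
$$U = \prod_{k=0}^{\omega-1} C_k\, A_0$$
and (4.44b) becomes
$$b_\omega = U A_0^{-1} b_0$$
$$= L b_0$$
$$F_\omega = 0 = U x - L b_0 = U x - b_\omega$$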
unrelated functions can be studied.
circuitry.
any additional storage.
shifter was eliminated. It was also shown that floating point calculations
APPENDIX
n
« = £ = -* 0 (A.l)
i=0
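$$x_n = K\left[\,x_0 \cos(\sqrt{m}\,\alpha) - y_0 \sqrt{m}\, \sin(\sqrt{m}\,\alpha)\,\right]$$
$$y_n = K\left[\,y_0 \cos(\sqrt{m}\,\alpha) + \frac{x_0}{\sqrt{m}} \sin(\sqrt{m}\,\alpha)\,\right]$$
(A.1) yields:
$$x_n = K\left[\,x_0 \cos(-\sqrt{m}\,z_0) - y_0 \sqrt{m}\, \sin(-\sqrt{m}\,z_0)\,\right]$$
$$= K\left[\,x_0 \cos(\sqrt{m}\,z_0) + y_0 \sqrt{m}\, \sin(\sqrt{m}\,z_0)\,\right]$$
$$y_n = K\left[\,y_0 \cos(-\sqrt{m}\,z_0) + \frac{x_0}{\sqrt{m}} \sin(-\sqrt{m}\,z_0)\,\right]$$
$$= K\left[\,y_0 \cos(\sqrt{m}\,z_0) - \frac{x_0}{\sqrt{m}} \sin(\sqrt{m}\,z_0)\,\right]$$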
BIBLIOGRAPHY
Reference Manual, 1979.
[AMLA81] H.M. Ahmed, M. Morf, D.T.L. Lee and P.H. Ang, "A VLSI Speech
Forms," Proc. 1981 ICASSP, Atlanta, GA, Mar.-Apr. 1981, pp. 648-653.
[Bell81] Bell System Technical Journal, Vol. 60, No. 7, part 2, September,
[Wa71] J.S. Walther, "A Unified Algorithm for Elementary Functions,"
385.
CHAPTER FIVE
algorithms which are commonly used to perform matrix operations like
reported in the literature to date.
VLSI structures must be regular in nature in order to manage the large
rows appears to exist.
allowing the use of a linear array of CORDIC processors. This is an example
$$T = [\,t_1 : t_2 : \cdots : t_n\,] \qquad (5.1)$$
$$Z = \begin{bmatrix} 0 & 0 & \cdots & & 0 \\ 1 & 0 & & & \\ 0 & 1 & \ddots & & \vdots \\ \vdots & & \ddots & 0 & \\ 0 & \cdots & 0 & 1 & 0 \end{bmatrix} = \text{column shift matrix} \qquad (5.2)$$
$$e = [\,1 : 0 : \cdots : 0\,]^T \qquad (5.2b)$$
$t_{kl}$ = $l$th element of column $t_k$
Initialization:
$$[\,c_1 : c_2\,]^0 = [\,t_1 \mid t_1 - t_{11} e\,]/\sqrt{t_{11}} \qquad (5.3)$$
Recursion:
$$[\,c_1 : c_2\,]^{k+1} = [\,Z c_1^k : c_2^k\,]\,\Theta_k \qquad (5.4)$$
w here
$$\Theta_k = \begin{bmatrix} \cosh\vartheta_k & \sinh\vartheta_k \\ \sinh\vartheta_k & \cosh\vartheta_k \end{bmatrix}$$
= ta n h -1 -~ C.3^ -2 .
e lm
$L = [\,c_1^0 \mid c_1^1 \mid \cdots \mid c_1^{n-1}\,]$.
its rows as shown in Figure 5.1. The arrows indicate which rotation angle is
Initialization:
Ci = 0 , V k
R ecursion:
fo r fc = 0 to £-1 begin:
= 0>
end;
In this algorithm, the recursion on 'k' is the order update of the
application.
Ax = b (5.6a)
w here,
b: is a known v ecto r
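applied to A.
matrix, $Q_{ri}$, such that the $(i, r)$ element of A is annihilated.
$$Q_{ri} = \begin{bmatrix} 1 & & & & & & 0 \\ & \ddots & & & & & \\ & & \cos\vartheta_{ri} & \cdots & -\sin\vartheta_{ri} & & \\ & & \vdots & 1 & \vdots & & \\ & & \sin\vartheta_{ri} & \cdots & \cos\vartheta_{ri} & & \\ & & & & & \ddots & \\ 0 & & & & & & 1 \end{bmatrix} \qquad (5.7)$$
where the trigonometric entries occupy rows and columns $r$ and $i$.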
a = n n -Q 5 * = qk - (5.8)
r i
of
^1 ^r2 ■ ■ • a m
° il ° i2 • • •
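for r = 1 to n-1 begin:
  for i = r+1 to n begin:
    for j = r+1 to n begin:
      $\begin{bmatrix} a_{rj} \\ a_{ij} \end{bmatrix} \leftarrow \begin{bmatrix} \cos\vartheta_{ri} & -\sin\vartheta_{ri} \\ \sin\vartheta_{ri} & \cos\vartheta_{ri} \end{bmatrix} \begin{bmatrix} a_{rj} \\ a_{ij} \end{bmatrix}$
    end;
    $\begin{bmatrix} b_r \\ b_i \end{bmatrix} \leftarrow \begin{bmatrix} \cos\vartheta_{ri} & -\sin\vartheta_{ri} \\ \sin\vartheta_{ri} & \cos\vartheta_{ri} \end{bmatrix} \begin{bmatrix} b_r \\ b_i \end{bmatrix}$
  end;
end;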
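The triple loop above can be exercised with a short sequential sketch. It triangularizes $Ax = b$ by plane rotations and then back-substitutes. The names `givens_triangularize` and `back_substitute`, the use of `atan2` to obtain the rotation angle, and the sign convention of the rotation are assumptions for illustration; a CORDIC array would obtain the angle by vectoring rather than by calling library routines.

```python
import math

def givens_triangularize(A, b):
    """Annihilate the sub-diagonal of A with plane rotations; b is rotated too.

    Hypothetical sequential sketch; sign convention and atan2 are assumed.
    """
    n = len(A)
    A = [row[:] for row in A]
    b = b[:]
    for r in range(n - 1):
        for i in range(r + 1, n):
            theta = math.atan2(A[i][r], A[r][r])   # angle that zeroes a_ir
            c, s = math.cos(theta), math.sin(theta)
            for j in range(r, n):
                arj, aij = A[r][j], A[i][j]
                A[r][j] = c * arj + s * aij
                A[i][j] = -s * arj + c * aij
            b[r], b[i] = c * b[r] + s * b[i], -s * b[r] + c * b[i]
    return A, b

def back_substitute(U, c):
    """Solve the upper triangular system U x = c."""
    n = len(U)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - acc) / U[i][i]
    return x
```

Each rotation touches only rows $r$ and $i$, which is the property that lets the inner `j` loop be spread across a row of processors.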
below $a_{rr}$.
plane rotation where Gaussian elimination calls for only one multiplication
stacks). This is readily seen from Figure 5.5 which shows the data inputs
[Figure: data inputs (x, y) of processors 1 to 6 at successive time steps, grouped by r = 1, 2, 3]
$i$th row (for $i = r+1, r+2, \ldots, n$) are propagated left and then into the
algorithms in time, the linear array enjoys one spatial dimension as well. It
5.3.1 Activity Charts
temporal and two spatial dimensions. The linear array has exploited only a
array.
linear array (in effect "two" temporal dimensions) to a singly indexed one.
2-D array with well defined operation, by stacking a set of one dimensional
Remarks:
constructed using quite a general principle. No prior 2-D structure
increasing 'r').
wavefront.
would be $p + n(n-1)/2$ units.
connected systems.
5.3.3 Dual Arrays
linear array by mapping the loop into space and performing the 'j' loop
construct a two dimensional array. The notion of dual arrays also provided a
with formalizing the idea of dual arrays as well as the use of activity charts
this dissertation was to examine a class of problems, namely signal
processing algorithms). Fortuitously, complex signal processing tasks often
of a model.
Definition 5.1: A basic loop is one in which the loop body does not contain
loops
h - i'Z In = in -
(iii) $L_j$ is data output dependent on $L_i$, denoted $L_i\ \delta^{o}\ L_j$, if
x € n ( i j ) is c o m p u te d a fte r th a t of Li
are a sequence of statements, $S_1, \ldots, S_k$ such that $S_1\ \delta\ S_2 \cdots \delta\ S_k$
and $S_1 \in L_i$, $S_k \in L_j$.
dependencies hold:
executed before $S_j$.
For each dependence relation between $S_i$ and $S_j$, there is a corresponding
do while I_1, C_1;
  B_1
  do while I_2, C_2;
    B_2
      ...
      do while I_M, C_M;
        B_M
      end I_M;
    end I_{M-1};
end I_1;
and/or $B_{M-1}$ are nonempty.
on a uniprocessor should at least require no more than
dependencies influence the achievable speed enhancement.
because these are executed most frequently.
(assuming each loop body consists of only a few simple instructions, whose
number is independent of $n$).
(d l): B y (iy ) S iy e Iy , Iu _j = c o n s ta n t, / ^ 0
Proof:
w here:
fc= 1
Suppose t h a t *) is ex e cu te d a t T = 1. T hen th e th ro u g h p u t
ti = m ax ( i , i + (2d —iyf 1 J)
process is initiated at:
m a x ( k u .- j i , k j / ^ n + —J )
T - { k j i + k j i - 1)71 + (2 d -Z )[ U J/{Z(i-£>oj
for the entire program is
= 0 (71* J)
Theorem 5.1.
I n -z - c o n sta n t
time.
Proof:
(d6).
ti = \r-jH 2 d + i +t0
is clear that
$= O(n)$
t=i
t= i
~ kyn
$= O(n) \qquad (5.9)$
Remarks:
by pretending that each processor has a memory associated with
is entirely two dimensional, which is clearly restrictive. In any event, if one
steps.
Remark:
time dimension) if current day talk of "three dimensional VLSI" proves
correct.
local connectivity is practiced), it is unfair to assign equal cost (time) to
Theorem 5.1 is simply an extension of the "stacking" idea of Section 5.3.2
Lemma 5.2:
The time space dual linear array of the $I_M$ loop always exists.
G C B ^ a )] «[B m_1 ( 2 ) ]
Proof:
Corollary:
There exist "space dual" arrays in which the spatial dimension of two-
Space dual arrays will not be studied in this dissertation, mainly
particular array.
Remark:
Examples:
Example 5.2:
fo r r = 1 to begin;
fo r i = t +1 to 7i begin;
■>?ri = Q j r / ° r r ; arr*~ ar r •
fo r j = r + 1 to 7i begin;
°T3 1 o' au
1 “y.
end;
&r <—
1 0 br
b i.
tfri 1 bi
I J
end;
end;
A ~ V~ V - W7 Tr
w here
$v_{ij} = 0$ otherwise
and
m ;- = if ? > i
are the elements of the upper triangular matrices V and W.
[DM81]:
for r = 2 to n begin;
  for i = r-1 to 1 begin;
    for j = r+1 to n begin;
    end;
  end;
t=1 t=2 t=3 t=4 t=5
a11 a12 a13 a14 b1
a22 a23 a24 b2
a33 a34 b3
a44 b4
systematic method for their analysis.
obtained.
of a collection D.
Definition 5.7:
(2) $d_\Omega(P, Q) = d_\Omega(Q, P)$
$$\ge |p_x - q_x + q_x - r_x| + |p_y - q_y + q_y - r_y|$$
$$= d_M(P, R)$$
Definition 5.9:
$$\ge \max\left[\,|p_x - q_x| + |q_x - r_x|,\ |p_y - q_y| + |q_y - r_y|\,\right]$$
$$\ge \max\left[\,|p_x - r_x|,\ |p_y - r_y|\,\right]$$
= dx {P ,R )
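The two inequality chains above are the triangle inequality for a sum-of-differences distance and for a maximum-of-differences distance between processor coordinates. A brief sketch makes this concrete; the function names `d_sum` and `d_max` and the representation of a processor as an integer pair `(x, y)` are illustrative assumptions, not the dissertation's notation.

```python
from itertools import product

def d_sum(p, q):
    """Sum of coordinate differences (hypothetical name for the first metric)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d_max(p, q):
    """Largest coordinate difference (hypothetical name for the second metric)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def satisfies_triangle_inequality(metric, size=4):
    """Check d(P, R) <= d(P, Q) + d(Q, R) for all points on a size x size grid."""
    points = list(product(range(size), repeat=2))
    return all(metric(p, r) <= metric(p, q) + metric(q, r)
               for p in points for q in points for r in points)
```

For example, the processors at `(0, 0)` and `(3, 1)` are 4 apart under `d_sum` and 3 apart under `d_max`, which reflects whether diagonal neighbours count as directly connected.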
Definition 5.10:
d ff(P .R )
such that:
$$d_\Omega(P, Q) \le r, \qquad P, Q \in \Omega$$
For example, Figure 5.15 shows the shape of a closed ball in the collection
both the center P and the radius r.
Theorem 5.3:
$$t_P = t_Q + \min\{\,r : d(Q, P) \le r \ \ \forall\, Q\,\}$$
[Figure: processor array diagrams; one panel is labelled "Rectangular Array"]
Proof:
But this is false by the definition of V.
processors.
array proceed as follows:
given
direct path) but rather it is determined by the data dependencies
[Figure: partial orderings of the loop bodies $B_{M-2}(i)$, $B_{M-1}(i)$ and $B_M(i)$ under the dependencies (d7) and (d2)]
L = S
3=1
1
+ d s [ B n { k ji-in - j l,d ) \ B u -i{k ji-in - —j l , *)] )!j
k u .n —l
*jf-in-f li
3=2
d R [ B j i - i ( l - j B f f - x{ l - j - l , * ) ] - 1 0 <,?' < Z- 2
L:
+ 1
$= O(n)$
This example will demonstrate how "quick and dirty" approximations
follows:
that array earlier.
A is in tridiagonal form.
section.
Step 2: Form $A_{k+1} = R_k Q_k$ and repeat the procedure.
$$Q A = Q_n Q_{n-1} \cdots Q_1 A = R$$
(notice that a transposition of Q, as in step 1, is not required since it is used
on the left hand side of the equation here)
$$R Q = R\, Q_n Q_{n-1} \cdots Q_1$$
$$(RQ)^T = Q_1^T Q_2^T \cdots Q_n^T R^T$$
This latter form is quite simple to implement since a premultiplication by
processors.
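The two-step iteration can be illustrated with a short sequential sketch for a symmetric tridiagonal matrix. It follows the textbook unshifted form, factoring $A_k = Q_k R_k$ with plane rotations on adjacent rows and then forming $A_{k+1} = R_k Q_k$. The function name `qr_step`, the rotation sign convention and the fixed iteration count in the usage example are assumptions; the mapping onto a linear array of CORDIC processors is not modelled here.

```python
import math

def qr_step(A):
    """One unshifted QR iteration on a tridiagonal matrix (list of lists).

    Hypothetical helper: rotations G_r act on rows (r, r+1) to give R, then the
    same rotations are applied from the right to form R * Q.
    """
    n = len(A)
    R = [row[:] for row in A]
    rotations = []
    for r in range(n - 1):
        i = r + 1
        h = math.hypot(R[r][r], R[i][r])
        c, s = (1.0, 0.0) if h == 0.0 else (R[r][r] / h, R[i][r] / h)
        rotations.append((r, c, s))
        for j in range(n):                    # premultiply: rows r and i
            arj, aij = R[r][j], R[i][j]
            R[r][j] = c * arj + s * aij
            R[i][j] = -s * arj + c * aij
    for r, c, s in rotations:                 # postmultiply: columns r and r+1
        i = r + 1
        for k in range(n):
            akr, aki = R[k][r], R[k][i]
            R[k][r] = c * akr + s * aki
            R[k][i] = -s * akr + c * aki
    return R
```

Repeating `A = qr_step(A)` on the matrix `[[2, 1, 0], [1, 2, 1], [0, 1, 2]]` drives the off-diagonal entries toward zero, leaving approximations to the eigenvalues $2 \pm \sqrt{2}$ and $2$ on the diagonal.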
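$$A = \begin{bmatrix} a_{11} & a_{12} & 0 & 0 \\ a_{21} & a_{22} & a_{23} & 0 \\ 0 & a_{32} & a_{33} & a_{34} \\ 0 & 0 & a_{43} & a_{44} \end{bmatrix}$$
is reduced to:
$$R = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ 0 & r_{22} & r_{23} & r_{24} \\ 0 & 0 & r_{33} & r_{34} \\ 0 & 0 & 0 & r_{44} \end{bmatrix} \quad \text{and} \quad R^T = \begin{bmatrix} r_{11} & 0 & 0 & 0 \\ r_{12} & r_{22} & 0 & 0 \\ r_{13} & r_{23} & r_{33} & 0 \\ 0 & r_{24} & r_{34} & r_{44} \end{bmatrix}$$
place.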
and the processor outputs may be fed back to compute the second step of
Figure 5.20 for step 2 and finally Figure 5.21 for the combination of step 2
required.
readily implemented on a linear array of CORDIC processors. It was noted
[Figure: activity chart, t = 1 to 6, of the CORDIC processors reducing the tridiagonal matrix A to R; legend: inactive processor, inputs, outputs]
[Figure: activity chart of the CORDIC processors forming the elements $p_{ij}$ from the elements $r_{ij}$]
[Figure: combined activity chart for Processors 1 to 3, showing the QR decomposition, the formation of $(RQ)^T = P$, and the next QR decomposition]
implementation.
[CB79 for example]. Nearest neighbour connections only was an implicit
the nearest neighbour?
BIBLIOGRAPHY
Wesley, 1980.
1977.
[Ku79] H.T. Kung, "Let's Design Algorithms for VLSI Systems," Proc. of
[KuS80] S.Y. Kung, "VLSI Array Processor for Signal Processing," Proc. of
January 28-30, 1980.
ICASSP, pp. 742-743
[MD81] M. Morf, J.-M. Delosme, "Matrix Decompositions and Inversions Via
and Signal Processing,
1S6B
[SK75] A.H. Sameh, D. Kuck, "Linear System Solvers for Parallel
CHAPTER SIX
CORDIC algorithm s.
product 'AN' with straightforward multiplication (also using a CORDIC
Each time slot in Figure 6.1 corresponds to one complete CORDIC
CORDIC algorithm .
[Figure: CORDIC BLOCK 1 operations, including $\tan^{-1}$ and $\tanh^{-1}$ evaluations]
computation.
First, it is possible to design a "perfect" CORDIC, i.e., one which does not
forms, are conducive to that.
applications other than speech analysis.
[Figure residue: chip block diagram with CORDIC BLOCK 1, CORDIC BLOCK 2, INPUT PORT, SCRATCHPAD REGISTERS, a second PORT, and MICROPROGRAM CONTROLLER.]
applicable.
is deferred to a later section.
bit parallel?
handled?
truncation in sixteen CORDIC iterations [Wa71]. Items (2) and (3) have a
solved.
be studied next.
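$$\begin{bmatrix} x_{i+1} \\ y_{i+1} \end{bmatrix} = \begin{bmatrix} 1 & m\mu_i\delta_i \\ -\mu_i\delta_i & 1 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix}$$

$$z_{i+1} = z_i - \varepsilon\mu_i\alpha_i$$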
t1: Φ1: BUF_X ← X; BUF_Y ← Y; BUF_Z ← Z
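The iteration can be illustrated with a minimal floating-point sketch of a textbook circular CORDIC in rotation mode. The function name `cordic_rotate`, the shift sequence 2^-i, the choice of 32 iterations and the sign conventions are assumptions made for illustration only; they are not taken from the chip design, which uses fixed-point hardware.

```python
import math

def cordic_rotate(x, y, z, n_iter=32):
    """Circular-mode CORDIC in rotation mode: drives z toward zero.

    Returns (x, y, z) with the constant CORDIC gain divided out.
    Converges for |z| up to about 1.74 rad (sum of the elementary angles).
    """
    gain = 1.0
    for i in range(n_iter):
        delta = 2.0 ** -i            # a right shift by i bits in hardware
        alpha = math.atan(delta)     # elementary rotation angle
        mu = 1 if z >= 0 else -1     # direction chosen from the sign of z
        # Tuple assignment evaluates both right-hand sides with the old x, y.
        x, y = x + mu * delta * y, y - mu * delta * x
        z -= mu * alpha
        gain *= math.sqrt(1.0 + delta * delta)
    return x / gain, y / gain, z

x, y, z = cordic_rotate(1.0, 0.0, 0.5)
print(x, y, z)  # x near cos(0.5), y near -sin(0.5), z near 0
```

With these conventions the vector is rotated clockwise by the initial value of z, so the input (1, 0) ends near (cos z, -sin z).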
[Figure residue: Arithmetic Unit with add/subtract (+/-) blocks.]
be valid throughout the microcycle).
advantage. The z-channel is particularly simple since one of its operands,
the reader.
complement addition).
[Figure residue: adder cell with inputs a_i, b_i and a sum output; register cell control lines WRITE, REFRESH, READ NEGATIVE, READ.]
Figure 6.5: A Register Cell
than the extra inverter.
6.3.1.1 Pipelining
The individual CORDIC iterations may be pipelined as shown in Figure
method essentially leads to a distributed scaler (unlike Figure 6.3) where a
small shifter, supporting one or two shift values, is built for each iteration of
technology.
requirements.
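The idea of a distributed scaler can be sketched in software as follows: each pipeline stage applies one CORDIC iteration with its own hardwired shift, and a new operand enters the pipe on every cycle. The word length (`FRAC_BITS`), the stage count, the one-shift-per-stage arrangement and all names are illustrative assumptions, not figures from the design discussed here.

```python
import math

FRAC_BITS = 16                      # assumed fixed-point fraction length
N_STAGES = 16                       # assumed: one pipeline stage per iteration
ONE = 1 << FRAC_BITS
ANGLES = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_STAGES)]
GAIN = math.prod(math.sqrt(1 + 4.0 ** -i) for i in range(N_STAGES))

def stage(i, regs):
    """One iteration; the shift amount i is fixed for this stage."""
    x, y, z = regs
    mu = 1 if z >= 0 else -1
    return (x + mu * (y >> i), y - mu * (x >> i), z - mu * ANGLES[i])

def run_pipeline(inputs):
    """Clock operands through the stage registers, one new operand per cycle."""
    pipe = [None] * N_STAGES
    outputs = []
    for item in list(inputs) + [None] * N_STAGES:   # extra cycles flush the pipe
        if pipe[-1] is not None:
            outputs.append(pipe[-1])
        for i in range(N_STAGES - 1, 0, -1):        # advance back to front
            pipe[i] = stage(i, pipe[i - 1]) if pipe[i - 1] is not None else None
        pipe[0] = stage(0, item) if item is not None else None
    return outputs

def rotate_many(angles):
    """Rotate the unit vector by each angle; the gain is removed by pre-scaling x."""
    x0 = round(ONE / GAIN)
    inputs = [(x0, 0, round(a * ONE)) for a in angles]
    return [(x / ONE, y / ONE) for x, y, _ in run_pipeline(inputs)]

print(rotate_many([0.5, -1.0, 0.0]))
```

After an initial latency of `N_STAGES` cycles, one result leaves the pipe per cycle, which is the throughput benefit that pipelining trades against the area of the replicated adders and shifters.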
Φ2: Y ← AU
Φ2: Z ← AU
need to buffer the new value, x_{i+1}, until y_{i+1} has been computed, since the
latter quantity requires the value x_i. This buffer could be either one of the
done on of C3. Whereas previously all writing was performed during Φ2,
and utilize the entire clock cycle to effect the transfer. Just precisely which
realization.
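The need for a buffer when a single arithmetic unit computes the x and y updates one after the other can be seen in a small sketch. The function names and the numerical values below are hypothetical and chosen only to expose the hazard.

```python
def step_without_buffer(x, y, mu, delta):
    """Serial update that overwrites x too early (incorrect)."""
    x = x + mu * delta * y      # x register now holds x_{i+1}
    y = y - mu * delta * x      # wrongly uses x_{i+1} instead of x_i
    return x, y

def step_with_buffer(x, y, mu, delta):
    """Serial update that parks x_{i+1} in a buffer register (correct)."""
    buf_x = x + mu * delta * y  # new x is held in the buffer
    y = y - mu * delta * x      # y_{i+1} still sees the old x_i
    x = buf_x                   # commit x_{i+1} afterwards
    return x, y

print(step_without_buffer(1.0, 1.0, 1, 0.5))  # (1.5, 0.25)
print(step_with_buffer(1.0, 1.0, 1, 0.5))     # (1.5, 0.5)
```

Only the buffered version reproduces the simultaneous update that the CORDIC iteration requires.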
[Figure labels: CORDIC REGISTERS (WRITE, READ ON BUSS1); AU BUFFER REGISTER (WRITE, READ on BUSS1 only); BARREL SCALER; SCRATCHPAD REGISTERS (WRITE and READ lines for REG 0, REG 1 and REG 7).]
Figure 6.8: Bus Structure of Parallel-Serial Architecture
[Figure residue: register datapath with BUFFER, AU, REGISTER/SHIFTER, MUX and CONTROL blocks; control signals include SIGN Y and MODE.]
iteration cycle, the controller invokes a parallel load from the primed
registers to their non-primed counterparts.
by Texas Instruments Inc. [WB78] as well as with the Bell Laboratories Echo
be given here.
originally speculated.
(Partially Dynamic)
ARITHMETIC AREA             1.76      1.43      1.0       0.7       3.1
(mm^2)*                     16        13        9         6.3       28        72
(Arithmetic Area) (Includes Cntl)
CONTROL COMPLEXITY          1-2       1         1         >1        1-2       BIG
(Synthesis Only)
RELATIVE THROUGHPUT         20        6.67      1         0.33      12.5      126 (TAPS)/5'
MIN CLK RATE* FOR 1 STAGE   0.6 MHz   1.9 MHz   12.7 MHz  38 MHz    ~.8 MHz   ~2 MHz
(8 kHz SPEECH ANALYSIS)
MICROPROGRAM COST           ~1        ~1        ~1        >1        (Random Logic)  ~1
RELATIVE DESIGN EFFORT      1.2       1         1         <1        >>1       >>1
the echo canceller.
clock rate required by the serial-parallel approach. This is quite
utilized.
The remaining figures in the table are quite subjective. For example, all
large, difficult to alter and also time consuming to design. The
commercial chips.
indeed the case.
and SSUB instructions. While the latter are said to require one microcycle,
memory provides the necessary microcode to each processor. Separate
n=255.
Instruction   Operands    Sign Reversal   Operation             Comments
Mnemonic                  Bit (ε)
MOVE^1        src, dest   no              data transfer         src, dest are X, Y, Z or
                                                                scratchpad regs or I/O
ATAN^2                    yes             Z + ε tan^-1(Y/X)
ATANH^2                   yes             Z + ε tanh^-1(Y/X)
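The effect of the sign reversal bit on the ATAN operation, Z + ε tan^-1(Y/X), can be sketched with a textbook vectoring-mode CORDIC. The function name `atan_instruction`, the restriction to X > 0, the floating-point arithmetic and the iteration count are assumptions for illustration and do not describe the chip's microcode.

```python
import math

def atan_instruction(x, y, z, eps=1, n_iter=32):
    """Vectoring mode: drive y to zero while accumulating the angle into z.

    Assumes x > 0. eps is +1 or -1 and plays the role of the sign reversal
    bit. Returns (vector magnitude, residual y, z + eps * atan(y / x)).
    """
    gain = 1.0
    for i in range(n_iter):
        delta = 2.0 ** -i
        mu = 1 if y >= 0 else -1          # rotate so as to reduce |y|
        x, y = x + mu * delta * y, y - mu * delta * x
        z += eps * mu * math.atan(delta)  # add or subtract the angle step
        gain *= math.sqrt(1.0 + delta * delta)
    return x / gain, y, z

mag, resid, angle = atan_instruction(3.0, 4.0, 0.0, eps=1)
print(mag, resid, angle)   # magnitude near 5, y near 0, angle near atan(4/3)
_, _, diff = atan_instruction(3.0, 4.0, 1.0, eps=-1)
print(diff)                # near 1 - atan(4/3)
```

A single bit therefore selects between accumulating and subtracting the computed angle, without any extra negation step.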
[Figure residue: CORDIC 1 and CORDIC 2 connected through PORT 1 and PORT 2 to a 2-PORT PROGRAM MEM.]
more general), many signal processing algorithms fit the controller well, for
MOVE  , Y ; ν, Y
MOVE 0, Z ; 0, Z
ATANH ; ATANH ; T = 1
MOVE X, R1 ; ν, Y
MOVE ρ, Z ; 0, Z
MOVE η, X ; Z, R2
MOVE R2, Z ; η, X
MOVE NOP ; M, Z
CROT ; MUL ; [T label illegible]
MOVE X, R3 ; R1, X
MOVE Y, R4 ; R2, Z
MOVE R1, X ; NOP
ATANH ; CROT ; T = 4
MOVE R3, X ; Y, X
MOVE R3, X ; Y, X
MOVE Z, R3 ; 0, Y
MOVE NOP ; R3, Z
JROT ; JROT ; [T label illegible]
This section will explore the use of the chip for computing discrete Fourier
worth noting that a single CORDIC version of this chip with a simpler control
present chip and controller apply directly to the realization of the single
CORDIC version.
transform calculation is complete. The final two microcycles impose very
the rotation of x(n).
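The connection between the transform and vector rotation can be made concrete with a short sketch: each term x(n) exp(-j 2 pi k n / N) of the discrete Fourier transform is a plane rotation of the complex sample x(n), followed by accumulation. The helper names are hypothetical, and exact trigonometric functions stand in for the CORDIC rotation; the scheduling of rotations onto time slots is not modelled.

```python
import math

def rotate(sample, angle):
    """Plane rotation of a complex sample (the role of a circular rotation)."""
    c, s = math.cos(angle), math.sin(angle)
    return complex(sample.real * c - sample.imag * s,
                   sample.real * s + sample.imag * c)

def dft_by_rotation(x):
    """DFT computed as a sum of rotated input samples."""
    n = len(x)
    result = []
    for k in range(n):
        acc = 0j
        for i, sample in enumerate(x):
            acc += rotate(sample, -2.0 * math.pi * k * i / n)
        result.append(acc)
    return result

print(dft_by_rotation([1 + 0j, 0j, -1 + 0j, 0j]))  # approximately [0, 2, 0, 2]
```

Since only rotations and additions are needed, no general complex multiplier is required for the transform.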
[Figure residue: data flow through CORDIC BLOCK k and an adjacent CORDIC block, each with LIN and CIRC modes. Notes: subscript R (I) denotes the real (imaginary) part; X(k,l) = Σ x'(k,i).]
transform.
stage.
reflection coefficient, ρ_n.
Chapter Two:
£cj - T c.r - V t
7 n - l.T
[Figure residue: time slots T=1, T=2, T=3; microcode fragments ATAN, MOVE 0, Z, CROT, MOVE X, ν, MOVE Y, ν; rotation matrix with rows (cos θ, -sin θ) and (sin θ, cos θ).]
1ST _ ^7 1 + 1 .7
X n +l.T -
/n -l.r+ l
7 n+ l.r = 7n.T ~
fin+l.T
where
(n+1)th filter stage
prediction errors and
Figure 4.3 have also been employed. The ease with which the sign reversal is
case).
$$z_k = \sum_n c_n r_{k-n} \qquad (6.1)$$

$$c_n^{k+1} = c_n^{k} - \Delta e_n$$
where
Δ is a real adaptation constant
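The filtering sum (6.1) together with a coefficient update of this kind can be sketched as a textbook complex LMS adaptation. The error definition e = z - d, the correction term Δ e r*, the QPSK-like test input and every name below are assumptions made for this illustration; the exact error term used in the text is not recoverable from this copy.

```python
import random

def lms_identify(h, n_steps=2000, delta=0.05, seed=0):
    """Adapt taps c so that sum_n c_n r_{k-n} matches a reference filter h."""
    rng = random.Random(seed)
    c = [0j] * len(h)
    hist = [0j] * len(h)                      # r_k, r_{k-1}, ...
    for _ in range(n_steps):
        r = complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / 2 ** 0.5
        hist = [r] + hist[:-1]
        z = sum(cn * rn for cn, rn in zip(c, hist))   # filter output, as in (6.1)
        d = sum(hn * rn for hn, rn in zip(h, hist))   # assumed desired response
        e = z - d                                     # assumed error definition
        # Assumed LMS correction: step against the gradient of |e|^2.
        c = [cn - delta * e * rn.conjugate() for cn, rn in zip(c, hist)]
    return c

taps = lms_identify([0.5 + 0.2j, -0.3j])
print(taps)  # close to [0.5+0.2j, -0.3j]
```

Each update costs one complex multiplication per tap, which is why the figure for this application is dominated by rotation and multiply operations.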
[Figure residue: time slots T=1 and T=2 with operations ATAN, CROT, MUL, MUL, MUL; signal labels include Re(e_n) and the imaginary part of a coefficient c.]
APPENDIX
[Figure residue: circuit model with V_DD supply and an AU delay element; R1 = 10K, R2 = 20K, R3 = 14.6K, C1 = .3 pF, C2 = .3 pF, C3 = 1.1 pF.]
ing a 5V supply), it is clear that:
$$T_c \approx R_1 C_1 + R_2 C_2 + T_{AU}$$
$$\approx 81\ \text{ns}$$
areas provided in figure 6.10, lead to:
$R_u = 30\,k\Omega \qquad R_{d1} = 3 \qquad R_{d2} = 6\,k\Omega \qquad C = 0.1\,pF$
define:
$t_p$ = pair delay = $t_r + t_f$
Then:
1 I am grateful to Professor Hennessy for pointing out these figures and for supplying me
with the performance of some adders for comparison.
BIBLIOGRAPHY
[AMLA81] H.M. Ahmed, M. Morf, D.T.L. Lee and P.H. Ang, "A VLSI Speech
Forms," Proc. 1981 ICASSP, Atlanta, GA, Mar.-Apr. 1981, pp. 648-
653.
[CD80] Y.-S. Chen, D. Duttweiler, "A 35,000 Transistor Chip VLSI Echo
[De74] A.M. Despain, "Fourier Transform Computers Using CORDIC Iterations,"
IEEE Trans. Comput., Vol. C-23, Oct. 1974, pp. 993-1001.
Hardware Implementation," IEEE Trans. Comput., Vol. C-28, No.
ley, 1980
365.
109-116.
CHAPTER SEVEN
CONCLUSIONS
This dissertation was devoted to the study of signal processing algo
mulate function.
The motivation for vector rotation primitives arose from the realization
tions per stage, i.e. five primitive operations. However, the complexity in
taneously enlarging the region of convergence of the algorithm and compen
design, reducing its cycle time requirement by nearly 50%. Chip size was
point adder).
squared.
given.
the CORDIC iterations. By working with a small but powerful instruction set,
problems of interest.