algorithm [3], [4]. The CORDIC algorithm computes the values of sine and cosine by rotating an initial point around the origin of the X-Y plane through a given angle. The X-coordinate and Y-coordinate of the resulting point are then the cosine and sine of the given angle, respectively. Schemes based on the CORDIC algorithm can be classified into two categories: the mixed-hybrid CORDIC method and the partitioned-hybrid CORDIC method [6]. In the mixed-hybrid CORDIC methods, real multiplications are required to accomplish the ArcTangent Radix (ATR) part subrotations, which degrades the operating speed of the whole stage. In addition, from the hardware viewpoint, the multipliers induce a large hardware overhead. In the partitioned-hybrid CORDIC methods, the multiplications of the ATR part

\[
\begin{bmatrix} X \\ Y \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} X_0 \\ Y_0 \end{bmatrix}
\qquad (1)
\]

In (1), if the arbitrary angle $\theta$ is expressed as a linear combination of angles $\theta_k$ for which $\tan\theta_k$ is known a priori, the resulting point $(X, Y) = (r\cos\theta, r\sin\theta)$ can be obtained by decomposing the rotation by $\theta$ into a sequence of subrotations by $\theta_k$, and $(\cos\theta, \sin\theta)$ is obtained in turn. That is,
and

\[
\begin{bmatrix} X \\ Y \end{bmatrix} =
K \prod_{k=0}^{N}
\begin{bmatrix} 1 & -\tan\sigma_k\theta_k \\ \tan\sigma_k\theta_k & 1 \end{bmatrix}
\begin{bmatrix} X_0 \\ Y_0 \end{bmatrix}
\qquad (3)
\]

where $K = \cos\sigma_0\theta_0 \cdots \cos\sigma_N\theta_N$ is a constant, $N$ is the number of subrotations, and $\sigma_k \in \{-1, 0, 1\}$ determines the direction of each subrotation [3].

2.1. The CORDIC Algorithm

From the digital computation viewpoint, the CORDIC algorithm exploits the decomposition of a rotation into a sequence of subrotations, reaching the target angle $\theta$ by applying successive subrotations [3], [4]. In the CORDIC algorithm, a positive angle $\theta$ is represented as an N-bit binary number as

Accordingly, the rotation in (1) can be partitioned into two cascaded stages. The first stage is

\[
X_M = X_0 - Y_0 \tan\sigma_M\theta_M, \qquad
Y_M = Y_0 + X_0 \tan\sigma_M\theta_M, \qquad (9)
\]

and the second stage is

\[
X = X_M - Y_M \tan\sigma_L\theta_L, \qquad
Y = Y_M + X_M \tan\sigma_L\theta_L.
\]

Clearly, the second-stage computation can easily be realized by a sequence of shift-and-add operations in the radix-2 number system, because $\theta_L$ is small enough that the value of $\tan\theta_L$ can be approximated by $\theta_L$, which is a power of 2 [6]. However, the first-stage computations involve multiplying by $\tan\theta_M$ on the datapath, which becomes the bottleneck of this method. These multiplications are usually replaced by a ROM lookup table.
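The two cascaded stages above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions, not the architecture of [5] or [6]: the function name `coarse_fine_rotate`, the parameters `n_bits` and `coarse_bits`, the initial point (1, 0), positive rotation directions ($\sigma_M = \sigma_L = +1$), and the final rescaling by $\cos\theta_M$ alone are all choices introduced here. The first stage uses the real value of $\tan\theta_M$ (the term that would come from a multiplier or a ROM), and the second stage applies one shift-and-add subrotation per set bit of the fine subangle, using $\tan 2^{-k} \approx 2^{-k}$.

```python
import math


def coarse_fine_rotate(theta, n_bits=16, coarse_bits=4):
    """Approximate (cos(theta), sin(theta)) for 0 <= theta <= pi/4.

    Hypothetical helper: n_bits and coarse_bits are illustrative
    parameters, not values prescribed by the paper.
    """
    # Quantize theta to n_bits fractional bits: theta ~ sum_k b_k * 2**-k.
    q = int(round(theta * (1 << n_bits)))

    # Coarse subangle theta_M: the top coarse_bits fractional bits.
    theta_m = (q >> (n_bits - coarse_bits)) / (1 << coarse_bits)

    # First stage: needs the real value tan(theta_M)
    # (a multiplier or a ROM lookup in hardware).
    t_m = math.tan(theta_m)
    x0, y0 = 1.0, 0.0  # assumed initial point on the X axis
    x, y = x0 - y0 * t_m, y0 + x0 * t_m

    # Second stage: one subrotation per set bit of the fine subangle,
    # with tan(2**-k) approximated by 2**-k (a shift in hardware).
    for k in range(coarse_bits + 1, n_bits + 1):
        if (q >> (n_bits - k)) & 1:
            shift = 2.0 ** -k
            x, y = x - y * shift, y + x * shift

    # Assumption: only the coarse scale factor cos(theta_M) is corrected;
    # the scale factor of the fine subrotations is taken as 1.
    scale = math.cos(theta_m)
    return scale * x, scale * y
```

With the default parameters, the result stays within about $10^{-3}$ of $(\cos\theta, \sin\theta)$ for angles in $[0, \pi/4]$; the residual error comes mainly from neglecting the scale factor of the fine subrotations, and it shrinks as `coarse_bits` grows.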
\[
= \sum_{k=1} b_k a_k \qquad (7)
\]

and $\theta_L$ is the fine subangle that can be expressed as

\[
\theta_M = \sum_{k=2} \theta_{Mk} = \sum_{k=2} \left(\theta_{Hk} \pm \theta_{Lk}\right) \qquad (11)
\]

where $\theta_{Hk}$ is defined as
Number of   Number of      Address of       Address of       Number of
bits N      address bits   the first word   the final word   words
16          4              0000             1100             13
17          4              0000             1100             13
18          5              00000            11001            26
19          5              00000            11001            26
20          5              00000            11001            26
21          6              000000           110010           51
22          6              000000           110010           51
23          6              000000           110010           51
24          7              0000000          1100100          101
25          7              0000000          1100100          101
26          7              0000000          1100100          101
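The rows of the table above follow a simple pattern that can be checked numerically. The sketch below is an observation fitted to the tabulated values, not a rule stated in the text: it assumes that the number of address bits is $\lfloor N/3 \rfloor - 1$ and that the addresses index coarse angles from 0 up to $\pi/4$ in steps of $2^{-a}$, where $a$ is the number of address bits. The function name `rom_layout` is hypothetical.

```python
import math


def rom_layout(n_bits):
    """Return (address bits, first address, final address, word count).

    Assumed rule, inferred from the tabulated rows only:
    address bits = n_bits // 3 - 1, addresses span [0, pi/4].
    """
    addr_bits = n_bits // 3 - 1
    # Largest address whose angle (address * 2**-addr_bits) is <= pi/4.
    last = math.floor(math.pi / 4 * (1 << addr_bits))
    first_word = format(0, f"0{addr_bits}b")
    final_word = format(last, f"0{addr_bits}b")
    return addr_bits, first_word, final_word, last + 1


for n in range(16, 27):
    print(n, *rom_layout(n))
```

Under these assumptions the word count roughly doubles for every three additional bits of precision, which is the growth that motivates removing the ROM.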
3. ARCHITECTURE

that the requirement of multipliers to generate the product terms $X_0\tan\theta_M$ and $Y_0\tan\theta_M$, where the value of $\tan\theta_M$ itself needs to be computed. Figure 1 shows a ROM-based architecture [5]. In this architecture, $\phi_0$ in (5) is treated as the initial angle, and the information for the rotation by the coarse angle $\theta_M$ is built into a 13-word ROM, in which each word contains the value of $\tan\theta_M$ for one of the 13 possible values of $\theta_M$. As shown in Table 1, improving the precision requires more bits, and the size of the ROM then grows exponentially, which becomes a large overhead from the hardware implementation viewpoint. Figure 2 is a graphic representation of this trend.

Figure 1: The ROM-based architecture for the coarse-fine rotation method [5].

3.2. The ROM-less Architecture for the Modified Coarse-Fine Rotation Method

In the modified coarse-fine rotation method, since we decomposed $\theta$ into the sum of $\sum_k \theta_{Hk}$ and $\theta_{cL}$, where $\tan\theta_{Hk}$ is a power of 2 and $\theta_{cL}$ is small enough that $\tan\theta_{cL}$ can be approximated by $\theta_{cL}$, we can realize the whole rotation with an all-radix-2 shift-and-add datapath architecture, as shown in Fig. 3. In this figure
[Figure 2: number of X, Y values stored in ROM (vertical axis) versus number of bits (horizontal axis).]

table or performing the real multiplication operations on the datapath. The hardware realization can be arranged in a pipelined architecture to speed up the operation. The precision can be increased to a satisfactory level without a large hardware overhead. In the future, by suitably configuring this design with other functional cores, such as a D/A converter and a low-pass filter, we can implement a direct digital frequency synthesizer.

5. REFERENCES