
ARCHITECTURE FOR CORDIC ALGORITHM REALIZATION WITHOUT ROM LOOKUP TABLES

Chuen-Yau Chen and Wen-Chih Liu

Department of Electronic Engineering, I-Shou University


Kaohsiung, 84008, Taiwan, Republic of China
Email: cychen@ieee.org

ABSTRACT

This paper proposes an architecture for implementing the CORDIC algorithm. By decomposing an arbitrary rotation angle into a sequence of coarse angles (each of whose tangent is a power of 2) and a fine angle that is small enough, the architecture can be implemented as a sequence of shift-and-add operations in the radix-2 system, without any ROM lookup table or real multiplication. It is well suited to a pipelined design for high-speed operation.

1. INTRODUCTION

Sine and cosine are trigonometric functions that play important roles in many fields of application, especially in control systems and communication systems. For example, a sinusoid may be used as a carrier to modulate a signal for transmission on a specific channel.

Many schemes have been proposed for generating or computing the sine and cosine functions [1]. Most of these schemes are built on the Coordinate Rotation Digital Computer (CORDIC) algorithm [3], [4]. The CORDIC algorithm computes the values of sine and cosine by rotating an initial point around the origin of the X-Y plane through a given angle. The X-coordinate and Y-coordinate of the resulting point are then the cosine and sine of the given angle, respectively. Schemes based on the CORDIC algorithm can be classified into two categories. The first is called the mixed-hybrid CORDIC method, and the second is called the partitioned-hybrid CORDIC method [6]. In the mixed-hybrid CORDIC methods, real multiplications are required to accomplish the ArcTangent Radix (ATR) part subrotations, which degrades the operating speed of the whole stage. In addition, from the hardware viewpoint, the multipliers induce a large hardware overhead. In the partitioned-hybrid CORDIC methods, the multiplications of the ATR part are replaced by a ROM lookup table that contains the pre-computed values of all the possible summations of the ATR terms mentioned in the mixed-hybrid CORDIC method. Although the quarter-wave symmetry properties of the sine/cosine functions have been adopted to compress the size of the ROM [2], the size of the ROM lookup table grows exponentially with the precision of the angle.

The work in this paper is motivated by the development of the partitioned-hybrid CORDIC methods and is concerned with reducing the size of the ROM lookup table, or even removing it completely. In the proposed architecture, all the multiplications become radix-2 multiplications, which can be easily realized by shift-and-add operations in the radix-2 system.

2. SINE/COSINE COMPUTATIONS

In the X-Y plane, a point $(X, Y) = (r\cos\theta, r\sin\theta)$ can be viewed as the result of rotating the point $(X_0, Y_0) = (r\cos 0, r\sin 0) = (r, 0)$ counterclockwise through an angle $\theta$ along the circle of radius $r$ centered at the origin. This operation can be expressed as

$$\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} X_0 \\ Y_0 \end{bmatrix} \qquad (1)$$

In (1), if the arbitrary angle $\theta$ is expressed as a linear combination of angles $\theta_k$ for which $\tan\theta_k$ is known a priori, the resulting point $(X, Y) = (r\cos\theta, r\sin\theta)$ can be obtained by decomposing the rotation of $\theta$ into a sequence of subrotations of $\theta_k$, and $(\cos\theta, \sin\theta)$ is obtained in turn. That is,

$$\theta = \sum_{k=0}^{N} \sigma_k \theta_k \qquad (2)$$

and

$$\begin{bmatrix} X \\ Y \end{bmatrix} = K \prod_{k=0}^{N} \begin{bmatrix} 1 & -\tan\sigma_k\theta_k \\ \tan\sigma_k\theta_k & 1 \end{bmatrix} \begin{bmatrix} X_0 \\ Y_0 \end{bmatrix} \qquad (3)$$

where $K = \cos\sigma_0\theta_0 \cdots \cos\sigma_N\theta_N$ is a constant, $N$ is the number of subrotations, and $\sigma_k \in \{-1, 0, 1\}$ determines the direction of each subrotation [3].

2.1. The CORDIC Algorithm

From the digital computation viewpoint, the CORDIC algorithm exploits the fact that a rotation can be decomposed into a sequence of subrotations: it reaches the target angle $\theta$ by applying successive subrotations [3], [4]. In the CORDIC algorithm, a positive angle $\theta$ is represented as an N-bit binary number

$$\theta = \sum_{k=1}^{N} b_k \theta_k \qquad (4)$$

where $b_k$ is the bit with weight $2^{-k}$ in the radix-2 numbering system and $\theta_k = 2^{-k}$ is the positional power-of-two weight. By applying the angle recoding process, $\theta$ can be rewritten as [5]

$$\theta = \phi_0 + \sum_{k=2}^{N+1} r_k 2^{-k} \qquad (5)$$

where $\phi_0$ is constant and $r_k \in \{-1, 1\}$.

2.2. The Coarse-Fine Rotation Method

In this method, an arbitrary angle is first mapped from the full range $[0, 2\pi]$ to $\theta \in [0, \pi/4]$ according to the quarter-wave symmetry property of the sine/cosine functions. Then, the angle $\theta$ is partitioned into two terms expressed as [10]

$$\theta = \theta_M + \theta_L \qquad (6)$$

where $\theta_M$ is the coarse subangle that can be expressed as

$$\theta_M = \sum_{k=1}^{N/3} b_k \theta_k \qquad (7)$$

and $\theta_L$ is the fine subangle that can be expressed as

$$\theta_L = \sum_{k=N/3+1}^{N} b_k 2^{-k}. \qquad (8)$$

Accordingly, the rotation in (1) can be partitioned into two cascaded stages. The first stage is

$$X_M = X_0 - Y_0 \tan\sigma_M\theta_M, \qquad Y_M = Y_0 + X_0 \tan\sigma_M\theta_M, \qquad (9)$$

and the second stage is

$$X = X_M - Y_M \tan\sigma_L\theta_L, \qquad Y = Y_M + X_M \tan\sigma_L\theta_L. \qquad (10)$$

Clearly, the second-stage computation can be easily realized by a sequence of shift-and-add operations in the radix-2 numbering system, because $\theta_L$ is small enough that $\tan\theta_L$ can be approximated by $\theta_L$, which is itself a sum of powers of 2 [6]. However, the first-stage computations involve multiplying by $\tan\theta_M$ on the datapath, which becomes the bottleneck of this method. These multiplication operations are usually replaced by a ROM lookup table.

2.3. The Proposed Modified Coarse-Fine Rotation Method

In order to reduce the size of the ROM lookup table in the coarse-fine rotation method, we treat $\phi_0$ in (5) as the initial angle and decompose the coarse angle $\theta_M$ as follows:

$$\theta_M = \sum_{k=2}^{N/3} \theta_{Mk} = \sum_{k=2}^{N/3} (\theta_{Hk} + \theta_{Lk}) \qquad (11)$$

where $\theta_{Mk}$ is defined as

$$\theta_{Mk} = 2^{-k}, \qquad (12)$$

$\theta_{Hk}$ is the angle that satisfies

$$\tan\theta_{Hk} = 2^{-k} \qquad (13)$$

and

$$\theta_{Lk} = \theta_{Mk} - \theta_{Hk} = \theta_{Mk} - \arctan 2^{-k} \qquad (14)$$

is treated as the error correction term. Substituting (11) into (6), the angle $\theta$ can be rewritten as

$$\theta = \sum_{k=2}^{N/3} \theta_{Hk} + \sum_{k=2}^{N/3} \theta_{Lk} + \theta_L = \sum_{k=2}^{N/3} \theta_{Hk} + \theta_{\Sigma L} \qquad (15)$$

where, for convenience, $\theta_{\Sigma L}$ is defined as

$$\theta_{\Sigma L} = \sum_{k=2}^{N/3} \theta_{Lk} + \theta_L. \qquad (16)$$
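The decomposition in (11)-(14) can be checked numerically with a short Python sketch. It is an illustration under several assumptions that the text does not spell out: each coarse term is applied with the recoded sign $r_k$ from (5); the constant is taken as $\phi_0 = 1/2 - 2^{-(N+1)}$, which follows from substituting $b_k = (1 + r_{k+1})/2$ into (4); the coarse stages run over $k = 2, \ldots, \lfloor N/3 \rfloor$; and floating-point multiplications by $2^{-k}$ stand in for hardware shifts. The function name `rotate_romless` is hypothetical.

```python
import math

def rotate_romless(theta, n_bits=16):
    """Approximate (cos(theta), sin(theta)) for 0 <= theta <= pi/4.

    Sketch only: floating-point products with 2**-k model the shifts.
    """
    m = n_bits // 3  # assumed last coarse stage index, floor(N/3)

    # Quantize theta to N fractional bits b_1 .. b_N as in (4).
    q = int(round(theta * (1 << n_bits)))
    bits = [(q >> (n_bits - k)) & 1 for k in range(1, n_bits + 1)]

    # Angle recoding as in (5): r_k = 2*b_{k-1} - 1 for k = 2 .. N+1.
    r = {k: 2 * bits[k - 2] - 1 for k in range(2, n_bits + 2)}
    phi0 = 0.5 - 2.0 ** -(n_bits + 1)  # derived constant (assumption)

    # phi0 is treated as the initial angle.
    x, y = math.cos(phi0), math.sin(phi0)

    residual = 0.0  # accumulates the error terms and the fine angle
    gain = 1.0      # product of 1/cos(theta_Hk); constant since r_k = +/-1
    for k in range(2, m + 1):
        t = r[k] * 2.0 ** -k                # tan(r_k * theta_Hk), see (13)
        x, y = x - y * t, y + x * t         # shift-and-add subrotation
        residual += r[k] * (2.0 ** -k - math.atan(2.0 ** -k))  # theta_Lk, (14)
        gain *= math.sqrt(1.0 + 4.0 ** -k)

    # Remaining recoded bits form the fine angle.
    for k in range(m + 1, n_bits + 2):
        residual += r[k] * 2.0 ** -k

    # Final stage: the residual is small, so tan(residual) ~ residual.
    x, y = x - y * residual, y + x * residual
    return x / gain, y / gain, residual

if __name__ == "__main__":
    for angle in (0.0, 0.1, 0.3, 0.5, math.pi / 4):
        c, s, res = rotate_romless(angle)
        print(f"{angle:.4f}  cos err {abs(c - math.cos(angle)):.2e}  "
              f"sin err {abs(s - math.sin(angle)):.2e}  residual {res:+.5f}")
```

In this sketch the final stage leaves its own small gain, $\sqrt{1 + \theta_{\Sigma L}^2}$, uncompensated, so the cosine and sine errors are dominated by the size of the residual angle and not by the coarse stages.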

Table 1

Number of   Number of      Address of       Address of       Number of
bits N      address bits   the first word   the final word   words
16          4              0000             1100             13
17          4              0000             1100             13
18          5              00000            11001            26
19          5              00000            11001            26
20          5              00000            11001            26
21          6              000000           110010           51
22          6              000000           110010           51
23          6              000000           110010           51
24          7              0000000          1100100          101
25          7              0000000          1100100          101
26          7              0000000          1100100          101
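The entries of the table above are consistent with a simple counting model, given here as an assumption and not as the authors' stated derivation: the ROM is addressed by $\lfloor N/3 \rfloor - 1$ bits, and only addresses whose angle does not exceed $\pi/4$ are populated. The Python sketch below reproduces the rows under that assumption; `rom_layout` is a hypothetical helper name.

```python
import math

def rom_layout(n_bits):
    """Return (address bits, first address, final address, word count).

    Assumed model: floor(N/3) - 1 address bits, with one word for every
    address a such that a / 2**addr_bits <= pi/4.
    """
    addr_bits = n_bits // 3 - 1
    words = math.floor((math.pi / 4) * 2 ** addr_bits) + 1
    first = format(0, f"0{addr_bits}b")
    last = format(words - 1, f"0{addr_bits}b")
    return addr_bits, first, last, words

if __name__ == "__main__":
    for n in range(16, 27):
        print(n, *rom_layout(n))
```

Under this model the word count roughly doubles each time $N$ grows by three bits, which is the growth that motivates removing the ROM.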

With regard to $\theta_{Hk}$, its tangent is inherently a power of 2. As to $\theta_{\Sigma L}$, it may be either small enough or large enough to produce a carry into the $N/3$-th stage. If it is small enough, it can of course be realized by a sequence of shift-and-add operations; otherwise, the rotation of $\theta_{H(N/3)}$ should be applied again to realize the carry, and the remainder of $\theta_{\Sigma L}$ can be realized by a sequence of shift-and-add operations. As a whole, the rotation of $\theta_M$ in (6) can be realized by a sequence of shift-and-add operations in the radix-2 system. Therefore, neither the ROM lookup table nor the real multiplication operation is required.

3. ARCHITECTURE

3.1. The ROM-Based Architecture for the Coarse-Fine Rotation Method

In the coarse-fine rotation method, the bottleneck is that multipliers are required to generate the product terms $X_0 \tan\theta_M$ and $Y_0 \tan\theta_M$, where the value of $\tan\theta_M$ itself needs to be computed. Figure 1 shows a ROM-based architecture [5]. In this architecture, $\phi_0$ in (5) is treated as the initial angle, and the information for the rotation of the coarse angle $\theta_M$ is built into a 13-word ROM; each word contains one value of $\tan\theta_M$ for the 13 possible values of $\theta_M$. As shown in Table 1, when the number of bits is increased to improve the precision, the size of the ROM grows exponentially, which becomes a large overhead from the hardware implementation viewpoint. Figure 2 is a graphic representation of this trend.

[Diagram: a lookup table (13-word ROM with columns Address, X, Y) for $\theta_M$, followed by shift-and-add stages for $\theta_L$.]
Figure 1: The ROM-based architecture for the coarse-fine rotation method [5].

[Plot: number of X, Y values stored in ROM versus number of bits, 16 to 24.]
Figure 2: The size of ROM lookup table increases with the precision in the ROM-based architecture.

3.2. The ROM-less Architecture for the Modified Coarse-Fine Rotation Method

In the modified coarse-fine rotation method, since we decomposed $\theta$ into the sum of $\sum_{k=2}^{N/3} \theta_{Hk}$ and $\theta_{\Sigma L}$, where $\tan\theta_{Hk}$ is a power of 2 and $\theta_{\Sigma L}$ is small enough that $\tan\theta_{\Sigma L}$ can be approximated by $\theta_{\Sigma L}$, we can realize the whole rotation by an all radix-2 shift-and-add datapath architecture as shown in Fig. 3. In this figure, the multiplication by $\tan\theta_{Mk}$ that was implemented by a 13-word ROM lookup table as shown in Fig. 1 is replaced by $N/3$ stages of shift-and-add operations. From the hardware realization viewpoint, since the whole datapath operates with shift-and-add operations without any real multiplication or ROM lookup table, it is well suited to realizing the whole system in a pipelined architecture. Therefore, the operating speed will be faster than that of the ROM-based one, and the hardware overhead will also be smaller.

Figure 3: The proposed ROM-less all radix-2 stages architecture for the modified coarse-fine rotation method.

4. CONCLUSION

We have proposed an architecture that implements the CORDIC algorithm in a more efficient manner. By decomposing the rotation angle into a sequence of subangles that are either small enough or have a power-of-2 tangent, the whole rotation can be accomplished by a sequence of shift-and-add operations in the radix-2 system instead of constructing a ROM lookup table or performing real multiplication operations on the datapath. The hardware realization can be arranged in a pipelined architecture to speed up the operation. The precision can be increased to a satisfactory level without a large hardware overhead. In the future, by suitably configuring this design with other functional cores, such as a D/A converter and a low-pass filter, we can implement a direct digital frequency synthesizer.

5. REFERENCES

[1] J. Vankka, "Methods of mapping from phase to sine amplitude in direct digital synthesis", IEEE Trans. Ultrasonics, Ferroelectrics, and Frequency Control, vol. 44, no. 2, pp. 526-534, Mar. 1997.

[2] H. T. Nicholas, H. Samueli, and B. Kim, "The optimization of direct digital frequency synthesizer performance in the presence of finite word length effects", in Proc. 42nd Annu. Freq. Contr. Symp., 1988, pp. 357-363.

[3] J. Volder, "The CORDIC trigonometric computing technique", IEEE Trans. Comput., vol. 8, pp. 330-334, 1959.

[4] J. Walther, "A unified algorithm for elementary functions", in Proc. Spring Joint Computer Conf., 1971, pp. 379-385.

[5] A. Madisetti, A. Y. Kwentus, and A. N. Willson, Jr., "A 100-MHz, 16-b, direct digital frequency synthesizer with a 100-dBc spurious-free dynamic range", IEEE J. Solid-State Circuits, vol. 34, no. 8, pp. 1034-1043, Aug. 1999.

[6] S. Wang, V. Piuri, and E. E. Swartzlander, Jr., "Hybrid CORDIC algorithms", IEEE Trans. Comput., vol. 46, no. 11, pp. 1202-1207, Nov. 1997.

[7] J. A. Lee and T. Lang, "Constant-factor redundant CORDIC for angle calculation and rotation", IEEE Trans. Comput., vol. 41, pp. 1016-1025, Aug. 1992.

[8] M. D. Ercegovac and T. Lang, "Redundant and on-line CORDIC: application to matrix triangularization and SVD", IEEE Trans. Comput., vol. 39, pp. 725-740, June 1990.

[9] N. Takagi, T. Asada, and S. Yajima, "Redundant CORDIC methods with a constant scale factor for sine and cosine computation", IEEE Trans. Comput., vol. 40, pp. 989-995, Sept. 1991.

[10] D. Fu and A. N. Willson, Jr., "A high-speed processor for digital sine/cosine generation and angle rotation", in Proc. IEEE 32 Asilomar Conf. Signals, System, and Computers, Nov. 1988, pp. 177-181.
