
Multiplier design based on ancient Indian Vedic Mathematics

Honey Durga Tiwari, Ganzorig Gankhuyag, Chan Mo Kim, Yong Beom Cho
Dept. of Electronics Engineering Konkuk University Seoul, South Korea honeyndt@konkuk.ac.kr

Abstract: Vedic mathematics is the name given to the ancient Indian system of mathematics that was rediscovered in the early twentieth century from ancient Indian scriptures (Vedas). It mainly deals with Vedic mathematical formulae and their application to various branches of mathematics. Algorithms based on conventional mathematics can be simplified and even optimized by the use of Vedic Sutras. These methods and ideas can be directly applied to trigonometry, plane and spherical geometry, conics, calculus (both differential and integral), and applied mathematics of various kinds. In this paper, a new multiplier and square architecture is proposed, based on an algorithm of ancient Indian Vedic Mathematics, for low-power and high-speed applications. It is based on generating all partial products and their sums in one step. The design implementation on an ALTERA Cyclone II FPGA shows that the proposed Vedic multiplier and square are faster than the array multiplier and the Booth multiplier.

Keywords: Vedic Mathematics; Multiplier; Array Multiplier; Square Architecture.

I. INTRODUCTION

Digital multipliers [1], [2] are the core components of all digital signal processors (DSPs), and the speed of a DSP is largely determined by the speed of its multipliers [3]. They are indispensable in the implementation of computation systems realizing many important functions such as fast Fourier transforms (FFTs) and multiply-accumulate (MAC) operations. Multiplication can be implemented using several algorithms, such as array, Booth, carry save, modified Booth, and Wallace tree. A number of interesting parallel and serial-parallel multiplier architectures have been proposed based on the aforesaid algorithms, which improve the cost-throughput efficiency. In an array multiplier, the multiplication of two binary numbers can be obtained with one micro-operation by using a combinational circuit that forms the product bits all at once. This makes it a fast way of multiplying two numbers, since the only delay is the time for the signals to propagate through the gates that form the multiplication array. However, an array multiplier requires a large number of gates, and for this reason it is less economical [2].

The other aspect of improving multiplier efficiency is the arrangement of adders. As far as the arrangement of adders is concerned, there are two methods: a carry save array (CSA) method and a Wallace tree method [3]. In the CSA method, bits are processed one by one to supply a carry signal to an adder located at a one-bit-higher position. This closely resembles manual calculation; the layout corresponds to the logic and is regular, and hence the layout design is easy. The CSA method has its own limitation: since the execution time depends upon the number of bits of the multiplier, there is some difficulty in achieving high-speed operation [3]. In the Wallace tree method, three bit signals are passed to a one-bit full adder (3W), which is called a three-input Wallace tree circuit; the sum output signal is supplied to the next-stage full adder of the same bit position, and the carry output signal is supplied to the next-stage full adder located at a one-bit-higher position. In the Wallace tree method, the speed of operation is high, but the circuit layout is not easy because the circuit is quite irregular.

Another improvement in the multiplier is achieved by reducing the number of partial products generated. The Booth recoding multiplier is one such multiplier; it scans three bits at a time to reduce the number of partial products [4]. These three bits are the two bits from the present pair and a third bit from the high-order bit of the adjacent lower-order pair. After each triplet of bits is examined, it is converted by Booth logic into a set of five control signals used by the adder cells in the array to control the operations they perform. By examining three bits at a time, Booth recoding reduces the number of adders and hence the delay required to produce the partial sums. The high performance of the Booth multiplier comes with the drawback of power consumption. The reason for this is the large number of adder cells (15 cells for 8 rows, 120 core cells) that consume power [4, 5, 6, 7].

The conclusion is that the current methodology of multiplication leads to higher power consumption and reduced efficiency. This paper proposes a multiplier and square architecture that addresses the aforesaid problems by adopting the Sutra of Vedic Mathematics called Urdhva Tiryakbhyam (Vertically and Crosswise) [8, 9, 10]. It can be shown that the design is highly efficient in terms of silicon area and speed. Hence, it is a preferred choice for DSP algorithms such as FFT and DCT, which are used in image processing standards such as the MPEG codec.

Our thanks go to IDEC, IITA/ETRI SoC Industry Promotion Center and Seoul R&BD Program for providing funds for the research work. We also thank Korea Ministry of Knowledge Economy for supporting this project under "System IC 2010" project.

978-1-4244-2599-0/08/$25.00 2008 IEEE


2008 International SoC Design Conference

II. VEDIC FORMULAE

A. Vedic Sutras

The word "Vedic" is derived from the word "veda", which means the store-house of all knowledge. Vedic mathematics is mainly based on 16 Sutras (or aphorisms) dealing with various branches of mathematics such as arithmetic, algebra, and geometry [8]. These Sutras, along with their brief meanings, are listed below alphabetically.

1. (Anurupye) Shunyamanyat: If one is in ratio, the other is zero
2. Chalana-Kalanabyham: Differences and similarities
3. Ekadhikina Purvena: By one more than the previous one
4. Ekanyunena Purvena: By one less than the previous one
5. Gunakasamuchyah: The factors of the sum is equal to the sum of the factors
6. Gunitasamuchyah: The product of the sum is equal to the sum of the product
7. Nikhilam Navatashcaramam Dashatah: All from 9 and the last from 10
8. Paraavartya Yojayet: Transpose and adjust
9. Puranapuranabyham: By the completion or noncompletion
10. Sankalana-vyavakalanabhyam: By addition and by subtraction
11. Shesanyankena Charamena: The remainders by the last digit
12. Shunyam Saamyasamuccaye: When the sum is the same, that sum is zero
13. Sopaantyadvayamantyam: The ultimate and twice the penultimate
14. Urdhva-tiryakbyham: Vertically and crosswise
15. Vyashtisamanstih: Part and whole
16. Yaavadunam: Whatever the extent of its deficiency

These formulae span a diverse field of study. The proposed design uses only the Urdhva-tiryakbyham method; hence a detailed description of the other formulae is beyond the scope of this paper.

B. Urdhva Tiryakbhyam Sutra

Urdhva tiryakbhyam Sutra is a general multiplication formula applicable to all cases of multiplication. It literally means "Vertically and Crosswise". To illustrate this multiplication scheme, let us consider the multiplication of two decimal numbers (5498 × 2314). The conventional methods already known to us require 16 multiplications and 15 additions. An alternative method of multiplication using Urdhva tiryakbhyam Sutra is shown in Fig. 1. The numbers to be multiplied are written on two consecutive sides of the square, as shown in the figure. The square is divided into rows and columns, where each row or column corresponds to one digit of either the multiplier or the multiplicand. Thus, each digit of the multiplier has a small box in common with each digit of the multiplicand. These small boxes are partitioned into two halves by the crosswise lines. Each digit of the multiplier is then independently multiplied with every digit of the multiplicand, and the two-digit product is written in the common box. All the digits lying on a crosswise dotted line are added to the previous carry. The least significant digit of the obtained number acts as the result digit and the rest as the carry for the next step. The carry for the first step (i.e., the dotted line on the extreme right side) is taken to be zero.

Figure 1. Alternative way of multiplication by Urdhva tiryakbhyam Sutra.
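The decimal scheme described for Fig. 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `urdhva_decimal` is hypothetical, and for brevity it sums whole digit products per place value instead of splitting each two-digit product across the two halves of a box. The arithmetic result is the same.

```python
def urdhva_decimal(x: int, y: int) -> int:
    """Multiply two non-negative integers digit by digit, place value by place value.

    Illustrative sketch (hypothetical helper, not from the paper).
    """
    a = [int(d) for d in str(x)][::-1]  # digits of x, least significant first
    b = [int(d) for d in str(y)][::-1]  # digits of y, least significant first

    digits = []   # result digits, least significant first
    carry = 0     # carry for the first step is zero
    for k in range(len(a) + len(b) - 1):
        # Add every digit product a[i] * b[j] with i + j == k to the previous carry.
        total = carry
        for i in range(len(a)):
            j = k - i
            if 0 <= j < len(b):
                total += a[i] * b[j]
        digits.append(total % 10)  # least significant digit is the result digit
        carry = total // 10        # the rest is the carry for the next step

    # Assemble the final number: remaining carry followed by the result digits.
    result = carry
    for d in reversed(digits):
        result = result * 10 + d
    return result


if __name__ == "__main__":
    print(urdhva_decimal(5498, 2314))
```

For the example in the text, `urdhva_decimal(5498, 2314)` returns the same value as `5498 * 2314`.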

C. Urdhva Tiryakbhyam Sutra for binary number system

In this section we extend this Sutra to the binary number system. To illustrate the multiplication algorithm, let us consider the multiplication of two binary numbers $a_3a_2a_1a_0$ and $b_3b_2b_1b_0$. As the result of this multiplication would be more than 4 bits, we express it as $\ldots r_3r_2r_1r_0$. The line diagram for multiplication of two 4-bit numbers is shown in Fig. 2, which is nothing but the mapping of Fig. 1 to the binary system. For the sake of simplicity, each bit is represented by a circle. The least significant bit $r_0$ is obtained by multiplying the least significant bits of the multiplicand and the multiplier. The process is followed according to the steps shown in Fig. 2. As in the previous case, the digits on both sides of a line are multiplied and added to the carry from the previous step. This generates one of the bits of the result ($r_n$) and a carry (say $c_n$). This carry is added in the next step, and the process goes on. If there is more than one line in a step, all the results are added to the previous carry. In each step, the least significant bit acts as the result bit and all the other bits act as the carry. For example, if in some intermediate step we get 110, then 0 will act as the result bit and 11 as the carry (referred to as $c_n$ in this text). It should be clearly noted that $c_n$ may be a multi-bit number. Writing $c_n r_n$ for the carry $c_n$ followed by the result bit $r_n$, we get the following expressions:

$$r_0 = a_0 b_0 \tag{1}$$
$$c_1 r_1 = a_1 b_0 + a_0 b_1 \tag{2}$$
$$c_2 r_2 = c_1 + a_2 b_0 + a_1 b_1 + a_0 b_2 \tag{3}$$
$$c_3 r_3 = c_2 + a_3 b_0 + a_2 b_1 + a_1 b_2 + a_0 b_3 \tag{4}$$
$$c_4 r_4 = c_3 + a_3 b_1 + a_2 b_2 + a_1 b_3 \tag{5}$$
$$c_5 r_5 = c_4 + a_3 b_2 + a_2 b_3 \tag{6}$$
$$c_6 r_6 = c_5 + a_3 b_3 \tag{7}$$

with $c_6 r_6 r_5 r_4 r_3 r_2 r_1 r_0$ being the final product.
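Expressions (1) to (7) can be checked with a short software model. The following Python sketch is an assumption-laden illustration, not the hardware design: the function name `urdhva_binary` and the parameter `n` (operand width, 4 in the text) are hypothetical. Each loop iteration corresponds to one step of the line diagram.

```python
def urdhva_binary(a: int, b: int, n: int = 4) -> int:
    """Software model of the vertically-and-crosswise scheme for n-bit operands.

    Illustrative sketch (hypothetical helper, not from the paper).
    """
    a_bits = [(a >> i) & 1 for i in range(n)]  # a_0 ... a_{n-1}
    b_bits = [(b >> i) & 1 for i in range(n)]  # b_0 ... b_{n-1}

    result = 0
    carry = 0  # c_n in the text; may be a multi-bit number
    for k in range(2 * n - 1):
        # Step k: sum all crosswise bit products a_i * b_j with i + j == k,
        # plus the carry from the previous step.
        total = carry
        for i in range(n):
            j = k - i
            if 0 <= j < n:
                total += a_bits[i] & b_bits[j]
        result |= (total & 1) << k  # least significant bit is the result bit r_k
        carry = total >> 1          # all remaining bits form the carry c_k

    # The final carry (c_6 for n = 4) becomes the most significant bit.
    result |= carry << (2 * n - 1)
    return result


if __name__ == "__main__":
    print(bin(urdhva_binary(0b1101, 0b1011)))
```

An exhaustive comparison of `urdhva_binary(a, b)` against `a * b` over all 4-bit operand pairs is a quick way to confirm that the model matches the expressions.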

Hence this is the general mathematical formula applicable to all cases of multiplication. The hardware design for this algorithm will be very similar to that of the famous array multiplier, where an array of adders is required to arrive at the final product. All the partial products are calculated in parallel, and the associated delay is mainly the time taken by the carry to propagate through the adders that form the multiplication array. Clearly, this is not an efficient algorithm for the multiplication of large numbers, as a lot of propagation delay is involved in such cases. To deal with this problem, we now discuss Nikhilam Sutra, which presents an efficient method of multiplying two large numbers.

Figure 2. Line diagram for multiplication of two 4-bit numbers.

D. Nikhilam Sutra

Nikhilam Sutra literally means "all from 9 and last from 10". Although it is applicable to all cases of multiplication, it is more efficient when the numbers involved are large. It finds the complement of the large number from its nearest base and performs the multiplication on the complement; hence the larger the original number, the lower the complexity of the multiplication. We first illustrate this Sutra by considering the multiplication of two decimal numbers (96 × 93), where the chosen base is 100, which is nearest to and greater than both numbers.

As shown in Fig. 3, we write the multiplier and the multiplicand in two rows, followed by the differences of each of them from the chosen base, i.e., their complements. We can now write two columns of numbers, one consisting of the numbers to be multiplied (Column 1) and the other consisting of their complements (Column 2). The product also consists of two parts, which are demarcated by a vertical line for the purpose of illustration. The right-hand side (RHS) of the product can be obtained by simply multiplying the numbers of Column 2 (7 × 4 = 28). The left-hand side (LHS) of the product can be found by cross-subtracting the second number of Column 2 from the first number of Column 1 or vice versa, i.e., 96 - 7 = 89 or 93 - 4 = 89. The final result is obtained by concatenating LHS and RHS (Answer = 8928).

After this illustration, we now discuss the operational principle of Nikhilam Sutra by taking the case of multiplication of two n-digit numbers $a$ and $e$ having complements $\bar{a} = 10^n - a$ and $\bar{e} = 10^n - e$, respectively. The required product $p$ is defined as:

$$p = ae \tag{8}$$

which can be reframed by adding and subtracting $10^{2n} + 10^n(a + e)$ on the right-hand side as:

$$p = ae + 10^{2n} - 10^{2n} + 10^n(a + e) - 10^n(a + e) \tag{9}$$

The above terms can be grouped as follows:

$$
\begin{aligned}
p &= \{10^n(a + e) - 10^{2n}\} + \{10^{2n} - 10^n(a + e) + ae\} \\
  &= 10^n\{(a + e) - 10^n\} + \{(10^n - a)(10^n - e)\} \\
  &= 10^n\{a - \bar{e}\} + \{\bar{a}\,\bar{e}\} \\
  &= 10^n\{e - \bar{a}\} + \{\bar{a}\,\bar{e}\}
\end{aligned} \tag{10}
$$

From (10), the expressions for LHS and RHS can be deduced:

$$\text{LHS} = \{a - \bar{e}\} = \{e - \bar{a}\} \tag{11}$$
$$\text{RHS} = \{\bar{a}\,\bar{e}\} \tag{12}$$

Hence the multiplication of two n-digit numbers is reduced to the multiplication of their complements. To take full advantage of this reduction, it should be ensured that the complements are smaller than the original numbers. This condition is satisfied if both the original numbers are greater than $10^n/2$, i.e., $a > 10^n/2$ and $e > 10^n/2$. This is the reason why Nikhilam Sutra is said to be more efficient for the multiplication of large numbers than of smaller ones.

An important point to note here is the number of digits required in the RHS of the product. From (10), it is clear that the RHS should have n digits irrespective of the number of digits in the product $\bar{a}\,\bar{e}$. We illustrate this point by considering a special case of the multiplication of two 2-digit numbers in which the RHS comes out to be a single digit (99 × 97). As shown in Fig. 4, the LHS of the product comes out to be 99 - 3 = 97 - 1 = 96 and the RHS comes out to be 3 × 1 = 3. As n = 2 in this case, we need to prepend a leading zero to the RHS, making it 03. The final result thus comes out to be 9603. On the other hand, if the RHS had three digits, the most significant digit would be carried over to the LHS.
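The Nikhilam procedure described above, including the handling of the RHS digit count, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function name `nikhilam_multiply` and its parameter names are hypothetical, the base is taken as 10 raised to the digit count `n`, and any RHS digits beyond `n` are carried into the LHS.

```python
def nikhilam_multiply(a: int, e: int, n: int) -> int:
    """Multiply two n-digit decimal numbers via their complements from 10**n.

    Illustrative sketch (hypothetical helper, not from the paper).
    """
    base = 10 ** n
    a_bar = base - a       # complement of a
    e_bar = base - e       # complement of e

    lhs = a - e_bar        # cross subtraction; equals e - a_bar
    rhs = a_bar * e_bar    # product of the complements

    # The RHS must occupy exactly n digits: excess digits are carried to the
    # LHS, and a short RHS is implicitly padded with leading zeros.
    lhs += rhs // base
    rhs %= base

    return lhs * base + rhs  # concatenate LHS and RHS


if __name__ == "__main__":
    print(nikhilam_multiply(96, 93, 2))  # 8928, as in the first example
    print(nikhilam_multiply(99, 97, 2))  # 9603, RHS padded to "03"
```

Because `lhs * base + rhs` is algebraically identical to the final line of the derivation, the sketch returns the exact product even when the complements are larger than the operands; the efficiency argument in the text applies only when both operands exceed half the base.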

Figure 3. Multiplication of 96 × 93 using Nikhilam Sutra.


Figure 4. Multiplication of 99 × 97 using Nikhilam Sutra.

III. IMPLEMENTATION AND RESULTS

The above-mentioned method was used to implement 2-, 3-, 4-, and 8-bit multipliers. Table I shows the number of calculations required for various bit lengths.

TABLE I. COMPUTATIONAL COMPLEXITY OF CONVENTIONAL ARRAY MULTIPLIER AND VEDIC MULTIPLIER

                    Number of calculations
Input bit length    Conventional        Vedic
                    M       A           M       A
2                   4       2           4       1
3                   9       7           9       5
4                   16      15          16      9
8                   64      77          64      53

M: Number of multiplications, A: Number of additions

The 8-bit multiplier was implemented on an ALTERA Cyclone II FPGA and utilized 231 combinational functions. The worst-case propagation delay in this case was found to be 27 ns, giving an operating frequency of about 37 MHz. To compare it with other implementations, the design was synthesized on XILINX:SPARTAN:S30VQ100:-4. Table II shows the synthesis results for the various implementations. The proposed Vedic multiplier is faster than the array multiplier and the Booth multiplier. In both the methods suggested earlier, the multiplications of individual bits are done in parallel and the addition follows the multiplication process. The use of a carry look-ahead adder would make the addition process and carry generation faster, thereby reducing the associated delay.

TABLE II. SYNTHESIS RESULTS FOR VARIOUS IMPLEMENTATIONS

Device 1: XILINX: SPARTAN: S30VQ100: -4

Multiplier    FMAP    HMAP    DELAY
Array         150     10      43
Booth         283     49      124
Vedic         201     26      12

IV. CONCLUSION

A new reduced-bit multiplication algorithm based on a formula of ancient Indian Vedic mathematics has been proposed. Both the Vedic multiplication formulae, Urdhva tiryakbhyam and Nikhilam, have been investigated in detail. Urdhva tiryakbhyam, being a general mathematical formula, is equally applicable to all cases of multiplication. A multiplier architecture based on this Sutra has been developed and is seen to be similar to the popular array multiplier, where an array of adders is required to arrive at the final product. Due to its structure, it suffers from a high carry propagation delay when multiplying large numbers. This problem has been solved by introducing Nikhilam Sutra, which reduces the multiplication of two large numbers to the multiplication of two small numbers. The framework of the proposed algorithm is taken from this Sutra and is further optimized by the use of some general arithmetic operations, such as expansion and bit shifting, to take full advantage of the bit reduction in multiplication. The computational efficiency of the algorithm has been illustrated by reducing a general 4 × 4-bit multiplication to a single 2 × 2-bit multiplication operation. The FPGA implementation results show that the delay and the area required by the proposed design are far less than those of the conventional Booth and array multiplier designs, making it efficient for use in various DSP applications.

REFERENCES

[1] K. Hwang, Computer Arithmetic: Principles, Architecture and Design. New York: John Wiley & Sons, 1979.
[2] M. M. Mano, Computer System Architecture. Englewood Cliffs, NJ: Prentice-Hall, 1982.
[3] Gensuke Goto, High Speed Digital Parallel Multiplier, United States Patent 5,465,226, November 7, 1995.
[4] A. D. Booth, A Signed Binary Multiplication Technique, Qrt. J. Mech. App. Math., vol. 4, no. 2, pp. 236-240, 1951.
[5] G. Goto, High Speed Digital Parallel Multiplier, U.S. Patent 5 465 226, Nov. 7, 1995.
[6] L. Ciminiera and A. Valenzano, Low Cost Serial Multiplier for High Speed Specialised Processors, IEE Proc., vol. 135, no. 5, pp. 259-265, Sept. 1988.
[7] Tam Anh Chu, Booth Multiplier with Low Power High Performance Input Circuitry, US Patent 6,393,454 B1, May 21, 2002.
[8] Jagadguru Swami Sri Bharath, Krsna Tirathji, Vedic Mathematics or Sixteen Simple Sutras From The Vedas, Motilal Banarsidas, Varanasi (India), 1986.
[9] A. P. Nicholas, K. R. Williams, J. Pickles, Application of Urdhava Sutra, Spiritual Study Group, Roorkee (India), 1984.
[10] A. P. Nicholas, K. R. Williams, J. Pickles, Lectures on Vedic Mathematics, Spiritual Study Group, Roorkee (India), 1982.
[11] B. K. Tirtha, Vedic Mathematics. Delhi: Motilal Banarsidass Publishers, 1965.
[12] H. Thapliyal and M. B. Srinivas, High Speed Efficient N × N Bit Parallel Hierarchical Overlay Multiplier Architecture Based on Ancient Indian Vedic Mathematics, Enformatika Trans., vol. 2, pp. 225-228, Dec. 2004.
[13] H. Thapliyal, M. B. Srinivas and H. R. Arabnia, Design and Analysis of a VLSI Based High Performance Low Power Parallel Square Architecture, in Proc. Int. Conf. Algo. Math. Comp. Sc., Las Vegas, June 2005, pp. 72-76.
