
# MITSUBISHI ELECTRIC ITE VI-Lab

Internal Reference: VIL04-D098
Publication Date: Dec. 2003
Rev.: A

Title: Double Precision Floating-Point Arithmetic on FPGAs

Author: S. Paschalakis, P. Lee

Reference: Paschalakis, S., Lee, P., “Double Precision Floating-Point Arithmetic on FPGAs”, In Proc. 2003 2nd IEEE International Conference on Field Programmable Technology (FPT ’03), Tokyo, Japan, Dec. 15-17, pp. 352-358, 2003

**Double Precision Floating-Point Arithmetic on FPGAs**

Stavros Paschalakis, Peter Lee

Abstract We present low cost FPGA floating-point arithmetic circuits for all the common operations, i.e. addition/subtraction, multiplication, division and square root. Such circuits can be extremely useful in the FPGA implementation of complex systems that benefit from the reprogrammability and parallelism of the FPGA device but also require a general purpose arithmetic unit. While previous work has considered circuits for low precision floating-point formats, we consider the implementation of 64-bit double precision circuits that also provide rounding and exception handling.

© 2004 Mitsubishi Electric ITE B.V. - Visual Information Laboratory. All rights reserved.

Stavros Paschalakis, Mitsubishi Electric ITE BV VI-Lab, E-mail: Stavros.Paschalakis@vil.ite.mee.com
Peter Lee, University of Kent at Canterbury, E-mail: P.Lee@kent.ac.uk

1. Introduction

FPGAs have established themselves as invaluable tools in the implementation of high performance systems, combining the reprogrammability advantage of general purpose processors with the speed and parallel processing advantages of custom hardware. A problem that is frequently encountered with FPGA-based system-on-a-chip solutions, e.g. in signal processing or computer vision applications, is that the algorithmic frameworks of most real-world problems will, at some point, require general purpose arithmetic processing units, which are not standard components of the FPGA device. To alleviate this problem, various researchers have examined the FPGA implementation of floating-point operators [1-7]. The earliest work considered the implementation of operators in low precision custom formats, e.g. 16 or 18 bits in total, in order to reduce the associated circuit costs and increase their speed. More recently, the increasing size of FPGA devices has allowed researchers to efficiently implement operators in the 32-bit single precision format, the most basic format of the ANSI/IEEE 754-1985 binary floating-point arithmetic standard [8].

In this paper we consider the implementation of FPGA floating-point arithmetic circuits for all the common operations, i.e. addition/subtraction, multiplication, division and square root, in the 64-bit double precision format, which is most commonly used in scientific computations. All the operators presented here provide rounding and exception handling. We have used these circuits in the implementation of a high-speed object recognition system which performs the extraction, normalisation and classification of moment descriptors and relies partly on custom parallel processing structures and partly on floating-point processing. A detailed description of the system is not given here but can be found in [9].

2. Floating-Point Numerical Representation

This section examines the double precision floating-point format only briefly. More details and discussions can be found in [8,10]. In a floating-point representation system of a radix β, a real number N is represented in terms of a sign s, an exponent e and a significand S so that N=(–1)^s·β^e·S. The IEEE standard specifies that double precision floating-point numbers are comprised of 64 bits: a sign bit (bit 63), 11 bits for the exponent E (bits 62 down to 52) and 52 bits for the fraction f (bits 51 down to 0). The fraction f represents a number in the range [0,1) and the significand S is given by S=1.f, which is in the range [1,2). When the MSB of the significand is 1 and is followed by the radix point, the representation is said to be “normalised”. The leading 1 of the significand is commonly referred to as the “hidden bit”; for arithmetic operations it is usually made explicit, a process usually referred to as “unpacking”. E is an unsigned biased number and the true exponent e is obtained as e=E–Ebias with Ebias=1023. The values E=0 and E=2047 are reserved for special quantities, which translates to a range of only [1,2046] for the biased exponent E and, therefore, a range of [–1022,1023] for the unbiased exponent e.

The number zero is represented with E=0 and f=0. The hidden significand bit is also 0 and not 1, and zero has a positive or negative sign like normal numbers, i.e. s=0 or s=1. When E=0 and f≠0 then the number has e=–1022 and a significand S=0.f; the hidden bit is again 0 and not 1 and the sign is determined as for normal numbers. Such numbers are referred to as “denormalised”. Because of the additional complexity and costs, this part of the standard is not commonly implemented in hardware and, for the same reason, our circuits do not support denormalised numbers. An exponent E=2047 and a fraction f=0 represents infinity.
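The field layout described above can be illustrated in software. The following sketch is our own illustration, not part of the paper; the helper name `unpack_double` is hypothetical, and the field widths (1/11/52) and bias (1023) follow the standard.

```python
# Sketch: unpack an IEEE 754 double into sign s, true exponent e and
# significand S, for normalised numbers only (0 < E < 2047).
import struct

def unpack_double(x: float):
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]  # 64-bit pattern
    s = bits >> 63                    # sign bit (bit 63)
    E = (bits >> 52) & 0x7FF          # biased exponent (bits 62..52)
    f = bits & ((1 << 52) - 1)        # fraction (bits 51..0)
    if 0 < E < 2047:                  # normalised: hidden bit is 1
        S = 1 + f / 2**52             # significand S = 1.f
        e = E - 1023                  # true exponent e = E - Ebias
        return s, e, S
    raise ValueError('zero, denormalised, infinity or NaN')

# Example: -1.5 = (-1)^1 * 2^0 * 1.5
print(unpack_double(-1.5))            # (1, 0, 1.5)
```

Packing is the mirror image: the hidden bit is dropped and the bias is re-applied, exactly as the operators below do in their final cycle.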

An exponent E=2047 and a fraction f≠0 represent the symbolic unsigned entity NaN (Not a Number), which is produced by operations like 0/0 and 0·∞. The sign of infinity is determined as for normal numbers. The standard does not specify any NaN values, allowing the implementation of multiple NaNs. Here, only one NaN is provided, with E=2047 and a fixed non-zero fraction f.

Finally, a note should be made on the issue of rounding. It is clear that arithmetic operations on the significands can result in values which do not fit in the chosen representation and need to be rounded. The IEEE standard specifies four rounding modes. Here, only the default mode is considered, which is the most difficult to implement and is known as “round-to-nearest-even” (RNE). This is implemented by extending the relevant significands by three bits beyond their LSB (L) [10]. These bits are referred to, from the most significant to the least significant, as “guard” (G), “round” (R) and “sticky” (S). The first two are normal extension bits, while the last one is the OR of all the bits that are lower than the R bit. “Rounding up”, i.e. adding a 1 to the L bit, is performed when (i) G=1 and R∨S=1, for any L, or (ii) G=1 and R∨S=0, for L=1. In all other cases, truncation takes place.

3. Addition/Subtraction

The main steps in the calculation of the sum or difference R of two floating-point numbers A and B are as follows. First, based on the sign bits sA and sB and the original operation, determine the effective operation when both operands are made positive, e.g. (–|A|)–(–|B|) becomes –(|A|–|B|); from this point, it can be assumed that both A and B are positive. Then, calculate the absolute value of the difference of the two exponents, i.e. |EA–EB|, and set the exponent ER of the result to the value of the larger of the two exponents. Shift right the significand which corresponds to the smaller exponent by |EA–EB| places. Add or subtract the two significands SA and SB, according to the effective operation, and make the result positive if it is negative. Normalise SR, adjusting ER as appropriate, and round SR, which may require ER to be readjusted. Clearly, this procedure is quite generic and various modifications exist, as will be seen. For now we can assume that neither operand is infinity or NaN.

Because addition is the most frequent operation in scientific computations, our circuit aims at a low implementation cost combined with a low latency. The circuit is not pipelined, with a fixed latency of three clock cycles. Its overall organisation is shown in Figure 1.

Figure 1. Floating-point adder.

In the first cycle, the operands A and B are unpacked and checks for zero, infinity or NaN are performed. The relation between the two operands A and B is determined based on the relation between EA and EB and by comparing the significands SA and SB, which is required if EA=EB. The absolute difference |EA–EB| is calculated using two cascaded adders and a multiplexer. Both adders are fast ripple-carry adders, using the dedicated carry logic of the device (here, “fast ripple-carry” will always refer to such adders). Implicit in this is also the identification of the larger of the two exponents, which provisionally becomes the exponent ER of the result. The significand comparator was implemented using seven 8-bit comparators that operate in parallel and an additional 8-bit comparator which processes their outputs. All the comparators employ the fast carry logic of the device. If B>A then the significands SA and SB are swapped; this swapping requires only multiplexers. Swapping SA and SB is equivalent to swapping A and B and making an adjustment to the sign sR. This significand comparison deviates from the generic algorithm given earlier but has certain advantages. The first advantage of swapping the significands is that it is always SB which will undergo the alignment shift, i.e. only the SB path needs a shifter. Both significands are then extended to 56 bits, by the G, R and S bits as discussed earlier, and are stored in registers. The calculation of the sign sR is also performed in the first cycle and is trivial, requiring only a few logic gates.

In the second cycle, the significand alignment shift is performed and the effective operation is carried out. A modified 6-stage barrel shifter wired for alignment shifts performs the alignment, i.e. each stage can clear the bits which rotate back from the MSB. Also, each stage calculates the OR of the bits that are shifted out and cleared. This allows the sticky bit S to be calculated as the OR of these six “partial” sticky bits along with the value that is in the sticky bit position of the output pattern. The barrel shifter is organised so that the 32-bit stage is followed by the 16-bit stage and so on, so that the large ORs do not become a significant factor with respect to the speed of the shifter. The shifter output is the 56-bit significand SB, aligned and with the correct values in the G, R and S positions. A fast ripple-carry adder then calculates either SA+SB or SA–SB according to the effective operation. The result of this operation is the provisional significand SR of the result. The second advantage of the significand comparison performed earlier is that this result will never be negative, since SA≥SB after alignment. Another check that is performed is for an effective subtraction with A=B, whereby R is set to a positive zero.

It is clear that SR will not necessarily be normalised. Setting aside SR=0, there are three cases: (a) SR is normalised, (b) SR is subnormal and requires a left shift by one or more places, and (c) SR is supernormal, i.e. in the range [2,4), and requires a right shift by just one place. A leading-1 detection component examines SR and calculates the appropriate normalisation shift size, i.e. the number of 0 bits that are above the leading 1. The 56-bit leading-1 detector is comprised of seven 8-bit components and some simple connecting logic; Figure 2 shows a straightforward design for a leading-1 detector. If SR=0 then normalisation is, obviously, inapplicable; in this case, the output of the leading-1 detector is overridden so that it produces a zero shift size and SR is treated as a normalised significand.

Figure 2. Leading-1 detection.

The normalisation of SR takes place in the third and final cycle of the operation. SR is normalised by the alignment barrel shifter, with SR routed back onto the SB path. If SR is normalised, then it passes straight through the shifter unmodified. If SR is subnormal, it is shifted right and rotated by the appropriate number of places so that the normalisation left shift can be achieved, and the sticky bit S is recalculated. If SR is supernormal, it is right shifted by one place, feeding a 1 at the MSB; this case is easily detected by monitoring the carry-out of the significand adder. The output of the shifter is the normalised 56-bit SR with the correct values in the G, R and S positions. Then, rounding is performed as discussed earlier, the rounding addition being performed by the significand adder. The normalised and rounded SR is given by the top 53 bits of the result, of which the MSB will become hidden during packing. A complication that could arise is the rounding addition producing the significand SR=10.000…000, i.e. a supernormal SR. For this final normalisation case, no actual normalisation takes place because the bits MSB–1 down to L already represent the correct fraction fR for packing; only the exponent needs adjustment.

While SR is being normalised and rounded, the exponent ER needs to be adjusted accordingly. This relies on the same hardware used for the processing of EA and EB in the first cycle. One adder performs the adjustment arising from the normalisation of SR: if SR is normalised, ER passes through unmodified; if SR is subnormal, ER is reduced by the left shift amount required for the SR normalisation; if SR is supernormal, ER is incremented by 1. The second cascaded adder increments this result by 1. The two results are multiplexed: if the rounded SR is supernormal then the result of the second adder is the correct ER; otherwise, the result of the first adder is the correct ER. Checks are also performed on the final ER to detect an overflow or underflow, whereby R is forced to the appropriate patterns before packing. Finally, infinity or NaN operands result in an infinity or NaN value for R, according to the rules of the IEEE standard.

Table 1. Double precision floating-point operator statistics on a XILINX® XCV1000 Virtex™ FPGA device*.

| | Slices | Slice flip-flops | 4-input LUTs | Total equivalent gate count |
|---|---|---|---|---|
| Adder | 347 (2.82%) | 336 | 400 | 6,118 |
| Multiplier | 675 (5.49%) | 460 | 1,426 | 10,334 |
| Divider | 495 (4.03%) | 463 | 604 | 8,366 |
| Square Root | 343 (2.79%) | 316 | 399 | 5,464 |

\* Device utilisation figures include I/O flip-flops: 194 for the adder, 193 for the multiplier, 193 for the divider and 129 for the square root.

Table 1 shows the implementation statistics of the double precision floating-point adder on a XILINX® XCV1000 Virtex™ FPGA device of –6 speed grade. These figures also include 194 I/O synchronization registers. The circuit can operate at up to ~25MHz, the critical path lying on the significand processing path and comprised of 41.1% and 58.9% logic and routing delays respectively. Since the design is not pipelined and has a latency of three cycles, this gives rise to a performance of ~8.33 MFLOPS. The circuit is small enough to allow multiple instances to be incorporated in a single FPGA if required.

4. Multiplication

The most significant steps in the calculation of the product R of two numbers A and B are as follows. Calculate the exponent of the result as ER=EA+EB–Ebias. Multiply SA and SB to obtain the significand SR of the result. Normalise SR, adjusting ER as appropriate, and round SR, which may require ER to be readjusted. The sign sR of the result is easily determined as the XOR of the signs sA and sB, so from this point A and B can be considered positive. For now we can assume that neither operand is zero, infinity or NaN.

After addition and subtraction, multiplication is the most frequent operation in scientific computing. Thus, as before, our double precision floating-point multiplier aims at a low implementation cost while maintaining a low latency. Unlike the floating-point adder, which operates on a single clock, this circuit operates on two clocks: a primary clock (CLK1), to which the ten cycle latency corresponds, and an internal secondary clock (CLK2), which is twice as fast as the primary clock and is used by the significand multiplier. The circuit is not pipelined. Figure 3 shows the overall organisation of this circuit.

Figure 3. Floating-point multiplier.

In the first CLK1 cycle, the operands A and B are unpacked and checks are performed for zero, infinity or NaN operands. In the second CLK1 cycle, as the first step in the calculation of ER, the sum EA+EB is calculated using a fast ripple-carry adder. Then, the excess bias is removed from EA+EB using the same adder that was used for the exponent processing of the previous cycle. This completes the calculation of ER.

The significand multiplication also begins in the second CLK1 cycle. The significand multiplier is based on the Modified Booth’s 2-bit parallel multiplier recoding method and has been implemented using a serial carry-save adder array and a fast ripple-carry adder for the assimilation of the final carry bits into the final sum bits. With respect to the carry-save array, this contains two cascaded carry-save adders, which retire four sum and two carry bits in each CLK2 cycle. Taking into account the logic levels associated with the generation of the partial products required by Booth’s method and the carry-save additions, the circuit is quite small, relative to the scale of the significand multiplication involved, while it greatly accelerates the speed of the subsequent carry-propagate addition. For the 53-bit SA and SB, 14 CLK2 cycles are required to produce all the sum and carry bits, i.e. the significand multiplication proceeds until the end of the eighth CLK1 cycle. Alongside the carry-save array, a 4-bit fast ripple-carry adder performs a “partial assimilation” in each cycle, processing the four sum and two carry bits produced in the previous carry-save addition cycle and taking a carry-in from the previous partial assimilation. The latency of this component is so small that it has no effect on the clocking of the carry-save array.

Since both SA and SB are 53-bit normalised significands, SR will initially be 106 bits long and in the range [1,4), i.e. it can be written as y1y2.y3y4y5…y104y105y106. The results of the partial assimilations, i.e. the low-order bits y57 to y106, need not be stored as such: since they would eventually have been ORed into the sticky bit S, all that needs to be stored is their OR, which we can write as y57+ and which is calculated during the partial assimilations of the previous cycles. In the ninth CLK1 cycle, the final sum and carry bits produced by the carry-save array are added together using a fast ripple-carry adder, taking a carry-in from the last partial assimilation. Bits y1 to y56 are produced by this final carry assimilation. If y1=1, SR is supernormal and requires a 1-bit right shift for normalisation, and the final 56-bit SR for rounding is given by y1.y2y3y4…y53y54y55y56+, where y56+ is the OR of y56 and y57+. If y1=0, then SR is normalised and the 56-bit SR for rounding is given by y2.y3y4y5…y54y55y56y57+. The normalisation of a supernormal SR is achieved using multiplexers switched by y1. The rounding decision is also made in the ninth CLK1 cycle, without waiting for the final carry assimilation to finish. That is, a rounding decision is reached for both a normal and a supernormal SR, and the correct decision is chosen once y1 has been calculated.

In the tenth and final CLK1 cycle, the rounding of SR is performed using the same fast ripple-carry adder that is used by the significand multiplier. The result of this addition is the final normalised and rounded 53-bit significand SR. As before, the complication that might arise is a supernormal SR after rounding. If SR is supernormal, no actual normalisation is needed because it would not change the fraction fR for packing; the exponent ER, however, is adjusted, i.e. incremented by 1, using the same exponent processing adder of the previous cycles. This adjustment is performed at the beginning of the cycle and then the correct ER is chosen between the previous ER and the adjusted ER based on the final SR. Checks are also performed on the final ER to detect an overflow or underflow, whereby R is forced to the correct bit patterns before packing. Zero, infinity or NaN operands result in a zero, infinity or NaN value for R according to a simple set of rules.
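The y-bit bookkeeping above can be checked with a small software model. This sketch is ours (the function name is hypothetical): it forms the 106-bit product of two 53-bit significands and compresses it into the 56-bit value for rounding, switching on y1 as described.

```python
def multiply_significands(sa: int, sb: int):
    """sa, sb: 53-bit normalised significands (binary point after bit 52).
    Returns (sig56, supernormal): the 56-bit significand for rounding and
    the y1 flag that also increments the exponent."""
    prod = sa * sb                       # 106 bits: y1 y2 . y3 ... y106
    y1 = prod >> 105                     # y1=1: SR in [2,4), shift right once
    if y1:
        sig56 = prod >> 50               # y1 . y2 ... y55 y56
        sticky = prod & ((1 << 50) - 1)  # OR of y57 ... y106 gives y57+
    else:
        sig56 = prod >> 49               # y2 . y3 ... y56 y57
        sticky = prod & ((1 << 49) - 1)
    # ORing the compressed sticky into the LSB yields y56+ (or y57+).
    return sig56 | (1 if sticky else 0), bool(y1)

# 1.5 x 1.5 = 2.25: the product 10.01b is supernormal and becomes 1.001b
# with the exponent incremented by one.
sig56, supernormal = multiply_significands(0b11 << 51, 0b11 << 51)
print(supernormal, sig56 == 0b1001 << 52)   # True True
```

In the exponent path this pairs with ER = EA + EB − 1023, plus 1 when the supernormal flag is set.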

Table 1 shows the implementation statistics of the double precision floating-point multiplier. The figures also include 193 I/O synchronization registers. The primary clock CLK1 can be set to a frequency of up to ~40MHz, while the secondary clock CLK2 can be set to a frequency of up to ~75MHz, its critical path comprised of 36.4% and 63.6% logic and routing delays respectively. Since the circuit is not pipelined and has a fixed latency of ten CLK1 cycles, a frequency of 37MHz and 74MHz for CLK1 and CLK2 respectively gives rise to a performance in the order of 3.7 MFLOPS. At 5.49% usage, the circuit is small enough to allow multiple instances to be incorporated in a single FPGA if required.

5. Division

In general, division is a much less frequent operation than the previous ones. Thus, our double precision floating-point divider aims solely at a low implementation cost, incorporating an economic significand divider. A non-pipelined design is adopted, with a fixed latency of 60 clock cycles. Figure 4 shows the overall organisation of this circuit.

Figure 4. Floating-point divider.

The most significant steps in the calculation of the quotient R of two numbers A (the dividend) and B (the divisor) are as follows. Calculate the exponent of the result as ER=EA–EB+Ebias. Divide SA by SB to obtain the significand SR. Normalise SR, adjusting ER as appropriate, and round SR, which may require ER to be readjusted. The sign sR of the result is the XOR of sA and sB. For now we can assume that neither operand is zero, infinity or NaN.

In the first cycle, the operands A and B are unpacked and the usual checks are performed. In the second cycle, as the first step in the calculation of ER, the difference EA–EB is calculated using a fast ripple-carry adder. Then, the bias is added to EA–EB, using the same exponent processing adder of the previous cycle. This completes the calculation of ER.

The significand division also begins in the second cycle. The division algorithm employed here is the simple non-performing sequential algorithm and the division proceeds as follows. First, the remainder of the division is set to the value of the dividend SA. The divisor SB is subtracted from the remainder. If the result is positive or zero, the MSB of the quotient SR is 1 and this result replaces the remainder; otherwise, the MSB of SR is 0 and the remainder is not replaced. The remainder is then shifted left by one place, the divisor SB is subtracted from the remainder for the calculation of MSB–1, and so on. The significand divider calculates one SR bit per cycle and its main components are two registers, for the remainder and the divisor, a shift register for SR, and a fast ripple-carry adder. The divider operates for 55 cycles, during the cycles 2 to 56, and produces a 55-bit SR, the two least significant bits being the G and R bits.
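The non-performing loop above translates almost directly into software, one iteration per hardware cycle. The sketch below is our illustration, not the circuit; the function name is ours.

```python
def divide_significands(sa: int, sb: int, bits: int = 55):
    """sa, sb: 53-bit significands in [1,2). Returns the 55-bit quotient
    (53 bits plus G and R) and the sticky bit from the final remainder."""
    rem, q = sa, 0
    for _ in range(bits):
        diff = rem - sb
        if diff >= 0:            # non-negative: quotient bit 1, keep result
            q, rem = (q << 1) | 1, diff
        else:                    # negative: quotient bit 0, the subtraction
            q <<= 1              # is "not performed" (remainder kept)
        rem <<= 1                # shift left for the next quotient bit
    return q, (1 if rem else 0)  # sticky S: OR of the final remainder bits

# 1.5 / 1.0 = 1.5: quotient bits 11 followed by zeros, exact (sticky 0).
q, s = divide_significands(0b11 << 51, 1 << 52)
print(q == 0b11 << 53, s)        # True 0
```

A non-zero sticky marks an inexact quotient for the RNE rounding step that follows.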

Since both SA and SB are in the range [1,2), SR will be in the range (0.5,2), i.e. it may require a normalisation left shift by just one place, if not already normalised. With the circuit considered here, no additional hardware is required for this, since SR is already stored in a shift register. In cycle 57, the sticky bit S is calculated as the OR of all the bits of the final remainder, and the normalisation, if needed, is also performed, with ER adjusted accordingly using the same adder that was previously used for the exponent processing. In cycle 58, the 56-bit SR is rounded to 53 bits using the significand divider adder. For a supernormal SR after rounding, no normalisation is actually required, but the exponent ER is incremented by 1; this takes place in cycle 60, using the same adder that is used for the exponent processing of the previous cycles. Checks are also performed on ER for an overflow or underflow, whereby R is forced to the correct bit patterns before packing. Zero, infinity or NaN operands result in a zero, infinity or NaN value for R according to a simple set of rules.

Table 1 shows the implementation statistics of the double precision floating-point divider, which also include 193 I/O synchronization registers. This circuit can operate at up to ~60MHz, the critical path comprised of 42.8% and 57.2% logic and routing delays respectively. Since the design is not pipelined and has a fixed latency of 60 clock cycles, this gives a performance in the order of 1 MFLOPS. The circuit is quite small, occupying only 4.03% of the device and, as for the previous circuits, it is small enough to allow multiple instances to be placed in a single chip.

6. Square Root

The square root function is much less frequent than the previous operations. Thus, as for the divider, our floating-point square root circuit aims solely at a low implementation cost. A non-pipelined design is adopted, with a fixed latency of 59 cycles. Figure 5 shows the organisation of this circuit.

Figure 5. Floating-point square root.

In the first cycle, the operand A is unpacked and checks are performed for zero, infinity, NaN and negative operands. For such operands, simple rules apply with regards to the result R, whereby R is appropriately set before packing; as for zero, R will also be zero. For now we can assume that A is positive and not zero.

The calculation of the square root R of the floating-point number A then proceeds as follows. The biased exponent ER of the result is calculated directly from the biased exponent EA using [9]

    ER = (EA + 1022)/2   if EA is even (and SA is left shifted one place)
    ER = (EA + 1023)/2   if EA is odd                                      (1)

ER is calculated using a fast ripple-carry adder, while the division by 2 is just a matter of discarding the LSB of the numerator, which will always be even. From the definition of ER in (1) it is easy to see that an overflow or underflow will never occur. According to (1), SA will be in the range [1,4) and, consequently, its square root SR will be in the range [1,2), i.e. it will always be normalised.

The calculation of the significand SR, i.e. of the square root of SA, starts in the second clock cycle. Denoting SR as y1.y2y3y4…, each bit yn is calculated using [9]

    yn = 1 if (Xn − Tn) ≥ 0, and yn = 0 otherwise

    Xn+1 = 2(Xn − Tn) if yn = 1, and Xn+1 = 2Xn if yn = 0                  (2)

    Tn+1 = y1.y2y3…yn01

with X1 = SA/2, T1 = 0.5 and n = 1, 2, …

From (2) it can be seen that the adopted square root calculation algorithm is quite similar to the division method examined in the previous section, so that key components may be reused. Based on this algorithm, a significand square root circuit calculates one bit of SR per clock cycle. The main components of this part of the circuit are two registers, for Xn and Tn, and a fast ripple-carry adder. The Tn register has been implemented so that each flip-flop has its own enable signal, which allows each individual bit yn to be stored in the correct position in the register and also controls the reset and set inputs of the two next flip-flops in the register, so that the correct Tn is formed for each cycle of the process. The significand square root circuit operates for 55 cycles, during the clock cycles 2 to 56, and produces a 55-bit SR, the two least significant bits being the G and R bits. After the significand square root calculation process is complete, the contents of register Tn form the significand SR for rounding.

In cycle 57, the sticky bit S is calculated as the OR of all the bits of the final remainder of the significand square root calculation, and a 56-bit SR for rounding is thus formulated. In cycle 58, SR is rounded to 53 bits using the same adder that is used by the significand square root circuit. For a supernormal SR after rounding, no actual normalisation is required, but the exponent ER is incremented by 1; this adjustment takes place in cycle 59 and is performed using the same adder that is used for the exponent processing during the first cycle.
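Equations (1) and (2) can be checked with a direct software rendering. This is our own sketch, with the initial values X1 = SA/2 and T1 = 0.5 as stated above; exact rational arithmetic via `fractions` stands in for the circuit's fixed-point registers.

```python
from fractions import Fraction

def sqrt_exponent(EA: int):
    """Equation (1): biased result exponent; the flag requests the
    one-place left shift of SA when EA is even."""
    if EA % 2 == 0:
        return (EA + 1022) // 2, True
    return (EA + 1023) // 2, False

def sqrt_significand(SA: Fraction, bits: int = 55) -> Fraction:
    """Equation (2): one bit yn per cycle, for SA in [1,4)."""
    X, T = SA / 2, Fraction(1, 2)             # X1 = SA/2, T1 = 0.5
    root, weight = Fraction(0), Fraction(1)   # y1 has weight 2^0
    for _ in range(bits):
        if X - T >= 0:                        # yn = 1
            X = 2 * (X - T)
            root += weight
        else:                                 # yn = 0
            X = 2 * X
        T = root + weight / 4                 # Tn+1 = y1.y2...yn01
        weight /= 2
    return root

print(sqrt_significand(Fraction(9, 4)))       # 3/2: sqrt(2.25) = 1.5
```

For SA = 2.25 the loop yields y1=1, y2=1 and all remaining bits zero, i.e. 1.1 in binary, matching the bit-per-cycle behaviour described for the hardware.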

Table 1 shows the implementation statistics of our double precision floating-point square root function, which include 129 I/O synchronization registers. The circuit can operate at up to ~80MHz, the critical path comprised of 53.0% and 47.0% logic and routing delays respectively. Since the circuit is not pipelined and has a fixed latency of 59 clock cycles, this gives rise to a performance in the order of 1.36 MFLOPS. The circuit is very small, occupying only 2.79% of the device and, as for the previous circuits, the implementation considered here is small enough to allow multiple instances to be incorporated in a single FPGA device if needed.

7. Discussion

We have presented low cost FPGA floating-point arithmetic circuits in the 64-bit double precision format and for all the common operations, i.e. addition/subtraction, multiplication, division and square root. Such circuits can be extremely useful in the FPGA-based implementation of complex systems that benefit from the reprogrammability and parallelism of the FPGA device but also require a general purpose arithmetic unit. In our case, these circuits were used in the implementation of a high-speed object recognition system which relies partly on custom parallel processing structures and partly on floating-point processing [9]. The implementation statistics of the operators show that they are very economical in relation to contemporary FPGAs, which also facilitates the incorporation of multiple instances of the desired operators into the same chip.

A direct comparison with other floating-point unit implementations is very difficult to perform, not only because of floating-point format differences, but also due to other circuit characteristics, e.g. all the circuits presented here incorporate I/O registers, which would eventually be absorbed by the surrounding hardware. With some caution, however, the circuits presented here provide an indication of the costs of FPGA floating-point operators using a long format. As an indication, the double precision floating-point adder presented here occupies 2.82% of the device, approximately the same number of slices as the single precision floating-point adder in [7]. That circuit has a higher latency than the adder presented here but also a higher maximum clock speed, which results in both circuits having approximately the same performance. The adder of [7], however, is fully pipelined and has a much higher peak throughput.

Although non-pipelined circuits were considered here to achieve low circuit costs, the adder and the multiplier, with a latency of three and ten cycles respectively, are suitable for pipelining to increase their throughput. For the divider and square root operators, pipelining the existing designs may not be the most efficient option, and different designs and/or algorithms should be considered, e.g. a high radix SRT algorithm for division [6]. In conclusion, there are significant speed and circuit size tradeoffs to consider when deciding on the range and precision of FPGA floating-point arithmetic circuits, and the choice of floating-point format for any given problem ultimately rests with the designer.

References

1. Shirazi, N., Walters, A., Athanas, P., “Quantitative Analysis of Floating Point Arithmetic on FPGA Based Custom Computing Machines”, In Proc. IEEE Symposium on FPGAs for Custom Computing Machines, 1995, pp. 155-162
2. Louca, L., Cook, T.A., Johnson, W.H., “Implementation of IEEE Single Precision Floating Point Addition and Multiplication on FPGAs”, In Proc. IEEE Symposium on FPGAs for Custom Computing Machines, 1996, pp. 107-116
3. Li, Y., Chu, W., “Implementation of Single Precision Floating Point Square Root on FPGAs”, In Proc. 5th IEEE Symposium on Field Programmable Custom Computing Machines, 1997, pp. 226-232
4. Ligon, W.B., McMillan, S., Monn, G., Schoonover, K., Stivers, F., Underwood, K.D., “A Re-Evaluation of the Practicality of Floating Point Operations on FPGAs”, In Proc. IEEE Symposium on FPGAs for Custom Computing Machines, 1998, pp. 206-215
5. Belanovic, P., Leeser, M., “A Library of Parameterized Floating-Point Modules and Their Use”, In Proc. Field Programmable Logic and Applications, 2002, pp. 657-666
6. Wang, X., Nelson, B.E., “Tradeoffs of Designing Floating-Point Division and Square Root on Virtex FPGAs”, In Proc. 11th IEEE Symposium on Field-Programmable Custom Computing Machines, 2003, pp. 195-203
7. Digital Core Design, Alliance Core Data Sheets for Floating-Point Arithmetic, www.xilinx.com, 2001
8. ANSI/IEEE Std 754-1985, “IEEE Standard for Binary Floating-Point Arithmetic”, 1985
9. Paschalakis, S., “Moment Methods and Hardware Architectures for High Speed Binary, Greyscale and Colour Pattern Recognition”, Ph.D. Thesis, Department of Electronics, University of Kent at Canterbury, UK, 2001
10. Goldberg, D., “What Every Computer Scientist Should Know About Floating-Point Arithmetic”, ACM Computing Surveys, vol. 23, no. 1, 1991, pp. 5-48