
2013

Tutorial: Floating-Point Binary


The two most common floating-point binary storage formats used by Intel processors were created for Intel and later standardized by the IEEE organization:

- IEEE Short Real: 32 bits. 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa. Also called single precision.
- IEEE Long Real: 64 bits. 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa. Also called double precision.

Both formats use essentially the same method for storing floating-point binary numbers, so we will use the Short Real as an example in this tutorial. The bits in an IEEE Short Real are arranged as follows, with the most significant bit (MSB) on the left:

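Fig. 1: Bit layout of the IEEE Short Real, from left to right: sign (1 bit), exponent (8 bits), mantissa (23 bits).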
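The field layout can be inspected directly. The following is a minimal Python sketch, not part of the original tutorial, that uses the standard `struct` module to reinterpret a 32-bit float as an unsigned integer and then masks out the three fields. The function name `short_real_fields` is a hypothetical helper chosen for illustration.

```python
import struct

def short_real_fields(value):
    """Return the (sign, exponent, mantissa) bit fields of value as a 32-bit float."""
    # Pack as a big-endian IEEE single, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31               # 1 bit (MSB)
    exponent = (bits >> 23) & 0xFF  # next 8 bits, stored with the bias already added
    mantissa = bits & 0x7FFFFF      # low 23 bits
    return sign, exponent, mantissa

sign, exponent, mantissa = short_real_fields(-1.75)
print(f"{sign:01b} {exponent:08b} {mantissa:023b}")
# prints: 1 01111111 11000000000000000000000
```

Decimal -1.75 is binary -1.11, so the sign bit is 1 and the stored mantissa begins with 11.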

The Sign

The sign of a binary floating-point number is represented by a single bit. A 1 bit indicates a negative number, and a 0 bit indicates a positive number.

The Mantissa

It is useful to consider the way decimal floating-point numbers represent their mantissa. Using $-3.154 \times 10^5$ as an example, the sign is negative, the mantissa is 3.154, and the exponent is 5. The fractional portion of the mantissa is the sum of each digit divided by a successive power of 10:

.154 = 1/10 + 5/100 + 4/1000

A binary floating-point number is similar. For example, in the number $+11.1011 \times 2^3$, the sign is positive, the mantissa is 11.1011, and the exponent is 3. The fractional portion of the mantissa is the sum of each digit divided by a successive power of 2. In our example, it is expressed as:

.1011 = 1/2 + 0/4 + 1/8 + 1/16

Or, you can calculate this value as 1011 divided by $2^4$. In decimal terms, this is eleven divided by sixteen, or 0.6875. Combined with the left-hand side of 11.1011, the decimal value of the mantissa is 3.6875 (before the exponent is applied). Here are additional examples:

| Binary Floating-Point | Base 10 Fraction | Base 10 Decimal |
|---|---|---|
| 11.11 | 3 3/4 | 3.75 |
| 0.00000000000000000000001 | 1/8388608 | 0.00000011920928955078125 |
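The digit-by-digit sum described above can be sketched in a few lines of Python. This is an illustrative helper, not code from the tutorial; the name `binary_real_to_fraction` is an assumption, and `fractions.Fraction` is used so the results stay exact.

```python
from fractions import Fraction

def binary_real_to_fraction(text):
    """Convert an unsigned binary real such as '11.1011' to an exact Fraction."""
    whole, _, frac = text.partition(".")
    value = Fraction(int(whole, 2) if whole else 0)
    # Each fractional digit is weighted by a successive power of 1/2.
    for position, digit in enumerate(frac, start=1):
        value += Fraction(int(digit), 2 ** position)
    return value

print(binary_real_to_fraction(".1011"))           # 11/16
print(binary_real_to_fraction("11.1011"))         # 59/16
print(float(binary_real_to_fraction("11.1011")))  # 3.6875
```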

The last entry in this table shows the smallest fraction that can be stored in a 23-bit mantissa. The following table shows a few simple examples of binary floating-point numbers alongside their equivalent decimal fractions and decimal values:

| Binary | Decimal Fraction | Decimal Value |
|---|---|---|
| .1 | 1/2 | .5 |
| .01 | 1/4 | .25 |
| .001 | 1/8 | .125 |
| .0001 | 1/16 | .0625 |
| .00001 | 1/32 | .03125 |
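Rows like these can be generated for any bit position. The sketch below is an illustration only (the helper name `power_of_two_row` is hypothetical); it relies on the fact that $2^{-n}$ has a finite decimal expansion, so Python's `decimal` module can print it exactly at the default precision for the small values of n used here.

```python
from decimal import Decimal
from fractions import Fraction

def power_of_two_row(n):
    """Return (binary, fraction, decimal) strings for the value 2**-n."""
    binary = "." + "0" * (n - 1) + "1"
    fraction = Fraction(1, 2 ** n)
    # Exact for n up to 23: the quotient fits within Decimal's default 28 digits.
    decimal = Decimal(1) / Decimal(2 ** n)
    return binary, str(fraction), format(decimal, "f")

print(power_of_two_row(5))   # ('.00001', '1/32', '0.03125')
print(power_of_two_row(23)[2])  # 0.00000011920928955078125
```

The second call reproduces the smallest fraction that a 23-bit mantissa can hold.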

kipirvine.com/asm/workbook/floating_tut.htm


The Exponent

IEEE Short Real exponents are stored as 8-bit unsigned integers with a bias of 127. Let's use the number $1.101 \times 2^5$ as an example. The exponent (5) is added to 127 and the sum (132) is binary 10000100. Here are some examples of exponents, first shown in decimal, then adjusted, and finally in unsigned binary:

| Exponent (E) | Adjusted (E + 127) | Binary |
|---|---|---|
| +5 | 132 | 10000100 |
| 0 | 127 | 01111111 |
| -10 | 117 | 01110101 |
| +128 | 255 | 11111111 |
| -127 | 0 | 00000000 |
| -1 | 126 | 01111110 |

The binary exponent is unsigned, and therefore cannot be negative. The largest possible exponent is 128: when added to 127, it produces 255, the largest unsigned value represented by 8 bits. The approximate range is from $1.0 \times 2^{-127}$ to $1.0 \times 2^{+128}$. (The IEEE standard reserves the biased values 0 and 255 for special cases such as zero and infinity, so normalized numbers actually use exponents from -126 to +127.)

Normalizing the Mantissa

Before a floating-point binary number can be stored correctly, its mantissa must be normalized. The process is basically the same as when normalizing a floating-point decimal number. For example, decimal 1234.567 is normalized as $1.234567 \times 10^3$ by moving the decimal point so that only one digit appears before the decimal. The exponent expresses the number of positions the decimal point was moved left (positive exponent) or moved right (negative exponent).

Similarly, the floating-point binary value 1101.101 is normalized as $1.101101 \times 2^3$ by moving the binary point 3 positions to the left, and multiplying by $2^3$. Here are some examples of normalizations:

| Binary Value | Normalized As | Exponent |
|---|---|---|
| 1101.101 | 1.101101 | 3 |
| .00101 | 1.01 | -3 |
| 1.0001 | 1.0001 | 0 |
| 10000011.0 | 1.0000011 | 7 |

You may have noticed that in a normalized mantissa, the digit 1 always appears to the left of the binary point. In fact, the leading 1 is omitted from the mantissa in the IEEE storage format because it is redundant.

Creating the IEEE Bit Representation

We can now combine the sign, exponent, and normalized mantissa into the binary IEEE short real representation. Using Figure 1 as a reference, the value $1.101 \times 2^0$ is stored as sign = 0 (positive), mantissa = 101, and exponent = 01111111 (the exponent value is added to 127). The "1" to the left of the binary point is dropped from the mantissa. Here are more examples:

| Binary Value | Biased Exponent | Sign, Exponent, Mantissa |
|---|---|---|
| -1.11 | 127 | 1 01111111 11000000000000000000000 |
| +1101.101 | 130 | 0 10000010 10110100000000000000000 |
| -.00101 | 124 | 1 01111100 01000000000000000000000 |
| +100111.0 | 132 | 0 10000100 00111000000000000000000 |
| +.0000001101011 | 120 | 0 01111000 10101100000000000000000 |
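The three steps described in this part of the tutorial (normalize the mantissa, add the bias of 127 to the exponent, drop the leading 1) can be combined in a short Python sketch. It is an illustration under stated assumptions rather than a full encoder: `encode_short_real` is a hypothetical name, the input is a signed binary string, extra mantissa bits are truncated rather than rounded, and zero and out-of-range exponents are not handled.

```python
import struct

BIAS = 127  # Short Real exponent bias

def encode_short_real(text):
    """Return 'sign exponent mantissa' bit strings for a binary real like '+1101.101'."""
    sign = "1" if text.startswith("-") else "0"
    whole, _, frac = text.lstrip("+-").partition(".")
    digits = whole + frac
    first_one = digits.index("1")  # leading 1; raises ValueError if the value is zero
    # Number of positions the point moves to sit just right of the leading 1.
    exponent = len(whole) - 1 - first_one
    # Drop the implied leading 1, then pad (or truncate) to 23 bits.
    mantissa = digits[first_one + 1:].ljust(23, "0")[:23]
    return f"{sign} {exponent + BIAS:08b} {mantissa}"

print(encode_short_real("+1101.101"))  # 0 10000010 10110100000000000000000
print(encode_short_real("-.00101"))    # 1 01111100 01000000000000000000000

# Cross-check against the machine's own encoding: binary 1101.101 is decimal 13.625.
(bits,) = struct.unpack(">I", struct.pack(">f", 13.625))
assert encode_short_real("+1101.101").replace(" ", "") == f"{bits:032b}"
```

Working on the digit string directly, rather than on a Python float, keeps the point-shifting step visible.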

Converting Decimal Fractions to Binary Reals

If a decimal fraction can be easily represented as a sum of fractions in the form (1/2 + 1/4 + 1/8 + ... ), it is fairly easy to discover the corresponding binary real. Here are a few simple examples:

| Decimal Fraction | Factored As... | Binary Real |
|---|---|---|
| 1/2 | 1/2 | .1 |
| 1/4 | 1/4 | .01 |
| 3/4 | 1/2 + 1/4 | .11 |
| 1/8 | 1/8 | .001 |
| 7/8 | 1/2 + 1/4 + 1/8 | .111 |
| 3/8 | 1/4 + 1/8 | .011 |
| 1/16 | 1/16 | .0001 |
| 3/16 | 1/8 + 1/16 | .0011 |
| 5/16 | 1/4 + 1/16 | .0101 |

Of course, the real world is never so simple. A fraction such as 1/5 (0.2) must be represented by a sum of fractions whose denominators are powers of 2. Here is the output from a program that subtracts each successive fraction from 0.2 and shows each remainder. In fact, an exact value is not found after creating the 23 mantissa bits. The result, however, is accurate to 7 digits. The blank lines are for fractions that were too large to be subtracted from the remaining value of the number. Bit 1, for example, was equal to .5 (1/2), which could not be subtracted from 0.2.

    starting:  0.200000000000
     1
     2
     3  subtracting 0.125000000000
        remainder = 0.075000000000
     4  subtracting 0.062500000000
        remainder = 0.012500000000
     5
     6
     7  subtracting 0.007812500000
        remainder = 0.004687500000
     8  subtracting 0.003906250000
        remainder = 0.000781250000
     9
    10
    11  subtracting 0.000488281250
        remainder = 0.000292968750
    12  subtracting 0.000244140625
        remainder = 0.000048828125
    13
    14
    15  subtracting 0.000030517578
        remainder = 0.000018310547
    16  subtracting 0.000015258789
        remainder = 0.000003051758
    17
    18
    19  subtracting 0.000001907349
        remainder = 0.000001144409
    20  subtracting 0.000000953674
        remainder = 0.000000190735
    21
    22
    23  subtracting 0.000000119209
        remainder = 0.000000071526

    Mantissa: .00110011001100110011001
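The tutorial does not show the program that produced this output. The following Python sketch is one possible way to reproduce the same subtraction process, offered as an assumption rather than the author's code; the name `fraction_to_mantissa_bits` is hypothetical, and `fractions.Fraction` is used so the remainders are exact.

```python
from fractions import Fraction

def fraction_to_mantissa_bits(value, bit_count=23):
    """Greedily convert a fraction in [0, 1) to bit_count binary digits.

    Returns the bit string and the remainder left after the last bit.
    """
    remainder = Fraction(value)
    bits = []
    for position in range(1, bit_count + 1):
        weight = Fraction(1, 2 ** position)  # 1/2, 1/4, 1/8, ...
        if weight <= remainder:
            remainder -= weight              # this power of 2 fits: bit is 1
            bits.append("1")
        else:
            bits.append("0")                 # too large to subtract: bit is 0
    return "." + "".join(bits), remainder

bits, remainder = fraction_to_mantissa_bits(Fraction(1, 5))
print(bits)                       # .00110011001100110011001
print(f"{float(remainder):.12f}")  # 0.000000071526
```

The nonzero remainder after 23 bits confirms that 0.2 has no exact representation in this format.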
