Floating Point
15900000000000000 could be represented as 159 * 10^14
Binary Fractions
The value of real binary numbers…
Scientific   2^2   2^1   2^0  .  2^-1   2^-2   2^-3
Fractions                     .  ½      ¼      ⅛
Decimal      4     2     1    .  .5     .25    .125
             1     0     1    .  1      0      1
101.101 = 4 + 1 + 1/2 + 1/8
= 4 + 1 + .5 + .125 = 5.625
= 5 ⅝
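The positional weights above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function name `binary_to_decimal` is a hypothetical helper chosen for illustration.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string such as '101.101' using positional weights."""
    integer_part, _, fraction_part = bits.partition(".")
    value = 0.0
    # Integer digits carry weights ..., 2^2, 2^1, 2^0 (read right to left).
    for position, digit in enumerate(reversed(integer_part)):
        value += int(digit) * 2 ** position
    # Fraction digits carry weights 2^-1, 2^-2, 2^-3, ... (read left to right).
    for position, digit in enumerate(fraction_part, start=1):
        value += int(digit) * 2 ** -position
    return value


print(binary_to_decimal("101.101"))  # 5.625
```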
IEEE Single Precision
The number will occupy 32 bits
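As a rough illustration of how those 32 bits are divided, the sketch below splits a value into the three fields of the standard IEEE 754 single precision layout: 1 sign bit, 8 exponent bits and 23 fraction bits. The field widths come from the IEEE 754 standard rather than from this slide, and `float_to_fields` is a hypothetical helper name.

```python
import struct


def float_to_fields(x):
    """Return (sign, exponent, fraction) bit strings of x stored as a 32-bit float."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = f"{raw:032b}"
    return bits[0], bits[1:9], bits[9:]


sign, exponent, fraction = float_to_fields(1.0)
print(sign, exponent, fraction)
```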
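IEEE – Example 1
Convert 6.75 to 32 bit IEEE format.
1. The Mantissa. The Integer first.
6/2 = 3 r 0
3/2 = 1 r 1
1/2 = 0 r 1
Reading the remainders from last to first gives 110 in binary.
2. Fraction next.
.75 * 2 = 1.5
.5 * 2 = 1.0
Reading the integer parts from first to last gives 0.11 in binary.
3. Put the two parts together… 110.11
Now normalise: 1.1011 * 2^2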
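The three steps above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and `normalise` assumes a value of at least 1 (a non-zero integer part), which is enough for this example.

```python
def integer_to_binary(n):
    """Divide by 2 repeatedly; the remainders, read last to first, are the bits."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        bits.append(str(remainder))
    return "".join(reversed(bits))


def fraction_to_binary(f, max_bits=23):
    """Multiply by 2 repeatedly; the integer parts, read first to last, are the bits."""
    bits = []
    while f > 0 and len(bits) < max_bits:
        f *= 2
        bit = int(f)
        bits.append(str(bit))
        f -= bit
    return "".join(bits)


def normalise(integer_bits, fraction_bits):
    """Move the point to just after the leading 1; assumes integer_bits starts with 1."""
    exponent = len(integer_bits) - 1
    mantissa = integer_bits[0] + "." + integer_bits[1:] + fraction_bits
    return mantissa, exponent


print(integer_to_binary(6), fraction_to_binary(0.75))  # 110 11
print(normalise("110", "11"))  # ('1.1011', 2)
```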
IEEE Biased 127 Exponent
To generate a biased 127 exponent, take the value of the signed exponent and add 127.
Example.
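A minimal Python sketch of the add-127 rule, assuming an 8-bit exponent field; `to_biased` and `from_biased` are hypothetical helper names, not part of the slides.

```python
BIAS = 127


def to_biased(exponent):
    """Add the bias and return the result as an 8-bit binary string."""
    biased = exponent + BIAS
    if not 0 <= biased <= 255:
        raise ValueError("exponent does not fit in 8 bits")
    return f"{biased:08b}"


def from_biased(bits):
    """Recover the signed exponent by subtracting the bias."""
    return int(bits, 2) - BIAS


print(to_biased(2))    # 10000001
print(to_biased(-1))   # 01111110
print(from_biased("01111111"))  # 0
```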
Possible Representations of an Exponent
Binary      Sign Magnitude   2's Complement   Biased 127 Exponent
00000000          0                0          -127 {reserved}
00000001          1                1          -126
00000010          2                2          -125
01111110        126              126            -1
01111111        127              127             0
10000000         -0             -128             1
10000001         -1             -127             2
11111110       -126               -2           127
11111111       -127               -1           128 {reserved}
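The three columns of the table can be reproduced with a short Python sketch. The function name `interpret` is hypothetical; note that Python integers have no negative zero, so the sign-magnitude reading of 10000000 comes out as plain 0.

```python
def interpret(bits):
    """Read an 8-bit pattern as sign-magnitude, 2's complement and biased-127."""
    raw = int(bits, 2)
    negative = bool(raw & 0x80)          # top bit set
    magnitude = raw & 0x7F               # lower seven bits
    sign_magnitude = -magnitude if negative else magnitude
    twos_complement = raw - 256 if negative else raw
    biased_127 = raw - 127
    return sign_magnitude, twos_complement, biased_127


for pattern in ("00000001", "01111111", "10000001", "11111110"):
    print(pattern, interpret(pattern))
```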
Why Biased?
Back to the example
Our original example revisited… 1.1011 * 2^2
Exponent is 2 + 127 = 129, or 10000001 in binary.
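To see the whole 32-bit pattern for 6.75, the pieces can be assembled and checked against Python's own 32-bit float encoding. This sketch relies on two IEEE 754 single precision conventions that are assumptions relative to this slide: the leading 1 of the normalised mantissa is not stored, and the fraction field is 23 bits wide.

```python
import struct

sign = "0"                          # 6.75 is positive
exponent = f"{2 + 127:08b}"         # biased exponent, 129
fraction = "1011".ljust(23, "0")    # bits after the point in 1.1011, zero padded

pattern = sign + exponent + fraction
# Reinterpret the 32 bits as a big-endian single precision float.
value = struct.unpack(">f", int(pattern, 2).to_bytes(4, "big"))[0]

print(sign, exponent, fraction)
print(value)  # 6.75
```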
Special cases
Zero, + infinity and - infinity.
Zero is a pattern that only contains ‘0’s
00000000000000000000000000000000
Positive Infinity is the pattern
011111111….
Negative Infinity is the pattern
111111111….
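The full bit patterns can be inspected with Python's `struct` module, which encodes 32-bit floats in IEEE 754 format. This is a small sketch for checking the patterns, with `bit_pattern` as a hypothetical helper name.

```python
import struct


def bit_pattern(x):
    """Return the 32-bit IEEE single precision pattern of x as a string."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{raw:032b}"


print(bit_pattern(0.0))
print(bit_pattern(float("inf")))
print(bit_pattern(float("-inf")))
```

In both infinities the eight exponent bits are all 1s and the fraction bits are all 0s; only the sign bit differs.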
Truncation and Rounding
Rounding
If the lost digits are worth more than ½ of the LSB, add 1 to the LSB.
Example – in 4 bits
0.1101101 -> 0.1101 + 0.0001 = 0.1110 (rounded UP)
0.1101011 -> 0.1101 (rounded DOWN)
NOTE:
Rounding is always preferred to truncation, partly because it is intrinsically more accurate and partly because we end up with a FAIR error: the result is as likely to be too high as too low.
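The rule can be sketched in Python on bit strings. This is an illustration of the slide's rule only (round up when the lost digits exceed half an LSB, otherwise truncate); it is not the IEEE default, which resolves exact ties to the nearest even value. The function name `round_fraction` is hypothetical.

```python
def round_fraction(bits, keep):
    """Round a binary fraction such as '0.1101101' to `keep` fraction bits."""
    integer_part, fraction = bits.split(".")
    kept, lost = fraction[:keep], fraction[keep:]
    value = int(kept, 2)
    if lost:
        # Half an LSB corresponds to the lost digits 100...0.
        half = 1 << (len(lost) - 1)
        if int(lost, 2) > half:
            value += 1
    # A carry out of the fraction (e.g. 0.1111 + 0.0001) moves into the integer part.
    integer = int(integer_part, 2) + (value >> keep)
    value &= (1 << keep) - 1
    return f"{integer:b}.{value:0{keep}b}"


print(round_fraction("0.1101101", 4))  # 0.1110
print(round_fraction("0.1101011", 4))  # 0.1101
```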
Other Considerations
From Floating Point Binary
to Decimal Example
1 01111011 11100000100000000000000
Sign = 1 therefore this number is a negative number.
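The remaining fields can be decoded in the same way. The sketch below assumes a normalised number, so the stored fraction is read with an implied leading 1, and it uses the biased 127 exponent described earlier; the variable names are illustrative.

```python
pattern = "1 01111011 11100000100000000000000"
sign_bit, exponent_bits, fraction_bits = pattern.split()

sign = -1 if sign_bit == "1" else 1
# Remove the bias to recover the signed exponent.
exponent = int(exponent_bits, 2) - 127
# Implied leading 1, then fraction bits weighted 2^-1, 2^-2, ...
mantissa = 1 + sum(int(b) * 2 ** -i for i, b in enumerate(fraction_bits, start=1))

value = sign * mantissa * 2 ** exponent
print(exponent, mantissa, value)
```

Here the exponent field 01111011 is 123, giving a signed exponent of -4, and the mantissa is 1.111000001 in binary.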
Example
1.1 * 2^3 + 1.1 * 2^2
Select the smaller number and make the mantissa smaller by moving the point whilst increasing the exponent until the exponents match.
1.1 * 2^2 -> 0.11 * 2^3
Example
  1.1 * 2^3      001.10 * 2^3
+ 1.1 * 2^2      000.11 * 2^3
                 010.01 * 2^3
Renormalise 010.01 * 2^3
= 1.001 * 2^4
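The align, add and renormalise steps can be sketched in Python. This is a simplified illustration that assumes positive mantissas and uses exact fractions instead of fixed-width bit fields, so it ignores rounding; `fp_add` is a hypothetical name.

```python
from fractions import Fraction


def fp_add(m1, e1, m2, e2):
    """Add m1 * 2^e1 and m2 * 2^e2; return a normalised (mantissa, exponent)."""
    # Make (m1, e1) the operand with the larger exponent.
    if e1 < e2:
        m1, e1, m2, e2 = m2, e2, m1, e1
    # Align: halve the smaller number's mantissa while raising its exponent.
    while e2 < e1:
        m2 /= 2
        e2 += 1
    total, exponent = m1 + m2, e1
    # Renormalise so that 1 <= mantissa < 2.
    while total >= 2:
        total /= 2
        exponent += 1
    while 0 < total < 1:
        total *= 2
        exponent -= 1
    return total, exponent


# 1.1 (binary) is 3/2, so this is 1.1 * 2^3 + 1.1 * 2^2.
print(fp_add(Fraction(3, 2), 3, Fraction(3, 2), 2))  # (Fraction(9, 8), 4)
```

The result 9/8 is 1.001 in binary, matching 1.001 * 2^4.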
FP math