Professional Documents
Culture Documents
floating points
Floating Point
B. (x&7)!=7||(x<<29<0)
C. (x*x)>=0
D. x<0||-x<=0
E. x>0||-x>=0
2
Real Numbers
𝑑 = 10𝑖 × 𝑑𝑖
𝑖=−𝑛 3
Binary Representation
𝑏 = 2𝑖 × 𝑏𝑖
𝑖=−𝑛 4
Examples
Fractional Binary Decimal
value representation representation
1/8 0 . 0 0 1 0.125
25/16 1 . 1 0 0 1 1.5625
43/16? 1 0 . 1 0 1 1 2.6875?
9/8? 1 ?. 0 0 1 1.125
1/3 0 ?. ( 0 1 ) 0.(3)?
1/10 𝑚 0.1
𝑏 = 2𝑖 × 𝑏𝑖
𝑖=−𝑛 5
Precision
8 bits 0.0001100
16 bits 0.000110011001100
24 bits 0.00011001100110011001100
32 bits 0.0001100110011001100110011001100
6
The Patriot Missile Failure
28
7
Fixed-point representation
n bits
000011001100110011001100
0.00011001100110011001100
8
Floating-point representation
(−1)𝑆 × 𝑀 × 2𝐸
1001.1101 → 1.0011101×23
0.011101 → 1.1101×2-2
-0.011101 → -1×1.1101×2-2
(−1)𝑆 × 𝑀 × 2𝐸
exp = E + bias
bias = 2k-1-1
Implied
leading 1
s exp frac
10
Precision options
Single precision: 32 bits
s exp frac
1 8 bits 23 bits
100011002 1.110110110110100000000002
exp = 13
E ++bias
127 1.11011011011012
1.11011011011012 × 213
15213.010 111011011011012
12
Example 2
Single precision: 32 bits
s1 01111110
exp 01000000000000000000000
frac
011111102 1. 010000000000000000000002
-1.012 × 2-1
-0.62510 -0.1012
13
Denormalized Values
0.0 ??and +0.0
-0.0
s 000…00 all zeros
14
Special Values
-∞ and +∞
s 1 1 1 …1 1 all zeros
15
Summary
• Fixed-point representation
• Floating-point representation
16
James Gosling
Java Inventor
17