
Fixed-point and floating-point numbers
CS370
Fall 2003

Representations of numbers
Unsigned integers
Signed integers: 1's and 2's complement representation
To represent very large and very small numbers, and real numbers in general:
Fixed-point numbers
Floating-point numbers

Base-10 (decimal) arithmetic


Uses the ten digits 0 to 9
Each column represents a power of 10
Thousands (10^3) column
Hundreds (10^2) column
Tens (10^1) column
Ones (10^0) column

1999 (base 10)

= 1 x 10^3 + 9 x 10^2 + 9 x 10^1 + 9 x 10^0

Base-10 (decimal) arithmetic


Uses the ten digits 0 to 9
Each column represents a power of 10
Tens (10^1) column
Ones (10^0) column
Tenths (10^-1) column
Hundredths (10^-2) column

19.99 (base 10)

= 1 x 10^1 + 9 x 10^0 + 9 x 10^-1 + 9 x 10^-2

Standard binary representation


Uses the two digits 0 and 1
Every column represents a power of 2
Eights (2^3) column
Fours (2^2) column
Twos (2^1) column
Ones (2^0) column

1001 (base 2)

= 1 x 2^3 + 0 x 2^2 + 0 x 2^1 + 1 x 2^0

Fixed-point representation
Uses the two digits 0 and 1
Every column represents a power of 2
Twos (2^1) column
Ones (2^0) column
Halves (2^-1) column
Fourths (2^-2) column

10.01 (base 2)

= 1 x 2^1 + 0 x 2^0 + 0 x 2^-1 + 1 x 2^-2
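The column weights above can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function name fixed_point_value is a made-up helper, and it assumes the input is a string of 0s and 1s with at most one binary point.

```python
def fixed_point_value(bits: str) -> float:
    """Evaluate a binary string such as '10.01' column by column."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Columns left of the point have weights 2^0, 2^1, ... going right to left.
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Columns right of the point have weights 2^-1, 2^-2, ... going left to right.
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(fixed_point_value("1001"))   # 9.0
print(fixed_point_value("10.01"))  # 2.25
```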

Addition
Base-10

  1.25
+ 1.50
------
  2.75

Base-2

   1.01
+  1.10
-------
  10.11
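One common way to carry out fixed-point addition in software is to store each value as an integer scaled by a power of two, so that ordinary integer addition does the work. The sketch below assumes two fraction bits (enough for the 1.25 + 1.50 example); FRACTION_BITS and the helper names are illustrative choices, not something the slides define.

```python
FRACTION_BITS = 2  # assumption: two bits to the right of the binary point


def to_fixed(x: float) -> int:
    """Scale a real value to its integer fixed-point encoding."""
    return round(x * (1 << FRACTION_BITS))


def from_fixed(n: int) -> float:
    """Undo the scaling to recover the real value."""
    return n / (1 << FRACTION_BITS)


def fixed_to_str(n: int) -> str:
    """Show a non-negative encoding as binary digits with the binary point."""
    whole, frac = divmod(n, 1 << FRACTION_BITS)
    return f"{whole:b}.{frac:0{FRACTION_BITS}b}"


a = to_fixed(1.25)
b = to_fixed(1.50)
total = a + b  # plain integer addition; the binary point stays where it is
print(fixed_to_str(a), "+", fixed_to_str(b), "=", fixed_to_str(total))  # 1.01 + 1.10 = 10.11
print(from_fixed(total))  # 2.75
```

Because both operands use the same scale, no alignment step is needed before adding, which is the main attraction of fixed-point arithmetic.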

Range of values in a byte


Lowest exponent   Min   Step     Max       Value of 00110001
 0                0     1        255       49
-1                0     0.5      127.5     24.5
-2                0     0.25     63.75     12.25
-4                0     0.0625   15.9375   3.0625
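The range figures for a byte follow from one rule: the step is 2 raised to the lowest exponent, and every value is an unsigned 8-bit integer times that step. The sketch below recomputes them under that assumption; byte_range and byte_value are hypothetical helper names.

```python
def byte_range(lowest_exponent: int):
    """Return (min, step, max) for an unsigned byte whose last column is 2^lowest_exponent."""
    step = 2.0 ** lowest_exponent
    return 0.0, step, 255 * step


def byte_value(bits: str, lowest_exponent: int) -> float:
    """Interpret an 8-bit pattern as an integer count of steps."""
    return int(bits, 2) * 2.0 ** lowest_exponent


for e in (0, -1, -2, -4):
    lo, step, hi = byte_range(e)
    print(e, lo, step, hi, byte_value("00110001", e))
# 0 0.0 1.0 255.0 49.0
# -1 0.0 0.5 127.5 24.5
# -2 0.0 0.25 63.75 12.25
# -4 0.0 0.0625 15.9375 3.0625
```

Moving the binary point left buys finer steps at the cost of a smaller maximum; the number of distinct values stays at 256.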

Scientific notation (1)


One billion
= 1,000,000,000
= 1 x 10^9
significand or mantissa: 1
base or radix: 10
exponent: 9

Scientific notation (2)


1999
= 1.999 x 10^3
significand or mantissa: 1.999
base or radix: 10
exponent: 3

= 19.99 x 10^2
= 199.9 x 10^1

Practice (base 10)


258 = 2.58 x 10^2
Mantissa = 2.58
Radix = 10
Exponent = 2

24.25 = 2.425 x 10^1
Mantissa = 2.425
Radix = 10
Exponent = 1
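The practice answers can be checked with Python's decimal module, which tracks decimal digits exactly. This is a sketch only; the function name scientific is an assumption, and it returns the significand in normalized form with one nonzero digit before the decimal point.

```python
from decimal import Decimal


def scientific(text: str):
    """Split a decimal literal into (significand, exponent) with radix 10."""
    d = Decimal(text)
    exponent = d.adjusted()             # power of ten of the leading digit
    significand = d.scaleb(-exponent)   # shift the point to just after that digit
    return significand, exponent


for literal in ("258", "24.25", "1999"):
    significand, exponent = scientific(literal)
    print(f"{literal} = {significand} x 10^{exponent}")
# 258 = 2.58 x 10^2
# 24.25 = 2.425 x 10^1
# 1999 = 1.999 x 10^3
```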

Base-2 scientific notation


2.25 (base 10)
= 10.01 (base 2)
= 10.01 (base 2) x 2^0
= 1.001 (base 2) x 2^1 (normalized)
Numbers are usually normalized, which means that the leading bit is always a 1.
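Normalization in base 2 can be sketched with math.frexp from the standard library, which returns a fraction in [0.5, 1) and a power of two; shifting by one position gives the "leading bit is 1" form used here. The helper names and the choice of three fraction bits for display are assumptions for illustration, and the sketch only covers positive inputs.

```python
import math


def normalize_base2(x: float):
    """Return (significand, exponent) with 1 <= significand < 2, for x > 0."""
    m, e = math.frexp(x)   # x == m * 2**e with 0.5 <= m < 1
    return m * 2, e - 1    # move the binary point one place to the right


def significand_bits(significand: float, places: int = 3) -> str:
    """Write a normalized significand as binary digits, truncated to `places` bits."""
    frac = significand - 1  # the leading bit is always 1 after normalization
    bits = ""
    for _ in range(places):
        frac *= 2
        bit = int(frac)
        bits += str(bit)
        frac -= bit
    return "1." + bits


s, e = normalize_base2(2.25)
print(f"{significand_bits(s)} x 2^{e}")  # 1.001 x 2^1
```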

8-bit floating-point format (1)

sign: 1 bit
exponent: 3 bits
significand: 4 bits

exponent   significand   number (base 2)   number (base 10)
001        1001          1.001 x 2^1       2.25
011        1100          1.1 x 2^3         12.0
111        1110          1.11 x 2^7        224.0
001        1110          1.11 x 2^1        3.5

Improvements
Bias the exponent
Always subtract a fixed amount, e.g., 3, from the stored exponent
Allows representation of negative exponents

Implicit one
The leading one in a phone number such as 1-619-556-0231 is redundant.
Why use a bit for the leading one?

8-bit floating-point format (2)


Exponent (3 bits) is biased by 3
The leading one of the significand is implicit
Zero is represented by all zeros

sign (1 bit)   exponent (3 bits)   significand (4 bits)   number (base 2)   number (base 10)
0              100                 0010                   1.001 x 2^1       2.25
0              011                 1000                   1.1 x 2^0         1.5
0              111                 1100                   1.11 x 2^4        28.0
1              001                 1100                   -1.11 x 2^-2      -0.4375
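The decoding rules for this toy format (bias of 3, implicit leading one, all zeros meaning zero) can be written out as a short function. This is a sketch under stated assumptions: decode8 is a made-up name, the fields are passed as bit strings, and the optional parameters let the same function decode the earlier unbiased format with an explicit leading one. How a pattern with only the sign bit set should be treated is not specified in the slides; here it is decoded like any other pattern.

```python
def decode8(sign: str, exponent: str, significand: str,
            bias: int = 3, implicit_one: bool = True) -> float:
    """Decode a 1/3/4-bit toy floating-point pattern given as bit strings."""
    if implicit_one and set(sign + exponent + significand) == {"0"}:
        return 0.0  # the all-zero pattern is reserved for zero
    if implicit_one:
        # Stored bits are the fraction only; the leading 1 is not stored.
        mantissa = 1 + int(significand, 2) / 2 ** len(significand)
    else:
        # The first stored bit is the leading digit: 1001 means 1.001.
        mantissa = int(significand, 2) / 2 ** (len(significand) - 1)
    value = mantissa * 2.0 ** (int(exponent, 2) - bias)
    return -value if sign == "1" else value


print(decode8("0", "100", "0010"))                               # 2.25
print(decode8("0", "000", "0000"))                               # 0.0
print(decode8("0", "001", "1001", bias=0, implicit_one=False))   # 2.25
```

Both encodings of 2.25 use the same eight bits of storage, but the biased, implicit-one version frees one significand bit and makes room for negative exponents.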

IEEE standard floating-point


Single precision
32 bits
sign: 1 bit
exponent: 8 bits
significand: 23 bits

Bias: 127

Double precision
64 bits
sign: 1 bit
exponent: 11 bits
significand: 52 bits

Bias: 1023
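The single-precision field widths can be seen directly by packing a value into 32 bits with the standard struct module and masking out each field. The sketch below makes a few assumptions: the function names are illustrative, and from_fields only handles normalized values (zero, subnormals, infinities, and NaN are left out).

```python
import struct

BIAS = 127  # single-precision exponent bias


def float32_fields(x: float):
    """Return (sign, biased exponent, fraction) of x stored as IEEE single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with the bias added
    fraction = bits & 0x7FFFFF        # 23 bits, leading one not stored
    return sign, exponent, fraction


def from_fields(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normalized value from its three fields."""
    significand = 1 + fraction / 2 ** 23
    return (-1) ** sign * significand * 2.0 ** (exponent - BIAS)


fields = float32_fields(-12.0)
print(fields)                # (1, 130, 4194304)
print(from_fields(*fields))  # -12.0
```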


Practice (base 10)

13 = 1.3 x 10^1
= 1.101 x 2^3

1.25 = 1.25 x 10^0
= 1.010 x 2^0

Single-precision bit layout, bit positions 31 (leftmost) down to 0 (rightmost):
sign in bit 31, exponent in bits 30 to 23, mantissa in bits 22 to 0
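To fill in the 32 bit positions for the practice values 13 and 1.25, one option is to let Python produce the single-precision pattern and split it into the three fields. This is a sketch for checking hand-worked answers; float32_bit_string is a hypothetical helper name.

```python
import struct


def float32_bit_string(x: float) -> str:
    """Return the single-precision pattern of x as 'sign exponent mantissa'."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = f"{bits:032b}"                  # bit 31 first, bit 0 last
    return f"{s[0]} {s[1:9]} {s[9:]}"   # 1 + 8 + 23 bits


print(float32_bit_string(13.0))  # 0 10000010 10100000000000000000000
print(float32_bit_string(1.25))  # 0 01111111 01000000000000000000000
```

For 13 the exponent field holds 3 + 127 = 130 and the mantissa starts with 101, the bits after the implicit leading one of 1.101; for 1.25 the exponent field holds 0 + 127 = 127 and the mantissa starts with 01.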
