You are on page 1of 12

Unit-2

Floating Point Numbers


• Fixed Point Numbers
• Binary point at the right end of the number
• Integer
• Range is 0 to using 32 bits
• Binary point is just to the right of the sign bit
• Fraction
• Range is  4.55 x 10-10 to 1
• Fixed point Numbers cannot represent very large numbers
and very small numbers, which is made possible by
representing the numbers as floating point.
• Because the position of the binary point in a floating point
number is variable, it must be given explicitly in the
floating point representation.
Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University
IEEE Standard for Floating point numbers


• General form: X1.X2X3X4X5X6X7 x 10 Y1Y2

• Xi and Yi are decimal digits.


• 7 significand digits and the exponent range ( 99) are sufficient
for a wide range of scientific calculations.
• A 23-bit mantissa can approximately represent a 7-digit decimal
number.
• 8-bits are needed to represent exponent.
• One sign bit.
• Since, the leading nonzero bit of a normalized binary mantissa
must be a 1, it does not have to be included explicitly in the
representation.
• Hence, a total of 32 bits is needed.

Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University


Normal Form
• There are many different floating point
number representations of the same
number
• Need for a unified representation in a given
computer
• The most significant position of the mantissa
contains a non-zero digit

Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University


IEEE Standard Floating point format
• Single Precision

Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University


IEEE standard floating point formats
• Double Precision

Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University


Floating Point Numbers
• 8-bit exponent -> -127 to +128
• Biased exponent -> 0 to 255.
• The end values of the range 0 and 255 are used to
represent special values.
• Therefore, actual exponent range: -126 to 127
• If required exponent is less than -126 -> underflow
• If required exponent is greater than +127 ->
overflow

Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University


Floating Point Numbers
• When both exponent and mantissa are 0 -> value is 0
• When exponent is 255 and mantissa is 0 -> value is ∞
(divide by 0)
• When exponent is 0 and mantissa ≠ 0 -> denormal
numbers are represented
• Therefore, they are smaller than the smallest normal
number.
• When exponent is 255 and mantissa ≠ 0, the value is
NaN (Not a Number)
• NaN is the result of performing an invalid operation
such as 0/0 or √(-1)
Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University
Floating Point Numbers - IEEE format - Examples

Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University


Floating Point Numbers
Represent -12.625 in single precision IEEE-754 format.
• Step #1: Convert to target base. -12.625 = -1100.1012
• Step #2: Normalize. -1100.1012 = -1.1001012 × 23
• Step #3: Fill in bit fields. Sign is negative, so sign bit is 1.
Exponent is
in excess 127 (not excess 128!), so exponent is represented
as the
unsigned integer 3 + 127 = 130. Leading 1 of significant is
hidden, so
final bit pattern is:
1 1000 0010 . 1001 0100 0000 0000 0000 000
Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University
Floating Point Numbers
• Represent -12.625 in double precision IEEE-754
format.
• Represent 178.1875 and 0.0625 in single
precision and double precision IEEE-754 format.
• . Determine the value of 1 10101011
11100101000000000000000 in decimal.
• Determine the value of 0 11101011010
11101010100000000000000000000000000000
00000000000000

Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University


Dr. J.Saira Banu, Associate Professor, SCOPE, VIT University

You might also like