The document discusses floating point numbers and the IEEE 754 standard for floating point representation. It describes:
1) Fixed point and floating point numbers, with fixed point having limited range while floating point can represent very large and small numbers.
2) The general form behind the IEEE 754 floating point standard, which represents a number as a mantissa and an exponent in the form X.X x 10^Y.
3) Single and double precision IEEE 754 formats which use 32 and 64 bits respectively, with bits divided among sign, exponent and mantissa fields.
Original Title
13-2-Floating Point Representation With IEEE Standards-18!08!2021 [18-Aug-2021]Material_I_18!08!2021_lecture6
Fixed Point Numbers
• Binary point at the right end of the number: the value is an integer, and with 32 bits (one of them the sign bit) the range of magnitudes is 0 to about 2.15 x 10^9 (2^31 - 1).
• Binary point just to the right of the sign bit: the value is a fraction, and the range of magnitudes is about 4.66 x 10^-10 (2^-31) to 1.
• Fixed point numbers cannot represent very large or very small numbers; representing numbers in floating point makes this possible.
• Because the position of the binary point in a floating point number is variable, it must be given explicitly in the floating point representation.
Dr.J.Saira Banu, Associate Professor, SCOPE, VIT University
IEEE Standard for Floating point numbers
• General form: ±X1.X2X3X4X5X6X7 x 10^(±Y1Y2)
• Xi and Yi are decimal digits.
• 7 significand digits and the exponent range (±99) are sufficient for a wide range of scientific calculations.
• A 23-bit mantissa can approximately represent a 7-digit decimal number.
• 8 bits are needed to represent the exponent.
• One sign bit.
• Since the leading nonzero bit of a normalized binary mantissa must be a 1, it does not have to be included explicitly in the representation.
• Hence, a total of 32 bits is needed.
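The claim that a 23-bit mantissa covers roughly 7 decimal digits can be checked with a short calculation. This is only a sketch; the names `decimal_digits`, `stored`, `effective` and `total_bits` are illustrative and do not come from the slides.

```python
import math

MANTISSA_BITS = 23                     # fraction bits actually stored
SIGNIFICAND_BITS = MANTISSA_BITS + 1   # plus the implicit leading 1

def decimal_digits(bits):
    """Number of decimal digits that `bits` binary digits can distinguish."""
    return bits * math.log10(2)

stored = decimal_digits(MANTISSA_BITS)        # about 6.92
effective = decimal_digits(SIGNIFICAND_BITS)  # about 7.22
total_bits = 1 + 8 + MANTISSA_BITS            # sign + exponent + mantissa

print(f"{stored:.2f} digits stored, {effective:.2f} with the hidden bit")
print(f"{total_bits} bits in total")
```

Counting the hidden bit, the significand carries slightly more than 7 decimal digits, which is why 7 digits is a reasonable rule of thumb for single precision.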
Normal Form
• The same number can be written in many different floating point representations.
• A given computer therefore needs a single, unified representation.
• In normal form, the most significant position of the mantissa contains a non-zero digit.
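As an illustration of normal form in binary, the sketch below rewrites a finite nonzero value as sign x significand x 2^exponent with 1 <= significand < 2. It relies on Python's standard `math.frexp`; the function name `normalize` is a hypothetical helper chosen for this example.

```python
import math

def normalize(x):
    """Return (sign, significand, exponent) with 1 <= significand < 2."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only finite nonzero values have a normal form")
    sign = -1 if x < 0 else 1
    # frexp gives abs(x) == m * 2**e with 0.5 <= m < 1
    m, e = math.frexp(abs(x))
    # shift one binary place so the leading digit sits left of the point
    return sign, m * 2, e - 1

print(normalize(-12.625))  # (-1, 1.578125, 3), i.e. -1.100101 (binary) x 2^3
```

Because the leading digit of a normalized binary significand is always 1, only the fractional part needs to be stored.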
IEEE Standard Floating point format
• Single Precision: 1 sign bit, 8-bit exponent (excess-127), 23-bit mantissa; 32 bits in total
IEEE standard floating point formats
• Double Precision: 1 sign bit, 11-bit exponent (excess-1023), 52-bit mantissa; 64 bits in total
Floating Point Numbers
• The 8-bit exponent field stores a biased (excess-127) exponent in the range 0 to 255, which corresponds to actual exponents from -127 to +128.
• The end values of the range, 0 and 255, are used to represent special values.
• Therefore, the actual exponent range for normal numbers is -126 to +127.
• If the required exponent is less than -126 -> underflow
• If the required exponent is greater than +127 -> overflow
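The relationship between the stored (biased) exponent and the actual exponent can be sketched with Python's standard `struct` module, which packs a value as a 32-bit single precision number. The helper names `fields` and `actual_exponent` are assumptions made for this example.

```python
import struct

BIAS = 127  # excess-127 encoding used by single precision

def fields(x):
    """Split x (rounded to single precision) into (sign, biased_exponent, mantissa)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    biased = (bits >> 23) & 0xFF      # 8 bits
    mantissa = bits & 0x7FFFFF        # 23 bits
    return sign, biased, mantissa

def actual_exponent(biased):
    """Convert a biased exponent of a normal number to the actual exponent."""
    if biased in (0, 255):
        raise ValueError("0 and 255 are reserved for special values")
    return biased - BIAS

sign, biased, mantissa = fields(-12.625)
print(sign, biased, actual_exponent(biased))  # 1 130 3
```

The smallest and largest usable biased exponents, 1 and 254, map to -126 and +127, which matches the range given above.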
Floating Point Numbers
• When both exponent and mantissa are 0 -> value is 0
• When exponent is 255 and mantissa is 0 -> value is ∞, with the sign given by the sign bit (for example, the result of dividing a nonzero number by 0)
• When exponent is 0 and mantissa ≠ 0 -> denormal numbers are represented
• Denormal numbers have no implicit leading 1 and use a fixed exponent of -126; therefore, they are smaller in magnitude than the smallest normal number.
• When exponent is 255 and mantissa ≠ 0, the value is NaN (Not a Number)
• NaN is the result of performing an invalid operation such as 0/0 or √(-1)
Floating Point Numbers - IEEE format - Examples
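As a first example, the special-value rules for single precision can be written as a small classifier over the three bit fields. This is a minimal sketch; the function name `classify` and the category labels are illustrative choices, not part of the standard.

```python
def classify(exponent, mantissa):
    """Classify a single precision value from its 8-bit biased exponent
    and 23-bit mantissa fields (the sign bit does not affect the class)."""
    if not (0 <= exponent <= 255 and 0 <= mantissa < 2**23):
        raise ValueError("field out of range for single precision")
    if exponent == 0:
        return "zero" if mantissa == 0 else "denormal"
    if exponent == 255:
        return "infinity" if mantissa == 0 else "NaN"
    return "normal"

for exp, man in [(0, 0), (255, 0), (0, 1), (255, 1), (130, 0x4A0000)]:
    print(exp, man, classify(exp, man))
```

Only the two reserved exponent values, 0 and 255, need special handling; every other exponent denotes a normal number.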
Floating Point Numbers
Represent -12.625 in single precision IEEE-754 format.
• Step #1: Convert to target base. -12.625 = -1100.101 (base 2)
• Step #2: Normalize. -1100.101 (base 2) = -1.100101 (base 2) × 2^3
• Step #3: Fill in bit fields. Sign is negative, so sign bit is 1. Exponent is in excess 127 (not excess 128!), so exponent is represented as the unsigned integer 3 + 127 = 130. Leading 1 of significand is hidden, so final bit pattern is:
1 10000010 10010100000000000000000
Floating Point Numbers
• Represent -12.625 in double precision IEEE-754 format.
• Represent 178.1875 and 0.0625 in single precision and double precision IEEE-754 format.
• Determine the value of 1 10101011 11100101000000000000000 in decimal.
• Determine the value of 0 11101011010 11101010100000000000000000000000000000 00000000000000
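The worked example for -12.625 can be checked mechanically. The sketch below encodes a value with Python's standard `struct` module and decodes a normal single precision bit pattern by hand using (-1)^sign × 1.fraction × 2^(exponent - 127). The names `encode_single` and `decode_single` are hypothetical helpers, and the decoder deliberately leaves out zeros, denormals, infinities and NaN.

```python
import struct

def encode_single(x):
    """Return the single precision pattern of x as 'sign exponent mantissa'."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return f"{s[0]} {s[1:9]} {s[9:]}"

def decode_single(pattern):
    """Decode a normal single precision pattern (spaces between fields allowed)."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 32 binary digits")
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:9], 2)      # biased exponent
    fraction = int(bits[9:], 2)       # 23 stored mantissa bits
    if exponent in (0, 255):
        raise ValueError("special values are not handled in this sketch")
    # restore the hidden leading 1 and remove the bias of 127
    return sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)

print(encode_single(-12.625))  # 1 10000010 10010100000000000000000
print(decode_single("1 10000010 10010100000000000000000"))  # -12.625
```

The same two helpers can be used to check answers to the single precision exercises; the double precision ones would need the 11-bit exponent, the 52-bit mantissa and a bias of 1023 instead.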