FLOATING POINT
IEEE 754
Floating Point Operations
Principle of Floating Point
In scientific notation, n = f x 10^e, where f is called the fraction (or mantissa, or significand) and e is a positive or negative integer called the exponent. The computer version of this notation is called floating point.
Examples:
3.14 = 0.314 x 10^1 = 3.14 x 10^0
0.000001 = 0.1 x 10^-5 = 1.0 x 10^-6
1941 = 0.1941 x 10^4 = 1.941 x 10^3
The range is effectively determined by the number of digits in the exponent.
The precision is determined by the number of digits in the fraction.
More bits for the significand give more accuracy.
More bits for the exponent increase the range.
For binary:
(-1)^sign x significand x 2^exponent
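The binary form above can be illustrated with a short Python sketch. It assumes the standard math module; the helper name decompose is hypothetical and is not part of the slides. It splits a nonzero value into a sign, a significand in the range 1 to 2, and a power-of-two exponent.

```python
import math

def decompose(x: float):
    """Return (sign, significand, exponent) such that
    x == (-1)**sign * significand * 2**exponent, with 1 <= significand < 2
    for nonzero finite x. (Illustrative helper, not a standard function.)"""
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    m, e = math.frexp(abs(x))      # abs(x) == m * 2**e with 0.5 <= m < 1
    if m == 0.0:                   # zero has no normalized form
        return sign, 0.0, 0
    return sign, m * 2.0, e - 1    # rescale so the significand starts with 1

print(decompose(-0.75))  # (1, 1.5, -1)
print(decompose(14.0))   # (0, 1.75, 3)
```

Here 1.5 is 1.1 in binary and 1.75 is 1.11 in binary, so the results match (-1)^sign x significand x 2^exponent.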
Exponential Notation
The following are equivalent representations of 1,234:
123,400.0  x 10^-2
 12,340.0  x 10^-1
  1,234.0  x 10^0
    123.4  x 10^1
     12.34 x 10^2
      1.234 x 10^3
     0.1234 x 10^4
The representations differ in that the decimal place (the point) floats to the left or right (with the appropriate adjustment in the exponent).
Parts of a Floating Point Number
-0.9876 x 10^-3
Sign of mantissa: -
Location of decimal point: 0.9876
Mantissa: 9876
Base: 10
Sign of exponent: -
Exponent: 3
Normalization
A floating point number is said to be normalized if the most significant digit of the mantissa is nonzero. For example, the decimal number 350 is normalized but 00035 is not. Regardless of where the radix point is assumed to be in the mantissa, the number is normalized only if its leftmost digit is nonzero.
Example:
(2.0) x 10^-9     Normalized
(0.2) x 10^-8     Not normalized
(20.0) x 10^-10   Not normalized
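The three forms in the example can be brought to a common normalized form with a small sketch. It assumes the convention used in the example (exactly one nonzero digit before the decimal point) and uses Python's decimal module; the helper name normalize is hypothetical.

```python
from decimal import Decimal

def normalize(mantissa: str, exponent: int):
    """Return (mantissa, exponent) with exactly one nonzero digit before the
    decimal point. Assumes base 10 and a nonzero mantissa."""
    m = Decimal(mantissa)
    if m == 0:
        return m, 0
    shift = m.adjusted()           # power of ten of the most significant digit
    return m.scaleb(-shift), exponent + shift

for mantissa, exponent in [("2.0", -9), ("0.2", -8), ("20.0", -10)]:
    m, e = normalize(mantissa, exponent)
    print(f"{mantissa} x 10^{exponent}  ->  {m} x 10^{e}")
```

All three inputs come out with a mantissa equal to 2 and an exponent of -9, which shows why there is only one normalized form for a given value.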
Floating-Point for Binary Numbers
Floating point:
Computer arithmetic that represents numbers in which the binary point is not fixed
Example: (1.0) x 2^-1
Opposite of fixed-point notation:
Example: 123.456
Computers support floating-point arithmetic
The fractional point is called the "binary point"
Binary Normalized Form
Why normalized form?
Simplifies exchange of data
Simplifies floating-point algorithms
Increases accuracy of numbers
Format: (1.xxxxxx) x 2^yyyyy
How can we convert to normalized form?
Need a base whose exponent can be decreased or increased by exactly the number of bits shifted
Normalized numbers are generally preferable to unnormalized numbers, because there is only one normalized form, whereas there are many unnormalized forms.
IEEE 754 Standard
Most common standard for representing floating point numbers
Single precision: 32 bits, consisting of:
  Sign bit (1 bit)
  Exponent (8 bits)
  Mantissa (23 bits)
Double precision: 64 bits, consisting of:
  Sign bit (1 bit)
  Exponent (11 bits)
  Mantissa (52 bits)
IEEE FP-754 Standard
Greatly improved portability and quality of computer arithmetic
Makes the leading 1 bit of normalized binary numbers implicit
  Significands are thereby expanded by 1 bit
  Significands are 24 bits long for single precision (1 implied + 23 fraction)
  Significands are 53 bits long for double precision (1 implied + 52 fraction)
Zero is represented as 00 ... 00two
Has a symbol (NaN = Not a Number) for invalid operations (e.g. 0/0 or subtracting infinity from infinity)
  Allows programmers to postpone some tests and decisions to a later time in the program
All other numbers are represented using the following formula:
(-1)^s x (1 + Fraction) x 2^E
IEEE FP-754 Standard: biased notation
Represent the most negative exponent as 00 ... 00two
IEEE 754 uses a bias of 127 for single precision
Formula for biased notation:
(-1)^s x (1 + Fraction) x 2^(Exponent - Bias)
Examples:
Unbiased exponent:
-1 will be represented as (-1 + 127) = 126ten = 0111 1110two
-0.75ten = -0.11two = -1.1two x 2^-1
Biased single precision representation
For -0.75ten:
(-1)^1 x (1 + .1000 0000 0000 0000 0000 000two) x 2^(126 - 127)
What will change for double precision?
Single precision bit pattern for -0.75ten (bits 31 ... 0):
1 | 0111 1110 | 1000 0000 0000 0000 0000 000
sign | exponent | fraction
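The -0.75 example can be checked with a short Python sketch. It assumes the standard struct module to obtain the 32-bit encoding; the helper name float_fields is hypothetical and not part of the slides.

```python
import struct

def float_fields(x: float):
    """Split the single-precision encoding of x into
    (sign, biased exponent, 23-bit fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret as uint32
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

s, e, f = float_fields(-0.75)
print(s, e, format(f, "023b"))  # 1 126 10000000000000000000000
```

The biased exponent 126 corresponds to a real exponent of 126 - 127 = -1, as in the formula.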
Single Precision Format
32 bits
Sign of mantissa (1 bit)
Exponent (8 bits)
Mantissa (23 bits)
Single Precision Floating Point (FP) Numbers
Current computer systems dictate that FP numbers must fit in 32- or 64-bit registers. 32-bit or single precision FP numbers are organized as follows:
(-1)^s x Fraction x 2^E
seee eeee emmm mmmm mmmm mmmm mmmm mmmm
where s is the sign of the number, e represents the biased exponent (8 bits) and m represents the mantissa or significand (23 bits).
32-bit values range in magnitude from about 10^-38 to 10^38.
Bit 31: s (sign)
Bits 30 ... 23: exponent
Bits 22 ... 0: significand (mantissa/fraction)
Double Precision Floating Point Numbers
64-bit double precision floating point numbers (value represented in two 32-bit words) are organized as follows:
The MSB is the sign bit
The next 11 bits are the exponent
The remaining 20 + 32 = 52 bits are the significand
The range in magnitude is from about 10^-308 to 10^308
The growth of significand and exponent is a compromise between accuracy and range.
First word: bit 31: s (sign); bits 30 ... 20: exponent; bits 19 ... 0: significand (mantissa/fraction)
Second word: bits 31 ... 0: significand (mantissa/fraction), continued
Normalization
The mantissa is normalized
Has an implied binary point on the left
Has an implied "1" to the left of the binary point
E.g.,
Mantissa:   10100000000000000000000
Represents: 1.101two = 1.625ten
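As a quick check of the implied "1", the stored mantissa bits can be turned into the value they represent. This is a minimal sketch; the function name significand_value is hypothetical.

```python
def significand_value(mantissa_bits: str) -> float:
    """Value of 1.<mantissa_bits> in binary: the implied leading 1 plus the
    stored bits read as a binary fraction."""
    return 1 + int(mantissa_bits, 2) / 2 ** len(mantissa_bits)

print(significand_value("10100000000000000000000"))  # 1.625
```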
Biased Exponent
Exponents can be both positive and negative, giving rise to a need for a sign bit in exponents.
e.g. Exponents ranging from -50 to 49 need 2 digits for the value and one bit for the sign.
A biased exponent eliminates the need for a sign by adding a positive quantity to the exponent so that it is always positive.
Adding 50 to our example exponent makes the range 0 to 99, a value requiring 2 digits and no sign bit.
Excess Notation
To include positive and negative exponents, "excess" notation is used
Single precision: excess 127
Double precision: excess 1023
The value of the exponent stored is larger than the actual exponent
E.g., excess 127,
Exponent:   10000111
Represents: 135 - 127 = 8
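The biasing described above amounts to one addition when storing and one subtraction when reading. A minimal sketch, with hypothetical helper names to_excess and from_excess:

```python
def to_excess(actual: int, bias: int = 127) -> int:
    """Stored (biased) exponent for a given actual exponent."""
    return actual + bias

def from_excess(stored: int, bias: int = 127) -> int:
    """Actual exponent recovered from the stored (biased) exponent."""
    return stored - bias

print(from_excess(0b10000111))                   # 8   (135 - 127)
print(to_excess(-50, bias=50), to_excess(49, bias=50))  # 0 99
print(to_excess(-1, bias=1023))                  # 1022 (double precision)
```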
IEEE FP-754 Standard Example:
Converting the following binary representation into decimal floating point
Bits 31 ... 0:
1 | 1000 0001 | 0100 0000 0000 0000 0000 000
sign | exponent | fraction
(-1)^s x (1 + Fraction) x 2^(Exponent - Bias) = (-1)^1 x (1 + 0.25) x 2^(129 - 127)
= -1 x 1.25 x 2^2
= -1.25 x 4
= -5.0
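The same formula can be evaluated directly from a bit string. This is a sketch that assumes a normalized single-precision pattern (biased exponent neither 0 nor 255); the function name decode_single is hypothetical.

```python
def decode_single(bits: str) -> float:
    """Evaluate (-1)^s x (1 + Fraction) x 2^(Exponent - 127) for a
    32-bit pattern given as a string of 0s and 1s (spaces allowed)."""
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    s = int(bits[0], 2)                       # sign bit
    exponent = int(bits[1:9], 2)              # 8-bit biased exponent
    fraction = int(bits[9:], 2) / 2 ** 23     # 23-bit fraction as a value in [0, 1)
    return (-1) ** s * (1 + fraction) * 2.0 ** (exponent - 127)

print(decode_single("1 10000001 01000000000000000000000"))  # -5.0
```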
Example
Single precision
0 10000010 11000000000000000000000
Sign: 0 = positive mantissa
Exponent: 130 - 127 = 3
Mantissa: 1.11two
+1.11two x 2^3 = 1110.0two = 14.0ten
Hexadecimal
It is convenient and common to represent the original floating point number in hexadecimal
The preceding example:
0 10000010 11000000000000000000000
= 0100 0001 0110 0000 0000 0000 0000 0000two
= 4 1 6 0 0 0 0 0hex
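Regrouping the 32 bits into eight groups of four is mechanical, as this small sketch shows. The helper name bits_to_hex is hypothetical.

```python
def bits_to_hex(bits: str) -> str:
    """Regroup a bit string (spaces allowed) into hexadecimal digits,
    four bits per digit. Assumes the bit count is a multiple of 4."""
    compact = bits.replace(" ", "")
    width = len(compact) // 4
    return format(int(compact, 2), "0{}X".format(width))

print(bits_to_hex("0 10000010 11000000000000000000000"))  # 41600000
```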
Converting from Floating Point
E.g., What decimal value is represented by the following 32-bit floating point number?
C17B0000hex
Step 1
Express in binary and find S, E, and M
C17B0000hex = 1 10000010 11110110000000000000000two
              S E        M
S: 1 = negative, 0 = positive
Step 2
Find "real" exponent, n
n = E - 127 = 10000010two - 127 = 130 - 127 = 3
Step 3
Put S, M, and n together to form binary result (Don't forget the implied "1." on the left of the mantissa.)
-1.1111011two x 2^n = -1.1111011two x 2^3 = -1111.1011two
Step 4
Express result in decimal
-1111.1011two
Integer part: 1111two = 15
Fraction part: 2^-1 + 2^-3 + 2^-4 = 0.5 + 0.125 + 0.0625 = 0.6875
Answer: -15.6875
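The four steps above can be sketched in Python. The function name decode_hex_single is hypothetical, the sketch assumes a normalized number, and the standard struct module is used only as a cross-check.

```python
import struct

def decode_hex_single(hex_word: str):
    """Follow the steps: extract S, E, M, compute n = E - 127, then
    combine them with the implied leading 1."""
    word = int(hex_word, 16)
    s = word >> 31                 # Step 1: sign
    e = (word >> 23) & 0xFF        # Step 1: biased exponent
    m = word & 0x7FFFFF            # Step 1: 23 mantissa bits
    n = e - 127                    # Step 2: real exponent
    value = (-1) ** s * (1 + m / 2 ** 23) * 2.0 ** n   # Steps 3 and 4
    return s, e, n, value

s, e, n, value = decode_hex_single("C17B0000")
print(s, e, n, value)  # 1 130 3 -15.6875

# Cross-check against the machine's own interpretation of the same bytes.
assert value == struct.unpack(">f", bytes.fromhex("C17B0000"))[0]
```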
Converting to Floating Point
E.g., Express 36.5625ten as a 32-bit floating point number (in hexadecimal)
Step 1
Express original value in binary
36.5625ten = 100100.1001two
Step 2
Normalize
100100.1001two = 1.001001001two x 2^5
Step 3
Determine S, E, and M
+1.001001001two x 2^5   (sign S, mantissa M = 001001001, exponent n = 5)
E = n + 127 = 5 + 127 = 132 = 10000100two
S = 0 (because the value is positive)
Step 4
Put S, E, and M together to form 32-bit binary result
0 10000100 00100100100000000000000two
S E        M
Step 5
Express in hexadecimal
0 10000100 00100100100000000000000two = 0100 0010 0001 0010 0100 0000 0000 0000two = 4 2 1 2 4 0 0 0hex
Answer: 42124000hex
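The five steps above can be sketched as follows. The function name encode_single is hypothetical. The sketch assumes a nonzero value in the normalized single-precision range and truncates rather than rounds the mantissa, which is exact for values such as 36.5625.

```python
import math

def encode_single(x: float) -> str:
    """Follow the steps: normalize, bias the exponent, pack S, E, M,
    and express the 32-bit word in hexadecimal."""
    s = 1 if x < 0 else 0                  # Step 3: sign
    m, e = math.frexp(abs(x))              # abs(x) == m * 2**e, 0.5 <= m < 1
    n = e - 1                              # Step 2: exponent of 1.xxx x 2^n
    significand = m * 2                    # Step 2: 1 <= significand < 2
    E = n + 127                            # Step 3: biased exponent
    M = int((significand - 1) * 2 ** 23)   # Step 3: 23 fraction bits (truncated)
    word = (s << 31) | (E << 23) | M       # Step 4: assemble the 32 bits
    return format(word, "08X")             # Step 5: hexadecimal

print(encode_single(36.5625))  # 42124000
```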
Special Numbers in IEEE Floating Point
An all-zero number is a normalized 0
Other numbers with biased exponent e = 0 are called denormalized
Denorm numbers have a hidden bit of 0 and an exponent of -126; they may have leading 0s
Numbers with biased exponent of 255 are used for ±infinity and other special values, called NaN (not a number)
For example, one NaN represents 0/0
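These cases can be told apart from the exponent and fraction fields alone. The sketch below assumes single precision; the function name classify is hypothetical. It also checks that the smallest denormalized number equals 2^-23 x 2^-126 = 2^-149, consistent with a hidden bit of 0 and an exponent of -126.

```python
import struct

def classify(word: int) -> str:
    """Classify a 32-bit single-precision pattern by its fields."""
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"

for word in (0x00000000, 0x00000001, 0x41600000, 0x7F800000, 0x7FC00000):
    print(format(word, "08X"), classify(word))

# Smallest denormalized value: 0.00...01two x 2^-126
smallest = struct.unpack(">f", bytes.fromhex("00000001"))[0]
assert smallest == 2.0 ** -149
```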
IEEE Standard Double Precision Floating Point
Bit 0: sign s
Bits 1 ... 11: exponent
Bits 12 ... 63: fraction f1 f2 ... f52
Exponent bias for normalized #s is 1023
The denorm biased exponent of 0 corresponds to an unbiased exponent of -1022
Infinity and NaNs have a biased exponent of 2047
Range increases from about 10^-38 <= |x| <= 10^38 to about 10^-308 <= |x| <= 10^308
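The double precision layout can be inspected the same way as single precision, with an 11-bit exponent, a 52-bit fraction and a bias of 1023. A minimal sketch using the standard struct module; the helper name double_fields is hypothetical.

```python
import struct

def double_fields(x: float):
    """Split the double-precision encoding of x into
    (sign, biased exponent, 52-bit fraction)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))  # reinterpret as uint64
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction

s, e, f = double_fields(-0.75)
print(s, e, e - 1023)                    # 1 1022 -1
print(double_fields(float("inf"))[1])    # 2047
```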
Floating Point Registers
[Figure: registers]
Floating Point Addition and Subtraction
AC <- AC + BR
AC <- AC - BR
Algorithm:
1. Check for zeros
2. Align the mantissas
3. Add or subtract the mantissas
4. Normalize the result
Floating-Point Addition/Subtraction
Steps
1. Compare exponents
2. Shift the smaller number right until its exponent matches that of the larger number
3. Add/subtract the significands
4. Normalize the sum
5. Round the sum if needed
6. Renormalize, if necessary
Decimal Floating Point Add and Subtract Examples
Operands           Alignment            Normalize & round
 6.144 x 10^2       0.06144 x 10^4      1.003644 x 10^5
+9.975 x 10^4      +9.975   x 10^4     + .0005   x 10^5
                   10.03644 x 10^4      1.004    x 10^5

Operands           Alignment            Normalize & round
 1.076 x 10^-7      1.076  x 10^-7      7.7300 x 10^-9
-9.987 x 10^-8     -0.9987 x 10^-7     + .0005 x 10^-9
                    0.0773 x 10^-7      7.730  x 10^-9
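The addition steps and the two decimal examples above can be sketched in Python. The sketch assumes a decimal significand of 4 digits (as in the examples), round-half-up rounding (equivalent to adding .0005 and truncating), and Python's decimal module for exact decimal arithmetic. The function name fp_add is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def fp_add(a_sig: str, a_exp: int, b_sig: str, b_exp: int, digits: int = 4):
    """Add a_sig x 10^a_exp and b_sig x 10^b_exp, returning a normalized
    (significand, exponent) pair rounded to `digits` significant digits."""
    a, b = Decimal(a_sig), Decimal(b_sig)
    # Steps 1-2: compare exponents and shift the smaller operand right
    exp = max(a_exp, b_exp)
    a = a.scaleb(a_exp - exp)
    b = b.scaleb(b_exp - exp)
    # Step 3: add the significands (a negative b gives subtraction)
    total = a + b
    if total == 0:
        return Decimal(0), 0
    # Step 4: normalize to one nonzero digit before the decimal point
    shift = total.adjusted()
    total = total.scaleb(-shift)
    exp += shift
    # Step 5: round to the available number of digits
    quantum = Decimal(1).scaleb(-(digits - 1))
    total = total.quantize(quantum, rounding=ROUND_HALF_UP)
    # Step 6: renormalize if rounding carried into a new digit (e.g. 9.9996)
    if abs(total) >= 10:
        total = total.scaleb(-1).quantize(quantum, rounding=ROUND_HALF_UP)
        exp += 1
    return total, exp

print(fp_add("6.144", 2, "9.975", 4))     # (Decimal('1.004'), 5)
print(fp_add("1.076", -7, "-9.987", -8))  # (Decimal('7.730'), -9)
```

Subtraction is handled by passing a negative second significand, which mirrors the second example.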