10 views

Uploaded by Uma Mahesh

- Dsp Progrtechniques
- The Guide to ImplemeThe guide to implementing 2D platformers - Higher-Order Funnting 2D Platformers - Higher-Order Fun
- Vector Processor
- sample_midtermI-1-2
- dsp 2 amrks
- 02slide Elementary Programming (1)
- A 245791
- Implementation and Comparison of Effective Area Efficient Architectures for CSLA.doc
- Maths Quest 9 - Chapter 1
- week 3 lesson
- Migrate Mysql to SQL Server
- Mathematics
- Chapter 4.2 Algebraic Fractions ENRICH
- binary coding worksheet
- Estimation
- Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race
- The Library Book
- Shoe Dog: A Memoir by the Creator of Nike
- Never Split the Difference: Negotiating As If Your Life Depended On It
- Maybe You Should Talk to Someone: A Therapist, HER Therapist, and Our Lives Revealed

You are on page 1of 23

A method of representation of real numbers that can support a wide range of values. A typical number that can be represented exactly is of the form:

**Significant digits × baseexponent
**

The term floating point refers to the fact that the

radix point can "float" i.e., it can be placed anywhere relative to the significant digits of the number.

Floating point numbers approximate real numbers Floating numbers have large dynamic range .

This standard specifies how single precision (32 bit) and double precision (64 bit) floating point numbers are to be represented. The IEEE 754 has produced a standard for floating point arithmetic. as well as how arithmetic should be carried out on them .

The IEEE 754 standard specifies a binary32 as having: • Sign bit: 1 bit • Exponent width: 8 bits • Significand precision: 24 (23 explicitly stored) The base is 2 .

which is the sign of the significand as well. Sign bit determines the sign of the number. Sign bit=0 if the number is positive =1 if the number is negative .

a bias of ‘127’ is added to the actual exponent in order to get the stored exponent. A stored value of 200 indicates an exponent of (200-127). Exponents of -127 (all 0s) and +128 (all 1s) are reserved for special numbers. . or 73.The exponent field needs to represent both positive and negative exponents. Thus. an exponent of zero means that 127 is stored in the exponent field. To do this.

Also known as ‘Mantissa’ The true significand includes 23 fraction bits to the right of the binary point and an implicit leading bit with value 1 unless the exponent is stored with all zeros. Thus only 23 fraction bits of the significand appear in the memory format but the total precision is 24 bits .

The bits are laid out as follows: sign exponent significand 31 30 23 22 0 .

then v=(. (e) If e=0 and f=0. then v= (. (b) If e=255 and f=0. then v = ( . (d) If e =0 and f=0. (zero). then v= NaN.I)s (c) If 0<e<255.1)s2 -126(0.l)s 0. .1)s2e-127 (1. then v=(.The value of the number represented in single precision format is as follows: (a)If e=255 and f=0.f). f).

floating-point numbers are typically stored in normalized form. This basically puts the radix point after the first non-zero digit. five is represented as 5. . In order to maximize the quantity of representable numbers. In normalized form.0 × 100.

A nice little optimization is available to us in base two. by way of 23 fraction bits. and don't need to represent it explicitly. we can just assume a leading digit of 1. Thus. . the mantissa has effectively 24 bits of resolution. since the only possible non-zero digit is 1. As a result.

The storage format of double precision is as shown sign bit: 1 bit Exponent width:11 bits significand precision: 52 bits(implicit) The bias for exponent is 1023 Sign exponent significand 63 62 52 51 0 .

put the bits in three groups. 1 10000001 10110011001100110011010 First. Bit ‘31’ (the leftmost bit) show the sign of the number. Convert the following single-precision IEEE 754 number into a floating-point decimal value. Bits ‘0-22’ (on the right) give the fraction . Bits ‘23-30’ (the next 8 bits) are the exponent.

otherwise positive. Since this is a single-precision number. 10000001bin = 129ten Remember that we will have to subtract a bias from this exponent to find the power of 2. so the number is negative. If this bit is a 1. Get the exponent and the correct bias. look at the sign bit. the bias is 127. the number is negative. The exponent is simply a positive binary number. . Here this bit is 1.Now.

Binary fractions look like this: 0. Convert the fraction string into base ten.1 = (1/2) = 2-1 0. so conversion is a little different. The binary string represents a fraction.01 = (1/4) = 2-2 0.001 = (1/8) = 2-3 . This is the trickiest step.

.. Note that this number is just an approximation on some decimal number.. There will most likely be some error. for this example. So. we multiply each digit by the corresponding power of 2: 0.. In this case.10110011001100110011010bin = 1/2 + 1/8 + 1/16 + .. the fraction is about 0.10110011001100110011010bin = 1*2-1+ 0*2-2 + 1*2-3 + 1*2-4 + 0*2-5 + 0 * 2-6 + . 0.7000000476837158.

This is all the information we need.8.7000000476837158) * 2 129-127 = -6. We can put these numbers in the expression: (-1)sign bit * (1+fraction) * 2 exponent .bias = (-1)1 * (1. .8 The answer is approximately -6.

Convert 0.101562510 = 0. 0.625 × 2 = 1. 0. 0.40625 0 Generate 0 and continue. . Converting: 0.8125 0 Generate 0 and continue.25 1 Generate 1 and continue with the rest.625 1 Generate 1 and continue with the rest.1015625 to IEEE 32-bit floating point format.00011012.1015625 × 2 = 0. So 0.203125 × 2 = 0.0 1 Generate 1 and nothing remains.5 × 2 = 1.5 0 Generate 0 and continue. 0.8125 × 2 = 1. 0.203125 0 Generate 0 and continue.25 × 2 = 0. 0.40625 × 2 = 0.

exponent is -4 + 127 = 123 = 011110112. Normalize: 0.1015625 is 00111101110100000000000000000000 . sign bit is 0.1012 × 2-4.00011012 = 1. So 0. Mantissa is 10100000000000000000000.

Binary Fractional Numbers “Even” when least significant bit is 0 Half way when bits to right of rounding position = Examples Round to nearest 1/4 (2 bits right of binary point) Value Binary Rounded Action Rounded Value 2 3/32 10.000112 10.002 (1/2—up) 3 2 5/8 10.012 (>1/2—up) 2 1/4 2 7/8 10.001102 10.102 (1/2—down) 2 1/2 100…2 .002 (<1/2—down) 2 2 3/16 10.111002 11.101002 10.

Operands (–1)s1 M1 2E1 (–1)s2 M2 2E2 Assume E1 > E2 + (–1)s1 m1 E1–E2 (–1)s2 m2 (–1)s m Exact Result (–1)s M 2E Sign s. shift M left k positions. increment E if M < 1. decrement E by k Overflow if E out of range Round M to fit frac precision . significand M: ▪ Result of signed align & add Fixing Exponent E: E1 If M ≥ 2. shift M right.

25 x 10 ** 3 + 2.000263 x 10 ** 3 -------------------= 3.25 x 10 ** 3 + 0.3.250263 x 10 ** 3 .63 x 10 ** -1 ----------------first step: align decimal points second step: add 3.

- Dsp ProgrtechniquesUploaded byRajendra Nagar
- The Guide to ImplemeThe guide to implementing 2D platformers - Higher-Order Funnting 2D Platformers - Higher-Order FunUploaded byMarcelo Lima Souza
- Vector ProcessorUploaded byAdedokun Abayomi
- sample_midtermI-1-2Uploaded byVanessa 馨妮 Wu
- dsp 2 amrksUploaded bystartedforfun
- 02slide Elementary Programming (1)Uploaded byAli Ib Tarsha
- A 245791Uploaded bysamarammar908
- Implementation and Comparison of Effective Area Efficient Architectures for CSLA.docUploaded byNsrc Nano Scientifc
- Maths Quest 9 - Chapter 1Uploaded byBella Carr
- week 3 lessonUploaded byapi-252515648
- Migrate Mysql to SQL ServerUploaded byDaud Valentino Simbolon
- MathematicsUploaded byDzoni
- Chapter 4.2 Algebraic Fractions ENRICHUploaded byjuriah binti ibrahim
- binary coding worksheetUploaded byapi-435314926
- EstimationUploaded bymonch1998

- 03 Numeric Basics.pptUploaded byGîrju Andrada
- 2014-05-17 Ntd Files - Structure and Illustration of ContentUploaded byManikantan Thanayath
- Logic DesignCh05Uploaded byAkshay Nagar
- z80Uploaded byTaniaBonitaRosyardi
- Brent Kung Adder 16 bit full codeUploaded byShavel Kumar
- Carry Lookahead MethodUploaded byVijay Panwar
- Bitwise Operators in CUploaded byRamabhadrarao Maddu
- 8085 Lab AssignmentUploaded byShaswata Dutta
- 1 Number SystemsUploaded byDharamvirSinghRathore
- Design 2Uploaded bySeham123123
- Chapter 01Uploaded bygood2000mo2357
- ch04.pptUploaded bySri Gopika
- Number System and CodesUploaded byvsbist
- switching theory logic designUploaded bySobha
- Exp 1- Number System-2017Uploaded byEdwin Quinlat Deviza
- s7 Symbol Table Data TypeUploaded bySandip Dhakal
- 1. Avr Asm Bit Manipulation - Avr Asm IntroductionUploaded byadrian_viso
- Binary to DecimalUploaded bySmilingSana
- Nvidia Cuda Floating PointUploaded bykarimM
- Rangkaian AritmetikaUploaded byamamangh
- RotateUploaded byNooraFukuzawa Nor Fauziah
- modbusPICUploaded bySteevens Garrido
- 08.Digital Introduction.pdfUploaded byMd Arif
- CLA MultiplierUploaded byAniruddh Jain
- Chapter10 ALUUploaded bySaúl García Hernández
- Encoder & Decoder PptUploaded byetasuresh
- Digital Principles UNIT 1 FinalUploaded bySathish Kumar
- lec9adder.pdfUploaded byIndah Kurnia Larasati
- Computer Artihmetic COUploaded byDwarakaManeekantaAddagiri
- 03 Number SystemsUploaded bynshivegowda