
Chapter 6.

Arithmetic
CENG 2213
Computer Organization
Instructor: Dr. Shen

1
Outline
A basic operation in all digital computers is
the addition or subtraction of two numbers.
 ALU – AND, OR, NOT, XOR
 Unsigned/signed numbers
 Addition/subtraction
 Multiplication
 Division
 Floating-point operations
Adders

Addition of Unsigned Numbers
– Half Adder

(a) The four possible cases:

   x     0    0    1    1
  +y    +0   +1   +0   +1
  c s   00   01   01   10
        (c = Carry, s = Sum)

(b) Truth table:

   x  y | c  s
   0  0 | 0  0
   0  1 | 0  1
   1  0 | 0  1
   1  1 | 1  0

(c) Circuit: s = x XOR y, c = x AND y.
(d) Graphical symbol: a block labeled HA with inputs x, y and outputs s, c.
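The two gates in (c) can be checked against the truth table in (b) with a few lines of Python (the function name `half_adder` is ours, not from the text):

```python
def half_adder(x, y):
    """Half adder: sum is the XOR of the inputs, carry is the AND."""
    s = x ^ y        # sum bit
    c = x & y        # carry bit
    return c, s

# Reproduce the truth table from the slide.
for x, y, c, s in [(0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 0, 1), (1, 1, 1, 0)]:
    assert half_adder(x, y) == (c, s)
```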
Addition and Subtraction of
Signed Numbers

 x_i  y_i  Carry-in c_i | Sum s_i  Carry-out c_{i+1}
 0    0    0            |  0        0
 0    0    1            |  1        0
 0    1    0            |  1        0
 0    1    1            |  0        1
 1    0    0            |  1        0
 1    0    1            |  0        1
 1    1    0            |  0        1
 1    1    1            |  1        1

(with ' denoting complement)
s_i = x_i'y_i'c_i + x_i'y_i c_i' + x_i y_i'c_i' + x_i y_i c_i = x_i XOR y_i XOR c_i
c_{i+1} = y_i c_i + x_i c_i + x_i y_i

Example:

   X =  7 =    0 1 1 1
 + Y = +6 =  + 0 1 1 0
  carries:     1 1 0 0    (the carry-out c_{i+1} of each stage is the carry-in c_i of the next)
   Z = 13 =    1 1 0 1

Legend for stage i: inputs x_i, y_i, carry-in c_i; outputs s_i, carry-out c_{i+1}.

Figure 6.1. Logic specification for a stage of binary addition.
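The two equations above can be exercised directly; a minimal sketch (function name `full_adder` is ours):

```python
def full_adder(x, y, c_in):
    """One stage of binary addition: s = x XOR y XOR c_in,
    c_out = y*c_in + x*c_in + x*y (the majority function)."""
    s = x ^ y ^ c_in
    c_out = (y & c_in) | (x & c_in) | (x & y)
    return c_out, s

# Check all eight rows of the truth table in Figure 6.1:
# the pair (c_out, s) must equal the 2-bit sum x + y + c_in.
for x in (0, 1):
    for y in (0, 1):
        for c in (0, 1):
            c_out, s = full_adder(x, y, c)
            assert 2 * c_out + s == x + y + c
```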
Addition and Subtraction of
Signed Numbers
A full adder (FA)

(a) Logic for a single stage: the sum s_i is formed by two cascaded XOR
gates computing x_i XOR y_i XOR c_i; the carry-out c_{i+1} is formed by a
two-level AND-OR network realizing y_i c_i + x_i c_i + x_i y_i.
Graphical symbol: a block labeled FA with inputs x_i, y_i, c_i and
outputs s_i, c_{i+1}.
Addition and Subtraction of
Signed Numbers
 n-bit ripple-carry adder
 Overflow?

(b) An n-bit ripple-carry adder: n full-adder (FA) stages cascaded from the
least significant bit (LSB) position to the most significant bit (MSB)
position.  Stage i takes x_i, y_i and the incoming carry c_i and produces
s_i and c_{i+1}; the external carry-in is c_0 and the final carry-out is c_n.
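Chaining the full-adder stage bit by bit gives the ripple-carry structure; a sketch, with signed overflow detected as the XOR of the carry out of and into the MSB stage (names are ours):

```python
def ripple_carry_add(x_bits, y_bits, c0=0):
    """n-bit ripple-carry adder.  Bit lists are LSB-first.
    Returns (sum bits LSB-first, carry-out c_n, signed-overflow flag)."""
    s, c = [], c0
    c_into_msb = c0
    for x, y in zip(x_bits, y_bits):
        c_into_msb = c                      # carry entering this stage
        s.append(x ^ y ^ c)
        c = (x & y) | (x & c) | (y & c)
    overflow = c ^ c_into_msb               # c_n XOR c_{n-1}
    return s, c, overflow

# 7 + 6 = 13 in 4 bits: 0111 + 0110 = 1101 (lists are LSB-first)
s, c_out, ovf = ripple_carry_add([1, 1, 1, 0], [0, 1, 1, 0])
assert s == [1, 0, 1, 1] and c_out == 0
assert ovf == 1   # as signed 4-bit numbers, 7 + 6 overflows
```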
Addition and Subtraction of
Signed Numbers
 kn-bit ripple-carry adder

(c) Cascade of k n-bit adders: a kn-bit adder built from k n-bit adder
blocks, with the carry-out of each block (c_n, c_2n, ...) feeding the
carry-in of the next, from c_0 up to the final carry-out c_kn.  The blocks
produce the sum bits s_0..s_{n-1}, s_n..s_{2n-1}, ..., s_{(k-1)n}..s_{kn-1}.

Figure 6.2. Logic for addition of binary vectors.
Addition and Subtraction of
Signed Numbers
 Addition/subtraction logic unit

Each y_i is XORed with the Add/Sub control line before entering the n-bit
adder, and the same control line drives c_0.  With Add/Sub = 0 the adder
computes X + Y; with Add/Sub = 1 it computes X + Y' + 1, i.e. the
2's-complement subtraction X - Y.

Figure 6.3. Binary addition-subtraction logic network.
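The XOR-and-carry-in trick of Figure 6.3 can be modeled in a few lines (a sketch; names are ours):

```python
def add_sub(x_bits, y_bits, sub):
    """Figure 6.3 idea: XOR each y_i with the Add/Sub control line and
    feed the same line into c0, so subtraction becomes X + Y' + 1."""
    y_eff = [b ^ sub for b in y_bits]       # complement Y when subtracting
    s, c = [], sub                          # c0 = Add/Sub control line
    for x, y in zip(x_bits, y_eff):
        s.append(x ^ y ^ c)
        c = (x & y) | (x & c) | (y & c)
    return s

# 7 - 6 = 1 in 4 bits: 0111 - 0110 = 0001 (bit lists LSB-first)
assert add_sub([1, 1, 1, 0], [0, 1, 1, 0], sub=1) == [1, 0, 0, 0]
# With sub = 0 the same circuit just adds: 7 + 6 = 13 = 1101
assert add_sub([1, 1, 1, 0], [0, 1, 1, 0], sub=0) == [1, 0, 1, 1]
```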
Make Addition Faster

Ripple-Carry Adder (RCA)
 Straightforward design
 Simple circuit structure
 Easy to understand
 Most power-efficient
 Slowest (long critical path)
Adders
 We can view addition in terms of generate,
G[i], and propagate, P[i].

Carry-lookahead Logic
Carry Generate   Gi = Ai Bi        a carry must be generated when Ai = Bi = 1

Carry Propagate  Pi = Ai XOR Bi    the carry-in passes through to the carry-out

Sum and Carry can be re-expressed in terms of generate/propagate/Ci:

Si = Ai XOR Bi XOR Ci = Pi XOR Ci

Ci+1 = Ai Bi + Ai Ci + Bi Ci
     = Ai Bi + Ci (Ai + Bi)
     = Ai Bi + Ci (Ai XOR Bi)
     = Gi + Ci Pi
Carry-lookahead Logic
Reexpress the carry logic as follows:

C1 = G0 + P0 C0

C2 = G1 + P1 C1 = G1 + P1 G0 + P1 P0 C0

C3 = G2 + P2 C2 = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 C0

C4 = G3 + P3 C3 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0 + P3 P2 P1 P0 C0

Each of the carry equations can be implemented in a two-level logic
network

Variables are the adder inputs and carry in to stage 0!

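The expanded carry equations above can be checked against a direct recurrence; a sketch (names are ours):

```python
def lookahead_carries(a_bits, b_bits, c0):
    """Carries C1..Cn from G_i = A_i B_i and P_i = A_i XOR B_i,
    following C_{i+1} = G_i + P_i C_i (inputs are LSB-first)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    carries = [c0]
    for i in range(len(a_bits)):
        # C_{i+1} = G_i + P_i C_i; fully expanded, each carry is a
        # two-level function of the adder inputs and C0 only.
        carries.append(g[i] | (p[i] & carries[i]))
    return carries

# Carries for 0111 + 0110 with C0 = 0 (LSB-first): C1=0, C2=1, C3=1, C4=0
assert lookahead_carries([1, 1, 1, 0], [0, 1, 1, 0], 0) == [0, 0, 1, 1, 0]
```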
Carry-lookahead
Implementation
Adder with Propagate and Generate Outputs: Pi = Ai XOR Bi and Gi = Ai Bi
are each available after 1 gate delay, and Si = Pi XOR Ci after 2 more gate
delays.  The carries C1, C2, C3, C4 are realized with increasingly complex
two-level AND-OR networks whose inputs are the Pi, Gi signals and C0.
Carry-lookahead Logic
Cascaded Carry Lookahead (4-bit example; @ gives the gate delay at which
each signal is valid): the carry-lookahead logic generates the individual
carries C1, C2, C3, C4 in parallel, each at 3 gate delays.  S0 is ready at
2 gate delays and S1, S2, S3 at 4 gate delays — the sums are computed much
faster than with rippling carries.
Carry-lookahead Logic
A 16-bit adder built from four 4-bit adders: the blocks take operand
slices x15-12/y15-12, x11-8/y11-8, x7-4/y7-4, x3-0/y3-0 with carries c0,
c4, c8, c12, c16 between them, and produce s15-12, s11-8, s7-4, s3-0.
Each 4-bit adder also produces first-level group signals G3^I P3^I, G2^I
P2^I, G1^I P1^I, G0^I P0^I, which feed a carry-lookahead logic block that
in turn produces the second-level signals G0^II and P0^II.

Figure 6.5. 16-bit carry-lookahead adder built from 4-bit adders (see Figure 6.4b).
Carry-lookahead Logic
Timing (@ gives the gate delay at which each signal is valid): the four
4-bit adders use internal carry lookahead, and a second-level Lookahead
Carry Unit extends lookahead to 16 bits.  Each group's P is valid at 3
gate delays and its G at 5; the unit delivers the group carries C4, C8,
C12 and C16 at 5 delays, so the last sum bits S[15-12] are valid at 8
delays (S[3-0] already at 4).

Group Propagate  P = P3 P2 P1 P0
Group Generate   G = G3 + G2 P3 + G1 P3 P2 + G0 P3 P2 P1
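The group propagate/generate formulas above translate directly into code; a sketch for one 4-bit group (names are ours):

```python
def group_pg(p, g):
    """Second-level lookahead signals for one 4-bit group, from its
    per-bit P_i and G_i (LSB-first lists):
    P = P3 P2 P1 P0 and G = G3 + G2 P3 + G1 P3 P2 + G0 P3 P2 P1."""
    P = p[0] & p[1] & p[2] & p[3]
    G = (g[3] | (g[2] & p[3]) | (g[1] & p[3] & p[2])
         | (g[0] & p[3] & p[2] & p[1]))
    return P, G

# A group that propagates but does not generate: all P_i = 1, all G_i = 0
assert group_pg([1, 1, 1, 1], [0, 0, 0, 0]) == (1, 0)
# A carry generated at bit 0 is killed before leaving the group if P1 = 0
assert group_pg([1, 0, 1, 1], [1, 0, 0, 0]) == (0, 0)
```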
Unsigned
Multiplication

Manual Multiplication
Algorithm

1 1 0 1 (13)  Multiplicand M

× 1 0 1 1 (11)  Multiplier Q

1 1 0 1

1 1 0 1

0 0 0 0

1 1 0 1

1 0 0 0 1 1 1 1 (143)  Product P

(a) Manual multiplication algorithm

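The manual algorithm above — add a shifted copy of the multiplicand for every 1 bit of the multiplier — can be sketched as (names are ours):

```python
def multiply_unsigned(m, q, n=4):
    """Manual multiplication algorithm: for each multiplier bit q_i that
    is 1, add the multiplicand shifted left by i into the product."""
    product = 0
    for i in range(n):
        if (q >> i) & 1:
            product += m << i
    return product

assert multiply_unsigned(0b1101, 0b1011) == 143   # 13 x 11, as in (a)
```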
Array Multiplication

The multiplicand m3..m0 and multiplier q3..q0 feed an array of cells.
Partial product 0 (PP0) is formed by ANDing q0 with the multiplicand
bits; each following row adds the terms mj qi into the incoming partial
product, producing PP1, PP2, PP3 and finally
PP4 = p7, p6, ..., p0 = Product.

Typical cell: inputs are a bit of the incoming partial product (PPi),
mj, qi, and a carry-in; a full adder (FA) adds mj AND qi to the incoming
bit, sends the carry-out to the cell on its left, and passes a bit of
the outgoing partial product [PP(i+1)] downward.

(b) Array implementation
X3 X2 X1 X0
Y3 Y2 Y1 Y0
X3Y0 X2Y0 X1Y0 X0Y0
X3Y1 X2Y1 X1Y1 X0Y1
X3Y2 X2Y2 X1Y2 X0Y2
X3Y3 X2Y3 X1Y3 X0Y3
P7 P6 P5 P4 P3 P2 P1 P0

Another Version of 4×4 Array
Multiplier

Array Multiplication
 What is the critical path (worst case signal
propagation delay path)?
 Assuming that there are two gate delays from
the inputs to the outputs of a full adder block,
the path has a total of 6(n-1)-1 gate delays,
including the initial AND gate delay in all
cells, for the n×n array.
 Any advantages/disadvantages?

Sequential Circuit Binary
Multiplier

(a) Register configuration: an n-bit register A (initially 0), a carry
flip-flop C, and the multiplier register Q are shifted right together
each cycle.  An n-bit adder adds the multiplicand M (gated through a
MUX by the Add/No-add control signal derived from q0) into A, under a
control sequencer that runs n cycles.

(b) Multiplication example (M = 1101, Q = 1011):

 C   A      Q
 0   0000   1011   Initial configuration
 0   1101   1011   Add     }
 0   0110   1101   Shift   }  First cycle
 1   0011   1101   Add     }
 0   1001   1110   Shift   }  Second cycle
 0   1001   1110   No add  }
 0   0100   1111   Shift   }  Third cycle
 1   0001   1111   Add     }
 0   1000   1111   Shift   }  Fourth cycle

Product = A:Q = 10001111
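The C, A, Q register behavior in the example can be simulated directly; a register-level sketch (names are ours):

```python
def sequential_multiply(m, q, n=4):
    """Register-level model of the sequential multiplier: carry flip-flop
    C, accumulator A, multiplier register Q.  Each cycle: add M into A if
    q0 = 1, then shift C, A, Q right one position together."""
    c, a = 0, 0
    for _ in range(n):
        if q & 1:                       # Add / No-add decision from q0
            a += m
            c, a = a >> n, a & ((1 << n) - 1)
        # shift right: C into A's MSB, A's LSB into Q's MSB
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (c << (n - 1))
        c = 0
    return (a << n) | q                 # 2n-bit product held in A:Q

assert sequential_multiply(0b1101, 0b1011) == 143   # matches the trace
```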
Signed Multiplication

Signed Multiplication
 Considering 2's-complement signed operands, what will happen to
(-13)×(+11) if we follow the same method as unsigned multiplication?

   1 0 0 1 1   (-13)
   0 1 0 1 1   (+11)

   1 1 1 1 1 1 0 0 1 1
   1 1 1 1 1 0 0 1 1
   0 0 0 0 0 0 0 0
   1 1 1 0 0 1 1
   0 0 0 0 0 0
   1 1 0 1 1 1 0 0 0 1   (-143)

(Each summand must be sign-extended to the full product width; the
sign-extension bits are shown highlighted in the textbook figure.)

Figure 6.8. Sign extension of negative multiplicand.
Signed Multiplication
 For a negative multiplier, a straightforward
solution is to form the 2’s-complement of both
the multiplier and the multiplicand and
proceed as in the case of a positive multiplier.
 This is possible because complementation of
both operands does not change the value or
the sign of the product.
 A technique that works equally well for both
negative and positive multipliers – Booth
algorithm.
Booth Algorithm
 Consider a multiplication in which the multiplier is positive, e.g.
0011110.  How many appropriately shifted versions of the multiplicand
are added in the standard procedure?

   0 1 0 1 1 0 1
   0 0 +1 +1 +1 +1 0
   0 0 0 0 0 0 0
   0 1 0 1 1 0 1
   0 1 0 1 1 0 1
   0 1 0 1 1 0 1
   0 1 0 1 1 0 1
   0 0 0 0 0 0 0
   0 0 0 0 0 0 0
   0 0 0 1 0 1 0 1 0 0 0 1 1 0
Booth Algorithm
 Since 0011110 = 0100000 – 0000010, what happens if we use the
expression on the right instead?

   0 1 0 1 1 0 1
   0 +1 0 0 0 -1 0
   0 0 0 0 0 0 0 0 0 0 0 0 0 0
   1 1 1 1 1 1 1 0 1 0 0 1 1      (2's complement of the multiplicand)
   0 0 0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0 0
   0 0 0 0 0 0 0 0 0 0
   0 0 0 1 0 1 1 0 1
   0 0 0 0 0 0 0 0
   0 0 0 1 0 1 0 1 0 0 0 1 1 0
Booth Algorithm
 In general, in the Booth scheme, -1 times the shifted multiplicand
is selected when moving from 0 to 1, and +1 times the shifted
multiplicand is selected when moving from 1 to 0, as the
multiplier is scanned from right to left.
 0 0 1 0 1 1 0 0 1 1 1 0 1 0 1 1 0 0

 0 +1 -1 +1 0 -1 0 +1 0 0 -1 +1 -1 +1 0 -1 0 0

Figure 6.10. Booth recoding of a multiplier.
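The right-to-left scanning rule (select −1 on a 0→1 transition, +1 on a 1→0 transition) amounts to digit_i = b_{i−1} − b_i with an implied 0 to the right of the LSB; a sketch (function name is ours):

```python
def booth_recode(bits):
    """Booth recoding: digit_i = b_{i-1} - b_i, with an implied 0 to the
    right of the LSB.  `bits` is MSB-first; the result is MSB-first too."""
    b = bits + [0]                      # append the implied b_{-1}
    return [b[i + 1] - b[i] for i in range(len(bits))]

# Recoding of 0011110 from the earlier slide: 0 +1 0 0 0 -1 0
assert booth_recode([0, 0, 1, 1, 1, 1, 0]) == [0, 1, 0, 0, 0, -1, 0]
```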
Booth Algorithm

   0 1 1 0 1   (+13)        0 1 1 0 1
 × 1 1 0 1 0   (-6)         0 -1 +1 -1 0
                            0 0 0 0 0 0 0 0 0 0
                            1 1 1 1 1 0 0 1 1
                            0 0 0 0 1 1 0 1
                            1 1 1 0 0 1 1
                            0 0 0 0 0 0
                            1 1 1 0 1 1 0 0 1 0   (-78)

Figure 6.11. Booth multiplication with a negative multiplier.
Booth Algorithm

 Multiplier          Version of multiplicand
 Bit i   Bit i-1     selected by bit i
 0       0            0 × M
 0       1           +1 × M
 1       0           -1 × M
 1       1            0 × M

Figure 6.12. Booth multiplier recoding table.
Booth Algorithm
 Best case – a long string of 1's (skipping over 1s)
 Worst case – 0's and 1's are alternating

 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1            Worst-case multiplier
 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1

 1 1 0 0 0 1 0 1 1 0 1 1 1 1 0 0            Ordinary multiplier
 0 -1 0 0 +1 -1 +1 0 -1 +1 0 0 0 -1 0 0

 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1            Good multiplier
 0 0 0 +1 0 0 0 0 -1 0 0 0 +1 0 0 -1
Fast Multiplication

Bit-Pair Recoding of
Multipliers
 Bit-pair recoding halves the maximum number of
summands (versions of the multiplicand).
 Sign extension              Implied 0 to right of LSB

      1   1 1 0 1 0   0
      0 0  -1 +1 -1 0        (Booth recoding)
       0    -1    -2         (bit-pair recoding)

(a) Example of bit-pair recoding derived from Booth recoding
Bit-Pair Recoding of
Multipliers

Multiplier bit­pair Multiplier bit on the right Multiplicand
i+1 selected at position i
i i−1

0 0 0 0 ×M
0 0 1 +1 ×M
0 1 0 +1 ×M
0 1 1 +2 ×M
1 0 0 −2 ×M
1 0 1 −1 ×M
1 1 0 −1 ×M
1 1 1 0 ×M

(b) Table of multiplicand selection decisions

37
Bit-Pair Recoding of
Multipliers
   0 1 1 0 1   (+13)
 × 1 1 0 1 0   (-6)

Booth recoding (0 -1 +1 -1 0):
   0 1 1 0 1
   0 -1 +1 -1 0
   0 0 0 0 0 0 0 0 0 0
   1 1 1 1 1 0 0 1 1
   0 0 0 0 1 1 0 1
   1 1 1 0 0 1 1
   0 0 0 0 0 0
   1 1 1 0 1 1 0 0 1 0   (-78)

Bit-pair recoding (0 -1 -2):
   0 1 1 0 1
   0 -1 -2
   1 1 1 1 1 0 0 1 1 0
   1 1 1 1 0 0 1 1
   0 0 0 0 0 0
   1 1 1 0 1 1 0 0 1 0   (-78)

Figure 6.15. Multiplication requiring only n/2 summands.
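Each recoded pair is worth −2·b_{i+1} + b_i + b_{i−1}, which is how table (b) is derived; a sketch (function name is ours):

```python
def bit_pair_recode(bits):
    """Bit-pair (radix-4 Booth) recoding: examine bits (i+1, i, i-1) for
    even i, selecting 0, +/-1, or +/-2 times the multiplicand per pair.
    `bits` is MSB-first with even length; result is MSB pair first."""
    b = bits + [0]                              # implied 0 right of the LSB
    digits = []
    for i in range(len(bits) - 2, -2, -2):      # pairs, LSB pair first
        hi, lo, right = b[i], b[i + 1], b[i + 2]
        digits.append(-2 * hi + lo + right)     # value of the recoded pair
    return digits[::-1]                         # reorder MSB pair first

# Multiplier 11010 sign-extended to 111010: pairs recode to 0, -1, -2
assert bit_pair_recode([1, 1, 1, 0, 1, 0]) == [0, -1, -2]
```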
Carry-Save Addition of
Summands

1 1 0 1 (13)  Multiplicand M

× 1 0 1 1 (11)  Multiplier Q

1 1 0 1

1 1 0 1

0 0 0 0

1 1 0 1

1 0 0 0 1 1 1 1 (143)  Product P

(a) Manual multiplication algorithm

Carry-Save Addition of
Summands
The array implementation of Figure 6.6: each row ANDs one multiplier bit
qi with the multiplicand bits m3..m0 and adds the terms into the
incoming partial product with full adders, producing PP1..PP4 and the
product bits p7..p0 (PP4 = p7, p6, ..., p0 = Product).  Each typical
cell takes a bit of the incoming partial product (PPi), mj, qi, and a
carry-in, and passes a carry-out to its left and a bit of the outgoing
partial product [PP(i+1)] downward.

(b) Array implementation
Carry-Save Addition of
Summands
 CSA speeds up the addition process.

(a) Ripple-carry array (Figure 6.6 structure): each row of full adders
adds the terms mj qi for one multiplier bit into the partial product;
within a row, every FA's carry-out feeds the FA to its left, so carries
must ripple across each row before the next row can finish.  Row inputs
are m3 q0..m0 q0 through m3 q3..m0 q3; outputs are p7..p0.
Carry-Save Addition of
Summands
(b) Carry-save array: each full adder passes its carry-out diagonally
down to the next row instead of to its left-hand neighbor, so all the
FAs in a row operate in parallel; only the final row, which completes
the product bits, is a ripple-carry adder.

Figure 6.16. Ripple-carry and carry-save arrays for the multiplication
operation M × Q = P for 4-bit operands.
Carry-Save Addition of
Summands
 The delay through the carry-save array is somewhat less than the delay
through the ripple-carry array, because the S and C vector outputs from
each row are produced in parallel in one full-adder delay.
 When adding many summands, we can:
  Group the summands in threes and perform carry-save addition on each
of these groups in parallel to generate a set of S and C vectors in one
full-adder delay
  Group all of the S and C vectors into threes, and perform carry-save
addition on them, generating a further set of S and C vectors in one
more full-adder delay
  Continue with this process until there are only two vectors remaining
  The final two vectors can be added in an RCA or CLA to produce the
desired product
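The reduction process above can be sketched with word-level operations — XOR gives the sum vector S and the majority function, shifted left one position, gives the carry vector C (names are ours):

```python
def carry_save(a, b, c):
    """One carry-save step: reduce three operands to a sum word S and a
    carry word C (shifted left one position) with no carry propagation."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry

# Reduce the six summands of the 45 x 63 example down to two vectors,
# grouping in threes, then finish with one ordinary addition (the RCA/CLA).
summands = [45 << i for i in range(6)]          # rows A..F of Figure 6.17
while len(summands) > 2:
    s, c = carry_save(*summands[:3])
    summands = summands[3:] + [s, c]
assert len(summands) == 2
assert sum(summands) == 45 * 63                 # final addition gives 2835
```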
Carry-Save Addition of
Summands
1 0 1 1 0 1 (45) M

X 1 1 1 1 1 1 (63) Q

1 0 1 1 0 1 A
1 0 1 1 0 1 B

1 0 1 1 0 1 C
1 0 1 1 0 1 D
1 0 1 1 0 1 E
1 0 1 1 0 1 F

1 0 1 1 0 0 0 1 0 0 1 1 (2,835) Product

Figure 6.17.  A multiplication example used to illustrate carry­save addition as shown in Figure 6.18.

   1 0 1 1 0 1              M
 × 1 1 1 1 1 1              Q

   1 0 1 1 0 1              A
   1 0 1 1 0 1              B
   1 0 1 1 0 1              C
   1 1 0 0 0 0 1 1          S1
   0 0 1 1 1 1 0 0          C1

   1 0 1 1 0 1              D
   1 0 1 1 0 1              E
   1 0 1 1 0 1              F
   1 1 0 0 0 0 1 1          S2
   0 0 1 1 1 1 0 0          C2

   1 1 0 0 0 0 1 1          S1
   0 0 1 1 1 1 0 0          C1
   1 1 0 0 0 0 1 1          S2
   1 1 0 1 0 1 0 0 0 1 1    S3
   0 0 0 0 1 0 1 1 0 0 0    C3

   0 0 1 1 1 1 0 0          C2
   0 1 0 1 1 1 0 1 0 0 1 1  S4
 + 0 1 0 1 0 1 0 0 0 0 0    C4
   1 0 1 1 0 0 0 1 0 0 1 1  Product

(operand alignment as in the textbook figure)

Figure 6.18. The multiplication example from Figure 6.17 performed using
carry-save addition.
Carry-Save Addition of
Summands
Summands F E D C B A enter two Level 1 CSAs operating in parallel,
producing C2, S2 and C1, S1.  A Level 2 CSA combines S1, C1, and S2 into
C3, S3.  A Level 3 CSA combines C2, C3, and S3 into C4, S4, and a final
addition (+) of S4 and C4 yields the Product.

Figure 6.19. Schematic representation of the carry-save addition
operations in Figure 6.18.
Carry-Save Addition of
Summands
 When the number of summands is large, the
time saved is proportionally much greater.
 Some omitted issues:
 Sign-extension
 Computation width of the final CLA/RCA
 Bit-pair recoding

Integer Division

Manual Division

      21                         10101
  13 )274                  1101 )100010010
      26                         1101
      14                          10000
      13                          1101
       1                           1110
                                   1101
                                      1

Figure 6.20. Longhand division examples.
Longhand Division Steps
 Position the divisor appropriately with respect to the dividend and
perform a subtraction.
 If the remainder is zero or positive, a quotient bit of 1 is
determined, the remainder is extended by another bit of the dividend,
the divisor is repositioned, and another subtraction is performed.
 If the remainder is negative, a quotient bit of 0 is determined, the
dividend is restored by adding back the divisor, and the divisor is
repositioned for another subtraction.
Circuit Arrangement

An (n+1)-bit register A (bits a_n..a_0) and an n-bit register Q
(bits q_{n-1}..q_0, initially holding the dividend) are shifted left
together.  An (n+1)-bit adder adds or subtracts the divisor M
(0 m_{n-1}..m_0) to/from A under Add/Subtract control, and the control
sequencer sets the quotient bit q_0 after each step.

Figure 6.21. Circuit arrangement for binary division.
Restoring Division
 Shift A and Q left one binary position
 Subtract M from A, and place the answer back in A
 If the sign of A is 1, set q0 to 0 and add M back to A (restore A);
otherwise, set q0 to 1
 Repeat these steps n times
Examples

      10
  11 )1000
      11
       10

                  A           Q
 Initially      0 0 0 0 0   1 0 0 0
 Shift          0 0 0 0 1   0 0 0 _
 Subtract     + 1 1 1 0 1                First cycle
                1 1 1 1 0
 Set q0 = 0, Restore (+ 0 0 0 1 1)
                0 0 0 0 1   0 0 0 0

 Shift          0 0 0 1 0   0 0 0 _
 Subtract     + 1 1 1 0 1                Second cycle
                1 1 1 1 1
 Set q0 = 0, Restore (+ 0 0 0 1 1)
                0 0 0 1 0   0 0 0 0

 Shift          0 0 1 0 0   0 0 0 _
 Subtract     + 1 1 1 0 1                Third cycle
 Set q0 = 1     0 0 0 0 1   0 0 0 1

 Shift          0 0 0 1 0   0 0 1 _
 Subtract     + 1 1 1 0 1                Fourth cycle
                1 1 1 1 1
 Set q0 = 0, Restore (+ 0 0 0 1 1)
                0 0 0 1 0   0 0 1 0

                Remainder   Quotient

Figure 6.22. A restoring-division example.
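The restoring procedure can be modeled at the register level; a sketch using Python integers for A and Q (names are ours):

```python
def restoring_divide(dividend, divisor, n=4):
    """Restoring division: shift A,Q left; subtract M; on a negative
    result (sign of A is 1) restore A and set q0 = 0, else set q0 = 1."""
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        # shift A and Q left one position; MSB of Q moves into A
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & ((1 << n) - 1)
        a -= m                          # trial subtraction
        if a < 0:
            a += m                      # restore
        else:
            q |= 1                      # set q0
    return q, a                         # quotient, remainder

assert restoring_divide(8, 3) == (2, 2)   # 1000 / 11 from Figure 6.22
```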
Nonrestoring Division
 Avoid the need for restoring A after an unsuccessful subtraction.
 Any idea?
 Step 1 (repeat n times):
  If the sign of A is 0, shift A and Q left one bit position and
subtract M from A; otherwise, shift A and Q left and add M to A.
  Now, if the sign of A is 0, set q0 to 1; otherwise, set q0 to 0.
 Step 2: If the sign of A is 1, add M to A (to restore the remainder)
Examples

                  A           Q
 Initially      0 0 0 0 0   1 0 0 0
 Shift          0 0 0 0 1   0 0 0 _
 Subtract     + 1 1 1 0 1                First cycle
 Set q0 = 0     1 1 1 1 0   0 0 0 0

 Shift          1 1 1 0 0   0 0 0 _
 Add          + 0 0 0 1 1                Second cycle
 Set q0 = 0     1 1 1 1 1   0 0 0 0

 Shift          1 1 1 1 0   0 0 0 _
 Add          + 0 0 0 1 1                Third cycle
 Set q0 = 1     0 0 0 0 1   0 0 0 1

 Shift          0 0 0 1 0   0 0 1 _
 Subtract     + 1 1 1 0 1                Fourth cycle
 Set q0 = 0     1 1 1 1 1   0 0 1 0
                            Quotient

 Restore remainder:
                1 1 1 1 1
 Add          + 0 0 0 1 1
                0 0 0 1 0
                Remainder

Figure 6.23. A nonrestoring-division example.
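The two-step rule above can be sketched the same way as the restoring version, with a single correction at the end (names are ours):

```python
def nonrestoring_divide(dividend, divisor, n=4):
    """Nonrestoring division: subtract when A is non-negative, add when
    A is negative; q0 = 1 iff the new A is non-negative.  Step 2 applies
    a single final correction if A ends up negative."""
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        msb = (q >> (n - 1)) & 1
        q = (q << 1) & ((1 << n) - 1)
        if a >= 0:
            a = ((a << 1) | msb) - m    # shift, then subtract M
        else:
            a = ((a << 1) | msb) + m    # shift, then add M
        if a >= 0:
            q |= 1                      # set q0
    if a < 0:
        a += m                          # Step 2: restore the remainder
    return q, a

assert nonrestoring_divide(8, 3) == (2, 2)   # matches Figure 6.23
```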
Floating-Point Numbers
and Operations

Floating-Point Numbers
 So far we have dealt with fixed-point numbers (what are they?), and
have considered them as integers.
 Floating-point numbers: the binary point is just to the right of the
sign bit.
   B = b0 . b-1 b-2 ... b-(n-1)
   F(B) = -b0 × 2^0 + b-1 × 2^-1 + b-2 × 2^-2 + ... + b-(n-1) × 2^-(n-1)
 Where the range of F is:
   -1 ≤ F ≤ 1 - 2^-(n-1)
 The position of the binary point is variable and is automatically
adjusted as computation proceeds.
Floating-Point Numbers
 What are needed to represent a floating-point
decimal number?
 Sign

 Mantissa (the significant digits)

 Exponent to an implied base (scale factor)

 “Normalized” – the decimal point is placed to
the right of the first (nonzero) significant digit.
IEEE Standard for Floating-
Point Numbers
 Think about this number (all digits are decimal):
±X1.X2X3X4X5X6X7 × 10^±Y1Y2
 It is possible to approximate this mantissa precision and scale-factor
range in a binary representation that occupies 32 bits: a 24-bit
mantissa (including 1 sign bit) and an 8-bit exponent.
 Instead of the signed exponent E, the value actually stored in the
exponent field is an unsigned integer E' = E + 127, the so-called
excess-127 format.
IEEE Standard

32 bits:  | S | E' | M |
 S: sign of number (0 signifies +, 1 signifies -)
 E': 8-bit exponent in excess-127 representation
 M: 23-bit mantissa fraction
Value represented = ±1.M × 2^(E'-127)
(a) Single precision

Example:  0 | 00101000 | 0010100...0
 E' = (00101000)2 = 40 (decimal), E = 40 - 127 = -87
 Value represented = +1.0010100...0 × 2^-87
(b) Example of a single-precision number

64 bits:  | S | E' | M |
 S: sign
 E': 11-bit excess-1023 exponent
 M: 52-bit mantissa fraction
Value represented = ±1.M × 2^(E'-1023)
(c) Double precision

Figure 6.24. IEEE standard floating-point formats.
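The S, E', M field layout of (a) can be inspected with Python's `struct` module; a sketch for normal (non-special) numbers (function name is ours):

```python
import struct

def decode_single(x):
    """Pull apart the S, E', M fields of an IEEE single-precision number
    and rebuild its value as (-1)^S x 1.M x 2^(E'-127)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31                          # sign bit
    e_prime = (bits >> 23) & 0xFF           # excess-127 exponent
    m = bits & 0x7FFFFF                     # 23-bit mantissa fraction
    value = (-1) ** s * (1 + m / 2**23) * 2.0 ** (e_prime - 127)
    return s, e_prime, value

# -6.5 = -1.625 x 2^2, so E' = 2 + 127 = 129
s, e_prime, value = decode_single(-6.5)
assert s == 1 and e_prime == 129 and value == -6.5
```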
IEEE Standard
 For excess-127 format, 0 ≤ E' ≤ 255. However, 0 and 255 are used to
represent special values, so actually 1 ≤ E' ≤ 254. That means
-126 ≤ E ≤ 127.
 Single precision uses 32 bits. The scale-factor range is from 2^-126
to 2^+127.
 Double precision uses 64 bits. The scale-factor range is from 2^-1022
to 2^+1023.
Two Aspects
 If a number is not normalized, it can always be put in normalized form
by shifting the fraction and adjusting the exponent.

excess-127 exponent
 0 | 10001000 | 0010110...
(There is no implicit 1 to the left of the binary point.)
 (10001000)2 = 136 (decimal), 136 - 127 = 9
 Value represented = +0.0010110... × 2^9
(a) Unnormalized value

 0 | 10000101 | 0110...
 6 + 127 = 133 (decimal) = (10000101)2
 Value represented = +1.0110... × 2^6
(b) Normalized version

Figure 6.25. Floating-point normalization in IEEE single-precision format.
Two Aspects
 As computations proceed, a number that
does not fall in the representable range of
normal numbers might be generated.
 It requires an exponent less than -126
(underflow) or greater than +127 (overflow).
Both are exceptions that need to be
considered.

Special Values
 The end values 0 and 255 of E' are used to represent special values.
 When E' = 0 and M = 0, the exact value 0 is represented. (±0)
 When E' = 255 and M = 0, the value ∞ is represented. (±∞)
 When E' = 0 and M ≠ 0, denormal numbers are represented. The value is
±0.M × 2^-126.
 When E' = 255 and M ≠ 0, the result is Not a Number (NaN).
Exceptions
 A processor must set exception flags if any of the following occur in
performing operations: underflow, overflow, divide by zero, inexact,
invalid.
 When an exception occurs, the result is set to a special value.
Arithmetic Operations on
Floating-Point Numbers
 Add/Subtract rule
 Choose the number with the smaller exponent and shift its mantissa right a
number of steps equal to the difference in exponents.
 Set the exponent of the result equal to the larger exponent.
 Perform addition/subtraction on the mantissas and determine the sign of the
result.
 Normalize the resulting value, if necessary.
 Multiply rule
 Add the exponents and subtract 127.
 Multiply the mantissas and determine the sign of the result.
 Normalize the resulting value, if necessary.
 Divide rule
 Subtract the exponents and add 127.
 Divide the mantissas and determine the sign of the result.
 Normalize the resulting value, if necessary.

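The Add/Subtract rule can be demonstrated on a toy model where mantissas are integers scaled so that bit 23 is the implicit leading 1 and exponents are unbiased; a sketch of the addition path only (names and scaling are ours):

```python
def fp_add(e_a, m_a, e_b, m_b):
    """Add/Subtract rule, addition case, on toy operands: mantissas are
    integers representing 1.F x 2^23, exponents are unbiased.  Returns a
    normalized (exponent, mantissa) pair."""
    if e_a < e_b:                       # work with the larger exponent first
        e_a, m_a, e_b, m_b = e_b, m_b, e_a, m_a
    m_b >>= (e_a - e_b)                 # step 1: shift the smaller mantissa
    e, m = e_a, m_a + m_b               # steps 2-3: larger exponent, add
    while m >= (1 << 24):               # step 4: renormalize if m >= 2.0
        m >>= 1
        e += 1
    return e, m

# 1.5 x 2^1 + 1.0 x 2^0 = 4.0 = 1.0 x 2^2
assert fp_add(1, 3 << 22, 0, 1 << 23) == (2, 1 << 23)
```

(Right-shifting the smaller mantissa discards its low bits, which is exactly where the guard bits discussed next come in.)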
Guard Bits and Truncation
 During the intermediate steps, it is important
to retain extra bits, often called guard bits, to
yield the maximum accuracy in the final
results.
 Removing the guard bits in generating a final
result requires truncation of the extended
mantissa – how?

Guard Bits and Truncation
 Chopping – biased, error range 0 to 1 at the LSB.  All fractions from
0.b-1 b-2 b-3 000 to 0.b-1 b-2 b-3 111 are truncated to 0.b-1 b-2 b-3.
 Von Neumann rounding (if any of the bits to be removed is 1, the LSB
of the retained bits is set to 1) – unbiased, error range -1 to +1 at
the LSB.  All 6-bit fractions with b-4 b-5 b-6 not all 0 are truncated
to 0.b-1 b-2 1.
 Why is unbiased rounding better when many operands are involved?
 Rounding (a 1 is added to the LSB position of the bits to be retained
if there is a 1 in the MSB position of the bits being removed) –
unbiased, error range -1/2 to +1/2 at the LSB.
  Round to the nearest number, or the nearest even number in case of a
tie (0.b-1 b-2 0100 → 0.b-1 b-2 0; 0.b-1 b-2 1100 → 0.b-1 b-2 1 + 0.001)
  Best accuracy
  Most difficult to implement
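The three schemes can be compared on a fraction given as a bit string; a simplified sketch (names are ours; ties round up here rather than to even, and overflow of the retained field is ignored):

```python
def truncate(frac_bits, keep):
    """Apply the three truncation schemes to a binary fraction string.
    Returns the retained bits under chopping, Von Neumann rounding, and
    round-to-nearest (ties rounded up, for simplicity)."""
    kept, removed = frac_bits[:keep], frac_bits[keep:]
    chop = kept                                     # just drop the low bits
    von_neumann = kept[:-1] + "1" if "1" in removed else kept
    rounded = kept
    if removed and removed[0] == "1":               # 1 in the MSB removed
        # add 1 at the LSB of the retained bits
        rounded = format(int(kept, 2) + 1, "0%db" % keep)
    return chop, von_neumann, rounded

# 0.101110 kept to 3 bits: chop and Von Neumann give 101, rounding gives 110
assert truncate("101110", 3) == ("101", "101", "110")
```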
Implementing Floating-Point
Operations
 Hardware/software

 In most general-purpose processors, floating-
point operations are available at the machine-
instruction level, implemented in hardware.
 In high-performance processors, a significant
portion of the chip area is assigned to
floating-point operations.
 Addition/subtraction circuitry

The 32-bit operands are A: (SA, EA', MA) and B: (SB, EB', MB).  An 8-bit
subtractor forms n = EA' - EB'; its sign drives a SWAP network that sends
the mantissa of the number with the smaller E' to a SHIFTER, which shifts
it n bit positions to the right, while the mantissa of the number with
the larger E' goes directly to the mantissa adder/subtractor.  A
combinational CONTROL network uses SA, SB, the sign of n, and the
Add/Sub input to decide whether the mantissas are added or subtracted
and to set the sign of the result.  A MUX selects the larger of EA',
EB' as the tentative result exponent E'.  A leading-zeros detector
examines the magnitude M of the mantissa result and produces a shift
count X; the normalize-and-round stage shifts the mantissa accordingly,
and an 8-bit subtractor forms E' - X to give the result exponent.  The
32-bit result is R = A + B: (SR, ER', MR).

Figure 6.26. Floating-point addition-subtraction unit.
Requirements for Homework 6
 5.6 (a): 3 credits
 5.6 (b):
  Draw a figure to show how program words are mapped on the cache
blocks: 4
  Sequence of reads from the main memory blocks into cache blocks: 4
  Total time for reading the blocks from the main memory into the
cache: 4
  Executing the program out of the cache:
   Outer loop excluding inner loop: 4
   Inner loop: 4
   End section of program: 4
   Total execution time: 3
 Due time: class on Oct. 18
Hints for Homework 6
 Assume that consecutive addresses refer to consecutive words. The
cycle time is for one word.
 Assume this problem does not use load-through: when a read miss
occurs, the block of words that contains the requested word is copied
from the main memory into the cache; after the entire block is loaded,
the particular word requested is forwarded to the processor.
 Total time for reading the blocks from the main memory into the
cache: the number of reads × 128 × 10
 Executing the program out of the cache:
  MEM word size for instructions × loopNum × 1
  Outer loop excluding inner loop: (outer loop word size - inner loop
word size) × 10 × 1
  Inner loop: inner loop word size × 20 × 10 × 1
 MEM word size from MEM 23 to 1200 is 1200 - 22
 MEM word size from MEM 1200 to 1500 (end) is 1500 - 1200
Homework 7
 Addition and Subtraction of Signed Numbers 5-9, Oct. 20 (Barret,
Felix, Washington)
 Carry-lookahead Addition 11-18, Oct. 20 (Kyle White, Jose Jo)
 Unsigned Multiplication 20-25, Oct. 20 (Tannet Garrett, Garth
Gergerich, Gabriel Graderson)
 Signed Multiplication 26-28 (Shen)
 Booth Alg. 29-34, Oct. 25 (Ashraf Hajiyer)
 Fast Multiplication
  Bit-Pair Recoding of Multipliers 36-38, Oct. 25 (Alex, Suzanne, Scott)
  Carry-Save Addition of Summands 39-47, Oct. 25 (Jason, Jordan, Chris)
 Integer Division
  Restoring Division 49-52, Oct. 27 (Kyle, Brandan, Alex Shipman)
  Nonrestoring Division 53-55, Oct. 27 (Zach, Eric, Chase)

Each presentation is limited to 15 minutes including 2 minutes for questions
Exercise for Oct. 23
 Read "Booth's algorithm and Bit-Pair Recoding" in the textbook
(6.4.1 & 6.5.1)
 Calculate the 2's-complement multiplication (+4)×(-7) using Booth's
algorithm and Bit-Pair Recoding. (Booth's algorithm and Bit-Pair
Recoding will be introduced on Oct. 25)
 You don't need to hand in this exercise