Introduction
• Computers are built using tiny electronic switches.
– Typically made up of MOS transistors.
– The state of each switch is typically expressed in binary (ON/OFF).
• To design arithmetic circuits for use in computers, we need to work with binary numbers.
– How to represent numbers in binary?
– How to carry out various arithmetic operations in binary?
– How to implement them efficiently in hardware?
02/09/17
Representation of Integers
• Unsigned integer number representation
– For n-bit binary, range is $0$ to $2^{n} - 1$.
• Signed integer number representation
– For n-bit 1’s complement, range is $-(2^{n-1} - 1)$ to $+(2^{n-1} - 1)$.
– For n-bit 2’s complement, range is $-2^{n-1}$ to $+(2^{n-1} - 1)$.
– For both the representations, subtraction can be done using addition.
– 2’s complement representation is most widely used.
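As a quick check of the ranges above, the following minimal sketch evaluates them for a given word length n. The function names are illustrative assumptions, not part of the slides.

```python
def unsigned_range(n):
    """Range of an n-bit unsigned integer: 0 to 2^n - 1."""
    return 0, 2**n - 1

def ones_complement_range(n):
    """Range of n-bit 1's complement: -(2^(n-1) - 1) to +(2^(n-1) - 1)."""
    return -(2**(n - 1) - 1), 2**(n - 1) - 1

def twos_complement_range(n):
    """Range of n-bit 2's complement: -2^(n-1) to +(2^(n-1) - 1)."""
    return -(2**(n - 1)), 2**(n - 1) - 1

if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, unsigned_range(n), ones_complement_range(n), twos_complement_range(n))
```

For n = 4 this gives 0 to 15 (unsigned), -7 to +7 (1’s complement) and -8 to +7 (2’s complement).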
Example 1 :: 6 – 2
1’s complement of 2 = 1101
Example 2 :: 3 – 5
1’s complement of 5 = 1010
• How to compute A – B ?
– Compute the 2’s complement of B (say, B2).
– Compute R = A + B2.
– If the carry obtained after the addition is 1:
• Ignore the carry.
• The result is a positive number.
Else
• The result is negative, and is in 2’s complement form in R.
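The procedure above can be sketched as follows. This is a minimal illustration, assuming A and B are unsigned n-bit magnitudes; the function name `subtract` and the default width of 4 bits are assumptions made for the example. The 2’s complement is formed as "invert and add 1" inside the same addition, so the carry out is also produced correctly when B = 0.

```python
def subtract(a, b, n=4):
    """Compute a - b for unsigned n-bit a, b using 2's complement addition."""
    mask = (1 << n) - 1
    # A + (1's complement of B) + 1, i.e. A + B2 with the carry kept.
    total = (a & mask) + ((~b) & mask) + 1
    carry = (total >> n) & 1
    r = total & mask
    if carry:
        # Carry is 1: ignore it, the result is positive.
        return r
    # No carry: R holds a negative result in 2's complement form.
    magnitude = ((~r) + 1) & mask
    return -magnitude

if __name__ == "__main__":
    print(subtract(6, 2))   # carry out is 1, positive result
    print(subtract(3, 5))   # no carry, negative result
```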
Example 1 :: 6 – 2
2’s complement of 2 = 1101 + 1 = 1110
Example 2 :: 3 – 5
Assume 4-bit representations.
   3 :: 0011
  -5 :: 1011
        ----
        1110 = -2
Since there is no carry, the result is negative.
1110 is the 2’s complement of 0010, that is, it represents –2.
Full Adder
Inputs: A, B, Cin. Outputs: S, Cout.

  A  B  Cin | S  Cout
  0  0  0   | 0  0
  0  0  1   | 1  0
  0  1  0   | 1  0
  0  1  1   | 0  1
  1  0  0   | 1  0
  1  0  1   | 0  1
  1  1  0   | 0  1
  1  1  1   | 1  1

S = A’.B’.Cin + A’.B.Cin’ + A.B’.Cin’ + A.B.Cin
  = A ⊕ B ⊕ Cin
Cout = B.Cin + A.Cin + A.B + A.B.Cin
     = A.B + B.Cin + A.Cin
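The two simplified equations can be checked against the truth table with a short sketch. The function name `full_adder` is an assumption for illustration.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (S, Cout) for input bits a, b, cin."""
    s = a ^ b ^ cin                          # S = A xor B xor Cin
    cout = (a & b) | (b & cin) | (a & cin)   # Cout = A.B + B.Cin + A.Cin
    return s, cout

if __name__ == "__main__":
    # Print the truth table in the same row order as the slide.
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                print(a, b, cin, *full_adder(a, b, cin))
```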
[Figure: full adder built from two half adders and an OR gate. Inputs A, B and Cin; outputs S and Cout.]
[Figure: n-bit ripple-carry adder. Full adders FA0, FA1, FA2, …, FAn-1 take the input pairs (A0, B0), (A1, B1), (A2, B2), …, (An-1, Bn-1); the carries C0, C1, C2, C3, …, Cn-1 link successive stages, producing S0, S1, S2, …, Sn-1 and Cn.]
Two numbers: An-1…A2A1A0 and Bn-1…B2B1B0
Input carry: C0
Sum: Sn-1…S2S1S0
Output carry: Cn
Delay for C1 = 2δ
Delay for C2 = 4δ
Delay for Cn-1 = 2(n-1)δ
Delay for Cn = 2nδ
Delay for S0 = 3δ
Delay for S1 = 2δ + 3δ = 5δ
Delay for S2 = 4δ + 3δ = 7δ
Delay for Sn-1 = 2(n-1)δ + 3δ = (2n+1)δ
Delay is proportional to n
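A minimal sketch of the carry chain and of the delay figures listed above. Bit lists are least significant bit first, and the helper names (`ripple_carry_add`, `carry_delay`, `sum_delay`) are assumptions for illustration; delays are expressed in units of δ.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (S, Cout)."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)

def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists (LSB first); returns (sum_bits, Cn)."""
    carry = c0
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # each stage waits for the previous carry
        sums.append(s)
    return sums, carry

def carry_delay(i):
    """Delay of carry Ci in units of delta: 2i."""
    return 2 * i

def sum_delay(i):
    """Delay of sum bit Si in units of delta: 2i + 3."""
    return 2 * i + 3

if __name__ == "__main__":
    # 6 + 2 with 4-bit operands, LSB first.
    print(ripple_carry_add([0, 1, 1, 0], [0, 1, 0, 0]))
    n = 16
    print(carry_delay(n), sum_delay(n - 1))   # 2n and 2n + 1
```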
[Figure: chain of full adders FA0, FA1, FA2, …, FAn-1 with input carry C0 = 1 and internal carries C1, C2, C3, …, Cn-1, producing S0, S1, S2, …, Sn-1 and Cn.]
A Parallel Adder/Subtractor
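[Figure: n-bit parallel adder/subtractor with inputs An-1 … A1 A0 and Bn-1 … B1 B0. Each B input passes through an XOR gate controlled by the ADD’ / SUB line; outputs are Sn-1 … S2 S1 S0.]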
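A minimal sketch of how one adder can serve both operations. The XOR gating of the B inputs follows the figure; feeding the same control line into the input carry C0 is an assumption of this sketch (it supplies the "+1" of the 2’s complement). The function name `add_sub` is illustrative.

```python
def add_sub(a, b, sub, n=4):
    """n-bit adder/subtractor: sub = 0 adds, sub = 1 subtracts.

    Returns (n-bit result, carry out).
    """
    mask = (1 << n) - 1
    # XOR every B bit with the control line: B unchanged for ADD, inverted for SUB.
    b_gated = (b & mask) ^ (mask if sub else 0)
    # Assumption: the control line also drives the input carry C0.
    total = (a & mask) + b_gated + (1 if sub else 0)
    return total & mask, (total >> n) & 1

if __name__ == "__main__":
    print(add_sub(6, 2, 0))   # 6 + 2
    print(add_sub(6, 2, 1))   # 6 - 2
    print(add_sub(3, 5, 1))   # 3 - 5, result in 2's complement form
```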
END OF LECTURE 33
PROF. INDRANIL SENGUPTA
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING, IIT KHARAGPUR
DR. KAMALIKA DATTA
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING, NIT MEGHALAYA
$$C_{i+1} = G_i + \sum_{k=0}^{i-1} G_k \prod_{j=k+1}^{i} P_j + C_0 \prod_{j=0}^{i} P_j$$
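The carry expression above can be evaluated directly, one carry at a time, without waiting for the previous carry. The sketch below assumes Gi = Ai.Bi and Pi = Ai ⊕ Bi (consistent with Si = Pi ⊕ Ci); bit lists are least significant bit first and the function name `cla_carries` is illustrative.

```python
def cla_carries(a_bits, b_bits, c0=0):
    """Return ([C0..Cn], [S0..Sn-1]) computed from generate/propagate signals."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate:  Gi = Ai.Bi
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate: Pi = Ai xor Bi
    carries = [c0]
    for i in range(len(g)):
        c = g[i]
        for k in range(i):                        # k = 0 .. i-1
            term = g[k]
            for j in range(k + 1, i + 1):         # product of Pj, j = k+1 .. i
                term &= p[j]
            c |= term
        term = c0
        for j in range(i + 1):                    # C0 times product of Pj, j = 0 .. i
            term &= p[j]
        c |= term
        carries.append(c)                         # this is C(i+1)
    sums = [p[i] ^ carries[i] for i in range(len(p))]
    return carries, sums

if __name__ == "__main__":
    # 6 + 7 with 4-bit operands, LSB first.
    print(cla_carries([0, 1, 1, 0], [1, 1, 1, 0]))
```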
S2 = P2 ⊕ C2
S3 = P3 ⊕ C3
[Figure: carry look-ahead circuit generating the carries C1, C2 and C3 from the Gi and Pi signals and C0.]
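[Figure: 4-bit carry look-ahead adder. A "Gi and Pi Generator" block (labelled 3δ) takes B3 A3 … B0 A0 and produces G3 P3 … G0 P0; the look-ahead logic uses C0 to produce C1 … C4, and the outputs are S3 S2 S1 S0.]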
Problem: Carry propagation between modules still slows down the adder.
• Solution:
– Use a second level of carry look-ahead mechanism to generate the input carries to the CLA blocks in parallel.
– The second level of CLA generates C4, C8, C12 and C16 in parallel with two gate delays (2δ).
– For larger values of n, more CLA levels can be added.
• Delay calculation of a 16-bit adder:
a) For original single-level CLA: 14δ
b) For modified two-level CLA: 10δ
  n     TCLA   TRCA
  4     8δ     9δ
  16    10δ    33δ
  32    12δ    65δ
  64    12δ    129δ
  128   14δ    257δ
  256   14δ    513δ

$T_{CLA} = (6 + 2 \lceil \log_4 n \rceil)\,\delta$
$T_{RCA} = (2n + 1)\,\delta$
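The tabulated delays can be regenerated from the two formulas. The sketch below assumes the logarithm is rounded up to the next integer, which is what reproduces the entries for n = 32 and n = 128; the function names are illustrative and delays are in units of δ.

```python
def ceil_log4(n):
    """Smallest integer L with 4**L >= n (integer arithmetic, no rounding issues)."""
    levels, span = 0, 1
    while span < n:
        span *= 4
        levels += 1
    return levels

def t_cla(n):
    """Carry look-ahead adder delay in units of delta."""
    return 6 + 2 * ceil_log4(n)

def t_rca(n):
    """Ripple-carry adder delay in units of delta."""
    return 2 * n + 1

if __name__ == "__main__":
    for n in (4, 16, 32, 64, 128, 256):
        print(n, t_cla(n), t_rca(n))
```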
Variable-sized adder
• A 16-bit carry select adder with variable block sizes of 2-2-3-4-5 is shown.
• Total delay is 2 full adder delays, plus 4 MUX delays.
A set of full adders generates carry and sum bits in parallel. The sum and carry vectors are added later (with proper shifting).
[Figure: carry save adder built from n independent full adders, producing carry bits Cn-1 … C2 C1 C0 and sum bits Sn-1 … S2 S1 S0.]
Adding m Numbers: Some Examples
[Figure: arrangements of carry save adder (CSA) blocks for m = 3, m = 4 and m = 6.]
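A minimal sketch of carry-save addition on non-negative integers. Each CSA step reduces three numbers to a sum vector and a carry vector (shifted left by one) without propagating carries; only the final two-operand addition needs a carry-propagate adder, which Python's `+` stands in for here. The function names and the order in which operands are grouped are assumptions of this sketch, not the exact trees in the figure.

```python
def carry_save_add(x, y, z):
    """Reduce three numbers to (sum_vector, carry_vector) with x + y + z preserved."""
    s = x ^ y ^ z                              # per-bit sum, no carry propagation
    c = ((x & y) | (y & z) | (x & z)) << 1     # per-bit carry, shifted one place left
    return s, c

def add_many(numbers):
    """Add m numbers using repeated CSA steps and one final carry-propagate add."""
    nums = list(numbers)
    while len(nums) > 2:
        s, c = carry_save_add(nums[0], nums[1], nums[2])
        nums = nums[3:] + [s, c]               # three operands become two
    return sum(nums)                           # final addition of the last two vectors

if __name__ == "__main__":
    print(carry_save_add(5, 6, 7))
    print(add_many([3, 5, 7, 9, 11, 13]))
```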
END OF LECTURE 34
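[Figure: ALU with inputs A and B and result F = Fn-1 … F1 F0. The carry out gives the C flag, and a NOR gate over the result bits gives a second flag.]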
A General Case

                      A3    A2    A1    A0
                      B3    B2    B1    B0
                    ----------------------
                      A3B0  A2B0  A1B0  A0B0
                A3B1  A2B1  A1B1  A0B1
          A3B2  A2B2  A1B2  A0B2
    A3B3  A2B3  A1B3  A0B3
    -------------------------------------

• Each AiBj is called a partial product.
• Generating the partial products is easy.
– Requires just an AND gate for each partial product.
• Adding all the n-bit partial products in hardware is more difficult.
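The general case above can be sketched as follows: each partial product bit is a single AND, and row j is shifted left by j positions before the rows are added. Operands are assumed to be unsigned n-bit values, and the function names are illustrative.

```python
def partial_products(a, b, n=4):
    """Return n rows; row j holds the bits Ai AND Bj for i = 0..n-1 (LSB first)."""
    rows = []
    for j in range(n):
        bj = (b >> j) & 1
        rows.append([((a >> i) & 1) & bj for i in range(n)])
    return rows

def multiply(a, b, n=4):
    """Add the partial product rows, with row j weighted by 2^j."""
    total = 0
    for j, row in enumerate(partial_products(a, b, n)):
        value = sum(bit << i for i, bit in enumerate(row))
        total += value << j
    return total

if __name__ == "__main__":
    for row in partial_products(13, 11):
        print(row)
    print(multiply(13, 11))
```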
… of hardware.
• $n^2$ multiplication cells for an n x n multiplier.
• Advantage is that it is very fast.
[Figure: 4 x 4 array of multiplication cells with multiplier bits q0, q1, q2, q3, partial-product rows PP1, PP2, PP3 and product bits p0 … p7.]
Product: p7 p6 p5 p4 p3 p2 p1 p0
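[Flowchart: shift-and-add multiplication]
START
a) A = 0; C = 0; COUNT = n; M = multiplicand; Q = multiplier.
b) Test Q0: if Q0 = 1, A = A + M; if Q0 = 0, A = A + 0.
c) Shift right C, A, Q; COUNT = COUNT – 1.
d) If COUNT = 0, STOP; otherwise repeat from step b).

M: n-bit multiplicand
Q: n-bit multiplier
A: n-bit temporary register
C: 1-bit carry out from adder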
Example 1: (10) x (13)
Assume 5-bit numbers.
M: (0 1 0 1 0)2
Q: (0 1 1 0 1)2
Product = 130 = (0 0 1 0 0 0 0 0 1 0)2

  C   A           Q
  0   0 0 0 0 0   0 1 1 0 1   Initialization
  0   0 1 0 1 0   0 1 1 0 1   A = A + M    Step 1
  0   0 0 1 0 1   0 0 1 1 0   Shift
  0   0 0 1 0 1   0 0 1 1 0   A = A + 0    Step 2
  0   0 0 0 1 0   1 0 0 1 1   Shift
  0   0 1 1 0 0   1 0 0 1 1   A = A + M    Step 3
  0   0 0 1 1 0   0 1 0 0 1   Shift
  0   1 0 0 0 0   0 1 0 0 1   A = A + M    Step 4
  0   0 1 0 0 0   0 0 1 0 0   Shift
  0   0 1 0 0 0   0 0 1 0 0   A = A + 0    Step 5
  0   0 0 1 0 0   0 0 0 1 0   Shift
Example 2: (29) x (21)
Assume 5-bit numbers.
M: (1 1 1 0 1)2
Q: (1 0 1 0 1)2
Product = 609 = (1 0 0 1 1 0 0 0 0 1)2

  C   A           Q
  0   0 0 0 0 0   1 0 1 0 1   Initialization
  0   1 1 1 0 1   1 0 1 0 1   A = A + M    Step 1
  0   0 1 1 1 0   1 1 0 1 0   Shift
  0   0 1 1 1 0   1 1 0 1 0   A = A + 0    Step 2
  0   0 0 1 1 1   0 1 1 0 1   Shift
  1   0 0 1 0 0   0 1 1 0 1   A = A + M    Step 3
  0   1 0 0 1 0   0 0 1 1 0   Shift
  0   1 0 0 1 0   0 0 1 1 0   A = A + 0    Step 4
  0   0 1 0 0 1   0 0 0 1 1   Shift
  1   0 0 1 1 0   0 0 0 1 1   A = A + M    Step 5
  0   1 0 0 1 1   0 0 0 0 1   Shift
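The register-level behaviour traced in the two examples can be sketched as below, assuming unsigned n-bit operands and registers named C, A, Q and M as on the slides. The function name `shift_add_multiply` is illustrative.

```python
def shift_add_multiply(m, q, n):
    """Unsigned shift-and-add multiplication with registers C (1 bit), A and Q (n bits)."""
    mask = (1 << n) - 1
    a, c = 0, 0
    for _ in range(n):
        if q & 1:                        # Q0 = 1: A = A + M, carry out goes to C
            total = a + m
            c = total >> n
            a = total & mask
        # Q0 = 0: A = A + 0, so A and C are unchanged (C is 0 here).
        # Shift right the combined register C, A, Q by one position.
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (c << (n - 1))
        c = 0
    return (a << n) | q                  # 2n-bit product held in A, Q

if __name__ == "__main__":
    print(shift_add_multiply(10, 13, 5))
    print(shift_add_multiply(29, 21, 5))
```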
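[Figure: data path for shift-and-add multiplication. n-bit registers A, Q and M with the 1-bit carry register C; an ADDER with carry out; a MUX on the adder input; and a Control Unit that examines Q0.]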
END OF LECTURE 35
Signed Multiplication
• We can extend the basic shift-and-add multiplication method to handle signed numbers.
• One important difference:
– Required to sign-extend all the partial products before they are added.
– Recall that for 2’s complement representation, sign extension can be done by replicating the sign bit any number of times.
0101 = 0000 0101 = 0000 0000 0000 0101 = 0000 0000 0000 0000 0000 0000 0000 0101
1011 = 1111 1011 = 1111 1111 1111 1011 = 1111 1111 1111 1111 1111 1111 1111 1011
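Sign extension as shown in the two examples above can be sketched as follows. The function name `sign_extend` and its parameter names are assumptions for illustration; the value is treated as a raw bit pattern.

```python
def sign_extend(value, from_bits, to_bits):
    """Extend a from_bits-wide 2's complement bit pattern to to_bits by replicating the sign bit."""
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        # Set every bit from position from_bits up to to_bits - 1.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value

if __name__ == "__main__":
    print(format(sign_extend(0b0101, 4, 16), "016b"))
    print(format(sign_extend(0b1011, 4, 16), "016b"))
```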
[Flowchart: Booth’s algorithm]
START
a) A = 0; Q-1 = 0; COUNT = n; M = multiplicand; Q = multiplier.
b) Examine Q0 Q-1:
• 01: A = A + M
• 10: A = A – M
• 00 or 11: no addition or subtraction
c) Arithmetic Shift Right (A, Q, Q-1); COUNT = COUNT – 1.
d) If COUNT = 0, STOP; otherwise repeat from step b).

M: n-bit multiplicand
Q: n-bit multiplier
A: n-bit temporary register
Q-1: 1-bit flip-flop

Skips over consecutive 0’s and 1’s of the multiplier Q.
Example 1: (-10) x (13)
Assume 5-bit numbers.
M: (1 0 1 1 0)2
-M: (0 1 0 1 0)2
Q: (0 1 1 0 1)2
Product = -130 = (1 1 0 1 1 1 1 1 1 0)2

  A           Q           Q-1
  0 0 0 0 0   0 1 1 0 1   0   Initialization
  0 1 0 1 0   0 1 1 0 1   0   A = A - M    Step 1
  0 0 1 0 1   0 0 1 1 0   1   Shift
  1 1 0 1 1   0 0 1 1 0   1   A = A + M    Step 2
  1 1 1 0 1   1 0 0 1 1   0   Shift
  0 0 1 1 1   1 0 0 1 1   0   A = A - M    Step 3
  0 0 0 1 1   1 1 0 0 1   1   Shift
  0 0 0 0 1   1 1 1 0 0   1   Shift        Step 4
  1 0 1 1 1   1 1 1 0 0   1   A = A + M    Step 5
  1 1 0 1 1   1 1 1 1 0   0   Shift
Example 2: (-31) x (28)
Assume 6-bit numbers.
M: (1 0 0 0 0 1)2
-M: (0 1 1 1 1 1)2
Q: (0 1 1 1 0 0)2

  A             Q             Q-1
  0 0 0 0 0 0   0 1 1 1 0 0   0   Initialization
  0 0 0 0 0 0   0 0 1 1 1 0   0   Shift        Step 1
  0 0 0 0 0 0   0 0 0 1 1 1   0   Shift        Step 2
  0 1 1 1 1 1   0 0 0 1 1 1   0   A = A - M    Step 3
  0 0 1 1 1 1   1 0 0 0 1 1   1   Shift
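Booth’s algorithm as traced in the examples can be sketched as below. Operands are n-bit 2’s complement values passed as ordinary Python integers; the function name `booth_multiply` is illustrative. One assumption to note: with an n-bit A register the sketch does not handle a multiplicand equal to the most negative value, $-2^{n-1}$, because –M is then not representable in n bits.

```python
def booth_multiply(m, q, n):
    """Booth's algorithm on n-bit 2's complement operands; returns the signed product."""
    mask = (1 << n) - 1
    m &= mask
    q &= mask
    a, q_1 = 0, 0
    for _ in range(n):
        pair = ((q & 1) << 1) | q_1           # the two bits Q0 Q-1
        if pair == 0b01:
            a = (a + m) & mask                # A = A + M
        elif pair == 0b10:
            a = (a - m) & mask                # A = A - M
        # Arithmetic shift right of (A, Q, Q-1): the sign bit of A is replicated.
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))
    product = (a << n) | q                    # 2n-bit pattern in A, Q
    if product & (1 << (2 * n - 1)):          # interpret as a signed 2n-bit value
        product -= 1 << (2 * n)
    return product

if __name__ == "__main__":
    print(booth_multiply(-10, 13, 5))
    print(booth_multiply(-31, 28, 6))
```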
Data Path for Booth’s Algorithm
[Figure: n-bit registers A, Q and M with the 1-bit flip-flop Q-1; an Add / Subtract unit with ADD / SUBTRACT control; arithmetic shift right through An-1, A, Q, Q0 and Q-1; and a Control Unit.]
Original: Multiplier --   1   0   1   0   1   0
Booth:    Multiplier --  -1  +1  -1  +1  -1   0
Recoded:  Multiplier --   0  -1   0  -1   0  -2

• M = 0 0 1 1 0 1 (+13)
• -1 * M = 1 1 0 0 1 1
• -2 * M = 1 0 0 1 1 0

              0 0 1 1 0 1
            . -1 . -1 . -2
  --------------------------
  1 1 1 1 1 1 1 0 0 1 1 0
  1 1 1 1 1 1 0 0 1 1
  1 1 1 1 0 0 1 1
  --------------------------
  1 1 1 0 1 1 1 0 0 0 1 0
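The recoding of the multiplier into digits from {-2, -1, 0, +1, +2} can be sketched as below. Bits are examined in pairs together with the bit to the right of each pair, and n is assumed to be even. The function names are illustrative. For M = +13 and multiplier 1 0 1 0 1 0 (that is, -22) the sketch gives the digits -2, -1, -1 (least significant first) and the product -286, whose 12-bit pattern is 1 1 1 0 1 1 1 0 0 0 1 0.

```python
def radix4_booth_digits(q, n):
    """Recode an n-bit 2's complement multiplier (n even) into radix-4 digits, LSB digit first."""
    q &= (1 << n) - 1
    digits = []
    prev = 0                                  # implied bit to the right of bit 0
    for i in range(0, n, 2):
        b0 = (q >> i) & 1
        b1 = (q >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)    # digit value in {-2, -1, 0, 1, 2}
        prev = b1
    return digits

def radix4_multiply(m, q, n):
    """Multiply m by the n-bit 2's complement pattern q using the recoded digits."""
    return sum(d * m * 4**k for k, d in enumerate(radix4_booth_digits(q, n)))

if __name__ == "__main__":
    print(radix4_booth_digits(0b101010, 6))
    product = radix4_multiply(13, 0b101010, 6)
    print(product, format(product & 0xFFF, "012b"))
```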
4 x 4 Carry Save Multiplier
END OF LECTURE 36
Introduction
• Division is more complex than multiplication.
• Example: Typical values in Pentium-3 processor.
– Not easy to construct high-speed dividers.
• The ratios have not changed much in later processors.

  Instruction               Latency   Cycles / Issue
  Load / Store              3         1
  Integer Multiply          4         1
  Integer Divide            36        36
  Floating-point Add        3         1
  Floating-point Multiply   5         2
  Floating-point Divide     38        38
• Latency:
– Minimum delay after which the first result is obtained, starting from the time when the first set of inputs is applied.
• Cycles/Issue:
– Whenever a new set of inputs is applied to a functional unit (e.g. adder), it is called an issue.
– Pipelined implementation of an arithmetic unit reduces the number of clock cycles between successive issues.
– For non-pipelined arithmetic units (e.g. divider), the number of clock cycles between successive issues is much higher.
• Next input can be applied only after the previous operation is complete.
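To see how the two figures combine, the sketch below estimates the cycles needed for k independent operations on one functional unit: the first result appears after the latency, and each further operation can be issued one issue interval later. This simple model and the function name `total_cycles` are assumptions made for illustration.

```python
def total_cycles(latency, cycles_per_issue, k):
    """Cycles until the k-th result, for k independent back-to-back operations."""
    if k <= 0:
        return 0
    return latency + (k - 1) * cycles_per_issue

if __name__ == "__main__":
    # Integer multiply (latency 4, 1 cycle/issue) vs. integer divide (36, 36).
    print(total_cycles(4, 1, 10))
    print(total_cycles(36, 36, 10))
```

With these values, ten multiplications finish in 13 cycles while ten divisions take 360 cycles, since the divider cannot accept a new input until the previous operation is complete.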
• Machine implementation:
– For hardware implementation, it is more convenient to shift the partial remainder to the left relative to a fixed divisor; thus
$R_{i+1} = 2R_i - Q_i \cdot M$ (instead of $R_{i+1} = R_i - Q_i \cdot 2^{-i} \cdot M$)
– The final partial remainder is the required remainder shifted to the left, so that $R = 2^{-3} \cdot R_4$ (see next slide).
D = 37 = (1 0 0 1 0 1)2
M = 6 = (1 1 0)2
Quotient Q = 6
Remainder R = 1

  Divisor M = 1 1 0                              Quotient Q
      1 0 0 1 0 1     Dividend = 2R0
      1 1 0           Q0.M (do not subtract)     0
      ---------------
      1 0 0 1 0 1     R1
    1 0 0 1 0 1 0     2R1
      1 1 0           Q1.M                       0 1
      ---------------
      0 1 1 0 1 0     R2
    0 1 1 0 1 0 0     2R2
      1 1 0           Q2.M                       0 1 1
      ---------------
      0 0 0 1 0 0     R3
    0 0 0 1 0 0 0     2R3
      1 1 0           Q3.M                       0 1 1 0
      ---------------
      0 0 1 0 0 0     R4 = 2^3.R
[Figure: division data path with a register M holding the divisor, an Add / Subtract unit and a Control Unit.]
Basic Steps
Repeat the following steps n times:
a) Shift the dividend one bit at a time into register A, starting from the most significant bit.
b) Subtract the divisor M from this register A (trial subtraction).
c) If the result is negative (i.e. not going):
• Add the divisor M back into the register A (i.e. restoring back).
• Record 0 as the next quotient bit.
d) If the result is positive:
• Do not restore the intermediate result.
• Record 1 as the next quotient bit.
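Restoring Division
• Quotient in Q
• Remainder in A

[Flowchart: restoring division]
START
a) A = 0; M = divisor; Q = dividend; COUNT = n.
b) Shift left A, Q.
c) A = A – M (trial subtraction).
d) If A is negative: Q0 = 0 and A = A + M (restoration). Otherwise: Q0 = 1.
e) COUNT = COUNT – 1.
f) If COUNT = 0, STOP; otherwise repeat from step b).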
• Analysis:
– For n-bit divisor and n-bit dividend, we iterate n times.
– Number of trial subtractions: n
– Number of restoring additions: n/2 on the average
• Best case: 0
• Worst case: n
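A minimal sketch of restoring division that also counts the restoring additions discussed in the analysis. It assumes unsigned operands and a non-zero divisor; A is kept as an ordinary signed Python integer, which stands in for a register with one extra sign bit. The function name `restoring_divide` is illustrative.

```python
def restoring_divide(dividend, divisor, n):
    """Restoring division of n-bit unsigned operands.

    Returns (quotient, remainder, number_of_restoring_additions).
    """
    mask = (1 << n) - 1
    a, q, m = 0, dividend & mask, divisor
    restores = 0
    for _ in range(n):
        # Shift left A, Q: the MSB of Q moves into the LSB of A.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        a -= m                       # trial subtraction
        if a < 0:
            a += m                   # restoration, quotient bit Q0 = 0
            restores += 1
        else:
            q |= 1                   # quotient bit Q0 = 1
    return q, a, restores

if __name__ == "__main__":
    print(restoring_divide(37, 6, 6))
```

In this run, one restoring addition is performed for every 0 bit of the quotient.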
Non-Restoring Division
• The performance of the restoring division algorithm can be improved by exploiting the following observation.
• In restoring division, what we actually do is:
– If A is positive, we shift it left and subtract M.
• That is, we compute 2A – M.
– If A is negative, we restore it by doing A + M, shift it left, and then subtract M.
• That is, we compute 2(A + M) – M = 2A + M.
• Shift left means multiplying by 2.
• We can accordingly modify the basic division algorithm by eliminating the restoring step → NON-RESTORING DIVISION.
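Non-Restoring Division
[Flowchart: non-restoring division]
START
a) A = 0; M = divisor; Q = dividend; COUNT = n.
b) If A is negative: shift left A, Q and then A = A + M. Otherwise: shift left A, Q and then A = A – M.
c) If A is negative: Q0 = 0. Otherwise: Q0 = 1.
d) COUNT = COUNT – 1. If COUNT is not 0, repeat from step b).
e) When COUNT = 0: if A < 0, A = A + M. STOP.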
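A minimal sketch of non-restoring division, assuming unsigned operands and a non-zero divisor. A is kept as an ordinary signed Python integer so that a negative partial remainder can be carried into the next iteration. The single correction A = A + M after the loop is the standard final step when the last partial remainder is negative. The function name `non_restoring_divide` is illustrative.

```python
def non_restoring_divide(dividend, divisor, n):
    """Non-restoring division of n-bit unsigned operands; returns (quotient, remainder)."""
    mask = (1 << n) - 1
    a, q, m = 0, dividend & mask, divisor
    for _ in range(n):
        was_negative = a < 0
        # Shift left A, Q: A becomes 2A plus the MSB of Q.
        a = 2 * a + ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        # Add M if A was negative before the shift, otherwise subtract M.
        a = a + m if was_negative else a - m
        if a >= 0:
            q |= 1                   # quotient bit Q0 = 1; otherwise it stays 0
    if a < 0:
        a += m                       # final correction of the remainder
    return q, a

if __name__ == "__main__":
    print(non_restoring_divide(37, 6, 6))
```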
END OF LECTURE 37