Int Add and Mult

Oct 21, 2013

© Attribution Non-Commercial (BY-NC)

Instructor: Shantanu Dutt Department of Electrical and Computer Engineering University of Illinois at Chicago

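[Figure: full adder FA_i with inputs X_i, Y_i and carry-in C_i, and outputs S_i and carry-out C_{i+1}]

$S_i = X_i \oplus Y_i \oplus C_i$ and $C_{i+1} = X_i Y_i + C_i (X_i \oplus Y_i)$

[Figure: 8-bit ripple-carry adder (RCA). Full adders FA0 to FA7 are chained: FA_i takes x_i, y_i and c_i and produces S_i and c_{i+1}; c_0 enters FA0 and c_out leaves FA7.]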

Problem: The delay is $2n$ gate delays for an $n$-bit adder, since each FA has a 2-gate delay and the carry ripples through all $n$ FAs. Thus if each gate has a delay of 2ns, a 32-bit RCA has a delay of 64 gate delays = 128ns.
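The rippling of the carry can be sketched in a few lines of Python. This is only an illustrative bit-level model; the function names `full_adder` and `ripple_carry_add` are hypothetical and not from the slides.

```python
def full_adder(x, y, c):
    """One-bit full adder: returns (sum bit, carry out)."""
    s = x ^ y ^ c
    c_out = (x & y) | (c & (x ^ y))
    return s, c_out


def ripple_carry_add(x, y, n, c0=0):
    """Add two n-bit unsigned integers one bit at a time.

    Returns (n-bit sum, carry out of the last full adder).
    """
    carry = c0
    total = 0
    for i in range(n):  # FA_0 first: each stage waits for the previous carry
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry
```

The loop is strictly sequential in `carry`, which is the software analogue of the linear delay noted above.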

Overflow in Addition

Overflow occurs when the result of the operation does not fit in the representation being used.

For example, if the 4-bit unsigned numbers 6 = 0110 and 12 = 1100 are added, the sum (18) overflows since its binary equivalent (10010) does not fit in 4 bits.

Overflow is detected for unsigned addition when the carry out of the final full adder is 1.

For 2s complement representation of signed numbers, overflow occurs when the carry into the MSB (most significant bit), which is also the sign bit, is different from the carry out of that bit.

When overflow occurs, the carry out of the MSB is the sign bit of the correct sum, i.e., of the sum obtained by sign-extending the operands by one bit.
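As a rough illustration of the two overflow conditions, the sketch below adds two n-bit patterns and reports both the carry out (unsigned overflow) and the signed overflow flag. The helper name `add_with_flags` is an assumption made for this example.

```python
def add_with_flags(x, y, n):
    """Add two n-bit patterns.

    Returns (n-bit sum, carry out, signed_overflow), where signed_overflow
    is True when the carry into the MSB differs from the carry out of it.
    """
    mask = (1 << n) - 1
    x &= mask
    y &= mask
    low_mask = mask >> 1                      # selects bits 0 .. n-2
    carry_into_msb = ((x & low_mask) + (y & low_mask)) >> (n - 1)
    total = x + y
    carry_out = total >> n
    return total & mask, carry_out, carry_into_msb != carry_out
```

For the 4-bit patterns 0110 and 1100, `add_with_flags(6, 12, 4)` gives a carry out of 1 (unsigned overflow, since 18 does not fit) but no signed overflow, since as 2s complement numbers the operands are 6 and -4.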


ADDERS (Contd.)

Speeding up addition

A faster adder: the traditional carry-lookahead adder (CLA). Define two extra functions for each full adder FA_i:

$G_i$ is the generate bit, which is 1 only if a carry out is to be generated irrespective of the input carry. This is the case only when $X_i = Y_i = 1$, so $G_i = X_i Y_i$.

$P_i$ is the propagate bit, which is 1 only if the output carry is to be the same as the input carry, i.e., $C_{i+1} = C_i$. This is the case only when either $X_i$ or $Y_i$ (but not both) is 1, so $P_i = X_i \oplus Y_i$.

Thus the carry out of the $i$th stage can be expressed as:

$C_{i+1} = G_i + P_i C_i$

Consider a 4-bit adder. Note that $G_i$ and $P_i$ can be generated in constant time (specifically, 1 gate delay) by each modified full adder (MFA).

$C_1 = G_0 + P_0 C_0$
$C_2 = G_1 + P_1 G_0 + P_1 P_0 C_0$
$C_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0$
$C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0$

[Figure: 4-bit CLA. Modified full adders FA 0 to FA 3 take X_i, Y_i and C_i, output S_i, and send P_i and G_i to the lookahead logic, which produces C_1 to C_4 from C_0.]

In this CLA the $C_i$s use 2-level logic and thus have a delay of 3 gate delays (1 for each $P_i$, $G_i$ and 2 for the $C_i$s), as opposed to 8 gate delays for a 4-bit RCA.
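The four lookahead equations can be checked with a small model. The sketch below is an illustration only; `cla4` is a hypothetical name, and the Python expressions simply mirror the sum-of-products form of the carries.

```python
def cla4(x, y, c0=0):
    """4-bit carry-lookahead addition; returns (4-bit sum, C4)."""
    g = [(x >> i) & (y >> i) & 1 for i in range(4)]    # generate bits G_i
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(4)]  # propagate bits P_i

    # Every carry is a 2-level function of the G_i, P_i and C0 only.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    s = sum((p[i] ^ carries[i]) << i for i in range(4))  # S_i = P_i xor C_i
    return s, c4
```

No carry expression depends on another carry, which is why all of them are available after a fixed number of gate delays.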

Disadvantage: For a 16-bit adder, for example, we cannot go on generating the $C_i$s in this manner, since the hardware becomes excessive and messy.

A 16-bit adder can be partitioned into four groups, each a 4-bit CLA adder, with the inter-group carries rippling through the four groups:

[Figure: 16-bit group-CLA adder built from four 4-bit CLA cells. The first cell takes bits 3-0 of X and Y and C_0 and produces sum bits 3-0; its carry out feeds the cell for bits 7-4, and so on.]

Such a 16-bit adder has a delay of 12 gate delays (= 24ns for a gate delay of 2ns) as opposed to a delay of $2 \times 16$ = 32 gate delays (= $32 \times 2$ = 64ns) in a 16-bit RCA.

In general, for an $n$-bit group-CLA adder with 4-bit CLA cells, the delay is $3n/4$ gate delays as opposed to $2n$ gate delays for the ripple-carry adder.
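The delay figures quoted above follow from two simple formulas. The sketch below just evaluates them; the function names and the default parameters (4-bit cells, 3 gate delays per cell) are taken from the numbers on this slide and are otherwise assumptions.

```python
def rca_delay(n):
    """Gate delays of an n-bit ripple-carry adder (2 per full adder)."""
    return 2 * n


def group_cla_delay(n, cell_bits=4, cell_delay=3):
    """Gate delays of an n-bit adder made of rippling CLA cells."""
    return (n // cell_bits) * cell_delay


GATE_DELAY_NS = 2  # gate delay assumed in the slides

for bits in (16, 32, 64):
    print(bits,
          rca_delay(bits) * GATE_DELAY_NS,        # RCA delay in ns
          group_cla_delay(bits) * GATE_DELAY_NS)  # group-CLA delay in ns
```

Both delays remain linear in the number of bits; the group-CLA only improves the constant factor.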

A note of caution: $k$-input gates have greater delays than 2-input gates for $k > 2$. We had assumed that for a 4-bit CLA the 5-input OR and AND gates have the same delay as a 2-input gate. The delay of a 5-input gate will be greater because:

[Figure: transistor-level (MOS) circuits of a 2-input gate with inputs A and B, and of a 5-input gate with inputs A to E and output A+B+C+D+E]

Let be the max. of the switching on and the switching off time of a MOS transistor, i.e., the max. of the time for charge to collect at the channel or the time for the charge to be removed from the channel Let be the resistance of a channel, and the input gate capacitance of a transistor Then, a 2-i/p AND/OR gate has a delay of

@ 9 A B 9 A 5

9 A 4 @ B

while a 5-i/p AND/OR ckt. formed of cascaded 2-i/p gates will have a delay of

9 2 A C @ B

Thus, while a 5-i/p AND/OR gate will be slower than a 2-i/p AND/OR gate, it will be faster than a cascaded implementation of a 5-i/p AND/OR ckt. The latter is primarily because more transistors switch on in parallel in a 5-i/p AND/OR gate than in a 5-i/p AND/OR ckt.

In a 4-bit CLA adder, we are essentially replacing a series of 2-i/p cascaded (multi-level) logic by a multiple-input 2-level logic.

Note that, because multiple-input gates are slower, the delay of an $n$-bit group-CLA adder with 4-bit cells will be somewhat more than $3n/4$ 2-i/p gate delays.

Faster adders

Can we do better than a delay that is linear in the number of bits $n$? Yes, by using a carry-select adder, OR by using a parallel prefix circuit to generate all carries in $O(\log n)$ time.

Carry-Select adder:

[Figure: carry-select adder. Each block of input bits is added twice, once with carry-in 0 and once with carry-in 1; multiplexers (Mux) then select the correct sum bits and carry out once the actual carry-in of the block is known.]
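The selection idea can be modelled as follows. This is a minimal sketch that assumes fixed-size blocks (the `block` parameter is an assumption, and `n` is assumed to be a multiple of it); it is not the specific recursive structure of the figure.

```python
def carry_select_add(x, y, n, block=4, c_in=0):
    """Carry-select addition of two n-bit unsigned integers.

    Every block computes both candidate results; in hardware these are
    formed in parallel, and only the selection waits for the carry.
    Returns (n-bit sum, carry out).
    """
    mask = (1 << block) - 1
    total, carry = 0, c_in
    for lo in range(0, n, block):
        a = (x >> lo) & mask
        b = (y >> lo) & mask
        sum0 = a + b          # speculative result for carry-in 0
        sum1 = a + b + 1      # speculative result for carry-in 1
        chosen = sum1 if carry else sum0   # the multiplexer
        total |= (chosen & mask) << lo
        carry = chosen >> block
    return total, carry
```

Only the multiplexer selection is on the carry path from block to block; the additions themselves do not depend on the incoming carry.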

The parallel-prefix CLA adder

For the $i$th full adder, define the symbols:
k: kill incoming carry (when $x_i = y_i = 0$)
p: propagate incoming carry (when $x_i \ne y_i$)
g: generate a carry (when $x_i = y_i = 1$)

We can encode these symbols as k = 00, p = 01, g = 10 and call this pair of bits $q_i$. Each FA can produce $q_i$ in constant time. As a matter of fact, $q_i = (G_i, P_i)$, where $G_i$ and $P_i$ are the generate and propagate bits of a FA discussed earlier.

Define the operator $\otimes$ as follows, where the left operand $a$ belongs to the more significant position:

  a ⊗ b     b = k    b = p    b = g
  a = k       k        k        k
  a = p       k        p        g
  a = g       g        g        g

Note that $\otimes$ is an associative operator, i.e., $(a \otimes b) \otimes c = a \otimes (b \otimes c)$.
Also, $\otimes$ is not a commutative operator, i.e., in general $a \otimes b \ne b \otimes a$.

Define $r_i = q_i \otimes q_{i-1} \otimes \cdots \otimes q_0$.

If we can compute each $r_i$ quickly, then we can obtain the carry-in $c_{i+1}$ of the $(i+1)$th FA as follows:
If $r_i$ = k then $c_{i+1} = 0$
else if $r_i$ = g then $c_{i+1} = 1$
else if $r_i$ = p then $c_{i+1} = c_0$
The $c_i$s can be computed in constant time after the $r_i$s are available.

More generally, define $r_{i,j} = q_i \otimes q_{i-1} \otimes \cdots \otimes q_j$ for $i \ge j$, so that $r_i = r_{i,0}$. Since $\otimes$ is associative,

$r_{i,j} = r_{i,k} \otimes r_{k-1,j}$ for any $k$ with $i \ge k > j$.

We can use the above property to form 1-level $r$s by combining 2 adjacent $q$s, then 2-level $r$s by combining two adjacent 1-level $r$s, etc. This yields a tree-structured circuit with a $\otimes$ logic at every node; this ckt. gives us only those $r_{i,0}$s for which $i = 2^j - 1$ for some $j$.
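To make the kill/propagate/generate bookkeeping concrete, here is a small sequential sketch. The names `status`, `combine` and `carries` are hypothetical, the statuses are kept as the letters "k", "p", "g" rather than bit pairs, and the convention that the first argument of `combine` is the more significant position is an assumption of this example.

```python
def status(x_bit, y_bit):
    """Carry status of one full adder: 'k', 'p' or 'g'."""
    if x_bit & y_bit:
        return "g"
    if x_bit | y_bit:
        return "p"
    return "k"


def combine(hi, lo):
    """Combine two statuses; hi is the more significant one.

    A propagating hi passes on whatever lo does; otherwise hi decides.
    """
    return lo if hi == "p" else hi


def carries(x, y, n, c0=0):
    """Return [c_0, c_1, ..., c_n] for the n-bit addition x + y + c0."""
    out = [c0]
    r = None
    for i in range(n):
        q = status((x >> i) & 1, (y >> i) & 1)
        r = q if r is None else combine(q, r)      # running prefix
        out.append({"k": 0, "g": 1, "p": c0}[r])   # constant-time lookup
    return out
```

This loop computes the prefixes one after another; the point of the tree circuit is that associativity lets the same prefixes be formed in logarithmic depth.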

[Figure: parallel prefix tree for n = 8 with leaves q_7 ... q_0. Going up the tree, the nodes form r_{1,0}, r_{3,2}, r_{5,4}, r_{7,6}, then r_{3,0}, r_{7,4}, then r_{7,0}; coming down the tree, the remaining prefixes r_{2,0}, r_{4,0}, r_{5,0}, r_{6,0} are formed.]

This circuit is called a parallel prefix circuit, and can be used to obtain the prefixes of any associative operation (like AND, OR, addition, multiplication, etc.)

The delay is $O(\log n)$ $\otimes$-logic steps: about $\log_2 n$ steps to go up the tree and about $\log_2 n$ steps to come down.
Extra hardware used: $O(n)$ $\otimes$ logic units.
VLSI area reqd.: the height of the tree is $O(\log n)$ and its width is $O(n)$.
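The up-the-tree, down-the-tree structure can be sketched recursively for any associative operator. The function below is an illustrative model, not the circuit of the figure; `parallel_prefix` and the convention `op(hi, lo)` (more significant operand first) are assumptions of this example.

```python
def parallel_prefix(q, op):
    """Return all prefixes out[i] = q[i] op q[i-1] op ... op q[0].

    Pairs of neighbours are combined first (one level up the tree), the
    half-size problem is solved recursively, and the missing prefixes are
    filled in with one more op each (one level down the tree).
    """
    n = len(q)
    if n == 1:
        return [q[0]]
    pairs = [op(q[2 * i + 1], q[2 * i]) for i in range(n // 2)]
    sub = parallel_prefix(pairs, op)   # sub[i] is the prefix ending at 2i+1
    out = []
    for i in range(n):
        if i == 0:
            out.append(q[0])
        elif i % 2 == 1:
            out.append(sub[i // 2])
        else:
            out.append(op(q[i], sub[i // 2 - 1]))
    return out
```

Because only associativity is used, the same routine works for a non-commutative operator; for example `parallel_prefix(list("abcde"), lambda hi, lo: lo + hi)` yields the string prefixes `['a', 'ab', 'abc', 'abcd', 'abcde']`.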

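The parallel-prefix CLA adder (contd.)

Another VLSI implementation of a parallel prefix tree:

[Figure: alternative layout of the parallel prefix tree with inputs q_0 ... q_{n-1}, combining q_{i+1} and q_i into r_{i+1,i} and so on up to r_{n-1}]

[Figure: 16-bit parallel-prefix CLA adder. Each full adder FA takes X_i, Y_i and its carry-in C_i, outputs S_i and sends its 2-bit status q_i to the prefix circuit; the prefix circuit takes C_in and returns the carries C_1 ... C_15 and C_out.]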

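Subtraction can be done using an adder, since $X - Y = X + (-Y)$. This means that $Y$, which we assume is in 2s complement notation, has to be negated. A 2s complement number is negated by complementing it and adding a 1, i.e., $-Y = \overline{Y} + 1$. The augmentation to an adder to perform subtraction is shown below:

[Figure: n-bit adder used as a subtractor. The n bits of Y pass through inverters before entering the adder, X enters directly, C_in is set to 1, and the adder produces the n-bit result and C_out.]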
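A minimal sketch of the same trick in Python, assuming n-bit patterns held in ordinary integers (the name `subtract` is hypothetical):

```python
def subtract(x, y, n):
    """Compute x - y on n-bit patterns using only an addition.

    y is complemented bit by bit and the carry-in is set to 1, which
    together add the 2s complement of y. Returns (n-bit result, C_out).
    """
    mask = (1 << n) - 1
    total = (x & mask) + (~y & mask) + 1
    return total & mask, total >> n
```

For instance, `subtract(3, 5, 4)` returns the pattern 1110, which is -2 in 4-bit 2s complement.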

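COMPUTER ARITHMETIC: MULTIPLIERS

Serial Multiplication

Add-and-shift (A&S) multiplication of an $n$-bit multiplier $X = x_{n-1} \ldots x_0$ and a multiplicand $Y$:

If the additions are done one at a time, we obtain a sequence of partial products $P_0, P_1, \ldots, P_n$, with $P_0 = 0$.

$P_{i+1}$ is obtained as

$P_{i+1} = P_i + x_i 2^i Y$

Thus

$P_n = \sum_{i=0}^{n-1} x_i 2^i Y = X \cdot Y$

where $X$ is the multiplier and $Y$ the multiplicand.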

The same effect as shifting the multiplicand left ($2^i Y$) can be achieved by keeping the multiplicand fixed at the left-most position and shifting the partial product right.

However, the final product is the same:

$P_n = \sum_{i=0}^{n-1} x_i 2^i Y = X \cdot Y$

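[Figure: add-and-shift multiplication for unsigned numbers. The multiplicand Y is held in register M and the multiplier X in register Q. A 16-bit adder adds M, gated through 16 AND gates by the LSB of Q, to the accumulator AC and produces a 1-bit carry out C_out. C_out, AC and Q form one shift register.]

For each of the 16 bits of the multiplier:
If LSB of Q is 1 then AC = AC + M else AC = AC;
Shift the C_out-AC-Q register combination right by 1 bit

Final product is in the AC-Q register.

NOTE: Overflows are tolerated in the additions AC = AC + M, since C_out is shifted into the MSB of AC when right shifting.

© Shantanu Dutt, UIC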
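The register-level behaviour can be simulated directly. The sketch below models AC, Q and M as Python integers; the function name `add_and_shift` is an assumption made for this illustration.

```python
def add_and_shift(x, y, n=16):
    """Unsigned add-and-shift multiplication of two n-bit numbers.

    x is the multiplier (register Q), y the multiplicand (register M).
    Returns the 2n-bit product held in AC-Q after n steps.
    """
    mask = (1 << n) - 1
    ac, q, m = 0, x & mask, y & mask
    for _ in range(n):
        c_out = 0
        if q & 1:                              # LSB of Q decides the add
            total = ac + m
            ac, c_out = total & mask, total >> n
        # shift the C_out-AC-Q combination right by one bit
        q = (q >> 1) | ((ac & 1) << (n - 1))
        ac = (ac >> 1) | (c_out << (n - 1))
    return (ac << n) | q
```

With 4-bit registers, `add_and_shift(13, 11, 4)` passes through a step where the addition overflows (6 + 11 = 17), and the carry out shifted into AC keeps the result correct: the final AC-Q contents are 10001111 = 143.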

Assumption: Both $X$ and $Y$ are in their 2s complement representation.

Method 1:
If the multiplier $X$ is -ve, get its 2s complement so that it becomes positive, i.e., $X = -X$.
If the multiplicand $Y$ is -ve, get its 2s complement so that it becomes positive, i.e., $Y = -Y$.
Multiply: $P = X \times Y$.
If exactly one of $X$ and $Y$ was negative, get $P$'s 2s complement so that it becomes negative, i.e., $P = -P$.

Disadvantage: Preprocessing and postprocessing can take up to 4 clock cycles (ccs).

Method 2: When the multiplier $X$ is +ve, perform add-and-shift multiplication, taking care to do the following when each partial product is shifted right:

1. When there is no overflow in the addition (recall the condition for overflow for 2s complement addition), an arithmetic right shift of register AC.Q is performed, without shifting $C_{out}$ into the MSB of AC.
2. Arithmetic right shift: the MSB is sign-extended, i.e., if the MSB (sign bit) is 1, a 1 is shifted into the MSB of AC, otherwise a 0 is shifted in.
3. If there is an overflow, then as in the unsigned case, shift $C_{out}$ into the MSB of AC when shifting AC.Q right. This works because in this case the $(n+1)$-bit output of the adder, where $C_{out}$ is the MSB, is the exact 2s complement representation of the sum. Check this by sign extending the inputs to $n+1$ bits and computing the sum: the output will be the same as for the $n$-bit inputs, but without overflow out of the $(n+1)$th bit.

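Method 2 (contd.): If the multiplier $X$ is negative, perform the first $n-1$ additions as explained above, and then subtract $Y$ as the final step (the step for the sign bit $x_{n-1}$).

This works because the value of a 2s complement number $X = x_{n-1} x_{n-2} \ldots x_0$ is given by

$X = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i$

When $X$ is negative, $x_{n-1} = 1$ and

$X \cdot Y = \sum_{i=0}^{n-2} x_i 2^i Y - 2^{n-1} Y$

To see why the value formula holds for a negative $X$: its bit pattern, read as an unsigned number, has the value $2^n + X$, so

$X = \left(2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i\right) - 2^n = -2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i$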

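[Figure: hardware for 2s complement add-and-shift multiplication. The multiplicand Y is held in register M and the multiplier X in register Q; a 16-bit adder with overflow detection (from C_15 and C_out) adds M, gated through 16 AND gates by the LSB of Q, to the accumulator AC. Logic at the MSB of AC chooses the bit shifted in: if ovfl then output = C_out else output = AC[0], the sign bit of AC.]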
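Method 2 can be simulated at the register level as below. This is a sketch under stated assumptions: the names `to_signed` and `multiply_signed` are hypothetical, the final subtraction is done by adding the complement of M with a carry-in of 1, and overflow is detected from the operand and result signs, which is equivalent to comparing the carry into and out of the MSB.

```python
def to_signed(value, bits):
    """Interpret a bits-wide pattern as a 2s complement number."""
    return value - (1 << bits) if value >> (bits - 1) else value


def multiply_signed(x, y, n):
    """2s complement add-and-shift multiplication (Method 2)."""
    mask = (1 << n) - 1
    ac, q, m = 0, x & mask, y & mask
    for step in range(n):
        fill = ac >> (n - 1)                   # default: sign-extend AC
        if q & 1:
            if step == n - 1:                  # sign bit of X: subtract Y
                addend, c_in = ~m & mask, 1
            else:
                addend, c_in = m, 0
            total = ac + addend + c_in
            result, c_out = total & mask, total >> n
            sa, sb, sr = ac >> (n - 1), addend >> (n - 1), result >> (n - 1)
            overflow = sa == sb and sr != sa
            ac = result
            fill = c_out if overflow else sr   # C_out only on overflow
        q = (q >> 1) | ((ac & 1) << (n - 1))
        ac = (ac >> 1) | (fill << (n - 1))
    return to_signed((ac << n) | q, 2 * n)
```

For 4-bit operands, `multiply_signed(-3, 5, 4)` performs two additions for the low bits of the multiplier 1101, then the final subtraction for its sign bit, and returns -15.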

Speeding Up Serial Multiplication: Booth's Algorithm

Idea: Consider the following substring of $X$:

011110 = 100000 - 000010, i.e., $2^4 + 2^3 + 2^2 + 2^1 = 2^5 - 2^1$

Thus instead of adding and shifting 4 times (corresponding to the string of 4 1s in 011110), we can subtract $Y$ (AC = AC - Y) when we see the 1st 1 coming after a 0 in $X$, i.e., we detect the 2-bit substring 10 in the last 2 bits of the current $X$, just shift 4 times, and then add $Y$ (AC = AC + Y) when we see that the current string of 1s in $X$ has ended, i.e., we detect the 2-bit substring 01 in the last 2 bits of the current $X$. This saves us two adds.

Thus when the multiplier contains long (greater than length 2) strings of 1s, Booth's multiplication is faster.

Booth's multiplication also takes care of a negative multiplier automatically. Consider the following $X$ with $n = 6$:

$X$ = 111000 = 1000000 - 0001000

Since the multiplication algorithm contains $n$ steps ($n = 6$ in the above example), the 1000000 is ignored and we end up subtracting $2^3$ times the multiplicand: exactly the right answer, since 111000 is -8 in 2s complement!

32

Booths algorithm is decribed by the following table for iteration : Bit Bit Explanation (current) (prev.) 1 0

I ' ' !

Action

I x

Ts x !

1 0 0

1 1 0

Beginning of a run of 1s Middle of a run of 1s Only shift 0 End of a run of 1s 1 Middle of a run of 0s Only shift 0

Note: (1) For unsigned multiplication, we need to pad the multiplier with mythical 0s on both sides (right of LSB padding required to start off the process) (2) For 2s complement multiplication, we need to pad the multiplier with a mythical 0 only to the right of its LSB. This works because (for 2s complement, the last run of 1s is 11. . . 1, where the leftmost 1 is the sign bit (bit ) and suppose the rightmost 1 is the th bit from left, then the value of this sequence is , which is exactly the value we alloted to this sequence when we subtracted at the th bit position at the

I ! I# I g I g # b ' x

'

33

beginning of this last run of 1s. Further, if is the value we alloted to the rest of the multiplier before the last run of 1s, then the nal value we give to the multiplier is , which is its correct value in 2s complement: . (3) is the th bit of the Booth Recoding of : (4) means no arithmetic operation, means add , means subtract

I I# t g t g # b u ws v t ' w t xs ! x

Hardware: Excercise

33

34

34
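The table can be turned into a short value-level sketch. It computes the Booth digits of a 2s complement multiplier and sums the corresponding multiples of the multiplicand; the names `booth_recode` and `booth_multiply` are hypothetical, and the register shifting of real hardware is replaced by explicit weights $2^i$.

```python
def booth_recode(x, n):
    """Booth digits (LSB first) of the n-bit 2s complement pattern of x."""
    digits, prev = [], 0           # prev starts as the mythical 0 right of the LSB
    for i in range(n):
        bit = (x >> i) & 1         # Python's >> sign-extends negative ints
        digits.append(prev - bit)  # 10 -> -1, 01 -> +1, 00 and 11 -> 0
        prev = bit
    return digits


def booth_multiply(x, y, n):
    """Multiply an n-bit 2s complement multiplier x by y using Booth digits."""
    return sum(d * y * (1 << i) for i, d in enumerate(booth_recode(x, n)))
```

For the substring used earlier, `booth_recode(0b011110, 6)` gives `[0, -1, 0, 0, 0, 1]`: one subtraction at bit 1 and one addition at bit 5 instead of four additions.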

35

Problem: When the multiplier contains long strings (say, of length ) of alternating 1s and 0s ( ), then we perform additions and subtractions using Booths algorithm compared to only additions using regular add-and-shift Solution: Look at 3 consecutive bits of the multiplier instead of 2 to decide what to do. This is called the Modied Booths Algorithm (MBA) This will enable us to treat isolated 1s and 0s differently from runs of 1s and 0s.

! ! ! ! 7 6 6

35

36
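The operation counts are easy to compare in code. The sketch below counts arithmetic operations only (shifts are ignored) for an unsigned multiplier whose MSB is 0; the function names are hypothetical.

```python
def booth_ops(x, n):
    """Adds plus subtracts done by Booth's algorithm: one per 01 or 10 pair."""
    ops, prev = 0, 0
    for i in range(n):
        bit = (x >> i) & 1
        ops += int(bit != prev)
        prev = bit
    return ops


def add_shift_ops(x, n):
    """Adds done by plain add-and-shift: one per 1 bit of the multiplier."""
    return bin(x & ((1 << n) - 1)).count("1")


for pattern in (0b01010101, 0b01111110):
    print(format(pattern, "08b"), booth_ops(pattern, 8), add_shift_ops(pattern, 8))
```

The alternating pattern costs Booth's algorithm 8 operations against 4 for add-and-shift, while the long run of 1s costs it 2 against 6.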

When we see a 010 (an isolated 1), we add $Y$ corresponding to the isolated 1. This is correct, since in BA we would have subtracted $Y$ on detecting 10 and added $Y$ on subsequently detecting 01. Thus assuming the 1 is the $i$th bit, we would have added $2^{i+1} Y - 2^i Y = 2^i Y$; we are doing the RHS in MBA.

When we see a 101 and this isolated 0 is not following an isolated 1, then we subtract $Y$ corresponding to the isolated 0. This is correct, since in BA we would have added $Y$ on detecting 01 and then subtracted $Y$ on detecting 10. Assuming the 0 is the $i$th bit, this is equivalent to adding $2^i Y - 2^{i+1} Y = -2^i Y$; again, we are doing the RHS in MBA.

After detecting an isolated 1 (010), it should be noted as such so that after shifting right, we don't misinterpret the bit pattern as ending a run of 1s. For example, consider two bit patterns showing 4 consecutive bits of $X$, of which the rightmost 3 are the bits currently being observed:

0010 and 0011

The 1st has an isolated 1 and the 2nd a run of 1s. For the 1st case we have added $Y$ corresponding to the 1, and in the second case, we do not do anything as we are in the middle of a run of 1s (as in BA). After a right shift, the observed 3 bits are

001 and 001

which are identical. In the first case, we need to have noted that the 1 corresponds to an isolated 1, so that we do not do anything. In the second case, this means end of a run of 1s, and so we need to add $Y$ (as in BA). These cases are distinguished by setting a latch $F$ to be 0 when an isolated 1 is spotted, and to 1 if a run of 1s is spotted.

Thus actually the least significant of the 3 bits that we observe should be $F$ and not the previous bit of $X$. Except when distinguishing between an isolated 1 (0) and the end of a run of 1s (0s), $F$ will be the same as the previously observed bit of $X$. Thus after a right shift, the above 3-bit patterns will be

000 (with $F = 0$) and 001 (with $F = 1$)

Similarly, we need to distinguish between an isolated 0 and the end of a run of 0s. In the former case, the latch $F$ is set to 1, and to 0 in the latter case. Again, consider two bit patterns showing 4 consecutive bits of $X$:

1101 and 1100

The 1st has an isolated 0 in its 2nd bit from the right, for which we subtracted $Y$, and the 2nd pattern has a 0 in its 2nd bit that ends a run of 0s. Thus after a right shift, the above 3-bit patterns will not be identical, but will be

111 (with $F = 1$) and 110 (with $F = 0$)

In the first case, we correctly do nothing corresponding to the middle bit (we already subtracted $Y$ corresponding to the isolated 0), and in the second we subtract $Y$, since the middle bit begins a run of 1s which has not yet been accounted for.

The rightmost bit in the 3 bits that we are looking at is actually the latch $F$, which is initialized to 0. The second bit is $x_i$ (in the $i$th iteration), and the leftmost bit is $x_{i+1}$.

Note that except in the isolated 1/0 case, $F = x_{i-1}$ (as in BA); otherwise $F = \overline{x_{i-1}}$.

We thus have the following Modified Booth's Algorithm, described for iteration $i$:

Bit $x_{i+1}$   Bit $x_i$    $F$   Explanation                                                      Recoded digit   New $F$
(next)          (current)
0               0            0     Middle of a run of 0s                                            0               0
0               1            0     Isolated 1                                                       1               0
1               0            0     Isolated 0 following an isolated 1 OR middle of a run of 0s     0               0
1               1            0     Begins a run of 1s                                               -1              1
0               0            1     Begins a run of 0s                                               1               0
0               1            1     Middle of a run of 1s                                            0               1
1               0            1     Isolated 0 following a run of 1s                                 -1              1
1               1            1     Middle of a run of 1s                                            0               1

NOTE:
(1) The multiplier needs to be padded by mythical 0s on both sides for unsigned and 2s complement multiplication.
(2) The recoded digit is the $i$th digit of the Modified Booth Recoding of $X$.
(3) This signed-digit encoding has more 0s on the average than the regular binary code. Thus fewer arithmetic operations are required on the average using MBA for multiplication.
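The table lends itself to a direct lookup implementation. The sketch below covers the unsigned case only and runs one extra iteration so that the left padding 0 can close a final run of 1s; the names `MBA_TABLE` and `mba_recode` are hypothetical, and the table entries are transcribed from the rows above.

```python
# (next bit, current bit, F) -> (recoded digit, new F)
MBA_TABLE = {
    (0, 0, 0): (0, 0),   # middle of a run of 0s
    (0, 1, 0): (1, 0),   # isolated 1
    (1, 0, 0): (0, 0),   # isolated 0 after an isolated 1, or run of 0s
    (1, 1, 0): (-1, 1),  # begins a run of 1s
    (0, 0, 1): (1, 0),   # begins a run of 0s (ends a run of 1s)
    (0, 1, 1): (0, 1),   # middle of a run of 1s
    (1, 0, 1): (-1, 1),  # isolated 0 following a run of 1s
    (1, 1, 1): (0, 1),   # middle of a run of 1s
}


def mba_recode(x, n):
    """Modified Booth digits (LSB first) of an unsigned n-bit multiplier."""
    x &= (1 << n) - 1              # bits n and above act as padding 0s
    f, digits = 0, []
    for i in range(n + 1):         # extra step handles the left padding 0
        key = ((x >> (i + 1)) & 1, (x >> i) & 1, f)
        digit, f = MBA_TABLE[key]
        digits.append(digit)
    return digits
```

For the alternating multiplier 0101, `mba_recode(0b0101, 4)` gives `[1, 0, 1, 0, 0]`, i.e. two additions, where plain Booth recoding would need two additions and two subtractions.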
