
Institute of

Applied Microelectronics & Computer Engineering

Selected Topics of VLSI Design

Prof. Dr.-Ing. Dirk Timmermann


dirk.timmermann@uni-rostock.de
Please note this name change

Modules: Selected Topics of VLSI Design, Advanced VLSI Design

Until 2016:

Short name   "Chip project"       "HW-Alg."
Semester     summer               winter
SWS          1                    1/1/1
Content      VLSI chip project    hardware algorithms

Starting 2017:

Short name   "Chip project"       "HW-Alg."
Semester     winter               summer
SWS          1                    1/1/1
ECTS         6                    6
Content      VLSI chip project    hardware algorithms

3/28/2018 Selected Topics of VLSI Design 2


Organization
● Lecture: Hardware oriented arithmetic algorithms and
cryptography
● Exercise: Algorithms, building blocks, VHDL coding
● Lab: during project week
● Schedule: lecture, Monday 15:xx – 16:yy
exercise, replaces lectures in 2nd half of semester
mandatory lab with attendance list: 22.5.-23.5. 9:00
● Location: Warnemuende, building 1, R 1226



Literature
Textbooks
● Parhami, B.: Computer Arithmetic, Algorithms and Hardware Designs,
2nd edition, Oxford University Press, New York, 2010.
● Koren, I.: Computer Arithmetic Algorithms, 2002
● Muller, J.M.: Elementary Functions, Algorithms and Implementation,
2nd ed., 2006
● Klar, H., Noll, T.: Integrierte Digitale Schaltungen, Springer 2015, free
access from URO network
● Pirsch, P.: Architekturen der digitalen Signalverarbeitung B.G. Teubner,
Stuttgart, 1996

Courses and Websites


● Koren, I.: Computer arithmetic- Simulator
● Ercegovac, M.: Course Digital Arithmetic
● Guyot, A. : Educational Applets
● Strey, A.: Course Computer-Arithmetik
Part 1: Number Systems

Outline
● 1.1 Positional / Place-Value Notation of Numbers
o Representation of Integer Numbers, Real Numbers and Radix Selection
● 1.2 Signed Number Representations
o Sign Magnitude, (r-1)-Complement, r-Complement and Redundant Binary
● 1.3 Rounding
o via Truncation, Round-to-Nearest and Round-to-Nearest-Even
● 1.4 Overflows
o in (r-1)-Complement, Carry-Save and Signed Redundant Binary Numbers
o Overflow Detection and Handling
● 1.5 Basic Operations
● 1.6 Cost/Performance Estimation Basics



1.1 Positional / Place-Value Notation of Numbers
● The number A is represented by n digits a_i and a defined base/radix r, with $a_i \in \{0, 1, \ldots, r-1\}$

o Binary → r = 2
o Ternary → r = 3
o Octal → r = 8
o Decimal → r = 10
o Hexadecimal → r = 16

● The value V(A) of the number A is given by the sum of the n partial
products p_i for each of its positions

● The partial product $p_i = a_i \cdot r^i$ results from the multiplication of the digit a_i
with its weight r^i, which is a power of the radix r and determined by the
position index i



1.1 Positional / Place-Value Notation of Numbers
● Integer number A with n digits and its value V(A):

$A = a_{n-1} a_{n-2} \ldots a_1 a_0 \qquad V(A) = \sum_{i=0}^{n-1} a_i \cdot r^i$

● A positive integer number A has a range of: $0 \le V(A) < r^n$

● Real numbers contain n digits for the integer part and m digits for the
fractional part

● Real number A with n+m digits and its value V(A):

$A = a_{n-1} \ldots a_0 \,.\, a_{-1} \ldots a_{-m} \qquad V(A) = \sum_{i=-m}^{n-1} a_i \cdot r^i$

● A positive real number A has a range of: $0 \le V(A) < r^n$

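The place-value sums above can be evaluated directly. The following is a minimal sketch, not part of the lecture material; the function name `positional_value` and the most-significant-digit-first list layout are assumptions made for illustration.

```python
from fractions import Fraction


def positional_value(int_digits, frac_digits=(), radix=2):
    """Evaluate V(A) = sum(a_i * r^i) for A = a_{n-1}..a_0 . a_{-1}..a_{-m}.

    Digits are given most significant first (hypothetical convention).
    Fractions keep the result exact for fractional digits.
    """
    if any(not 0 <= d < radix for d in (*int_digits, *frac_digits)):
        raise ValueError("each digit must lie in {0, ..., r-1}")
    n = len(int_digits)
    value = Fraction(0)
    for k, digit in enumerate(int_digits):
        value += digit * Fraction(radix) ** (n - 1 - k)  # weights r^(n-1) .. r^0
    for k, digit in enumerate(frac_digits, start=1):
        value += digit * Fraction(radix) ** (-k)         # weights r^-1 .. r^-m
    return value


print(positional_value([1, 0, 1], [1], radix=2))  # 101.1 in binary -> 11/2
print(positional_value([7, 7], radix=8))          # 77 in octal -> 63
```

Any n-digit string stays below r^n, which matches the stated range 0 ≤ V(A) < r^n.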


1.1 Positional / Place-Value Notation of Numbers
● In computation, two formats for the approximation of real numbers exist
● Fixed point numbers → the number of significant digits before and
after the radix point is fixed (as seen above)
o the radix point is fixed and never explicitly represented in hardware
o its position is defined during design and must be known to interpret
the number
● Floating point numbers → the number of significant digits before
and after the radix point depends on the exponent

$V(A) = (-1)^{S} \cdot \text{mantissa} \cdot \text{radix}^{\text{exponent}}$

● Floating point numbers in modern computers → IEEE 754 standard

o Binary half precision → 16 bit data words (= 1 + 10 + 5 bit)
o Binary single precision → 32 bit data words (= 1 + 23 + 8 bit)
o Binary double precision → 64 bit data words (= 1 + 52 + 11 bit)
o …

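To make the 1 + 23 + 8 bit split of single precision concrete, the fields of a value can be extracted with Python's standard `struct` module. This is an illustrative sketch only; the function names are hypothetical, and the exponent bias of 127 and the implicit leading 1 of normal numbers come from IEEE 754 itself rather than from the slides.

```python
import struct


def float32_fields(x):
    """Split a number into the (sign, exponent, fraction) fields of IEEE 754 single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # raw 32-bit pattern
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bit, stored with bias 127
    fraction = bits & 0x7FFFFF        # 23 bit, leading 1 is implicit
    return sign, exponent, fraction


def float32_value(sign, exponent, fraction):
    """Rebuild the value of a normal number (0 < exponent < 255) from its fields."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


print(float32_fields(1.0))       # (0, 127, 0)
print(float32_fields(-0.15625))  # (1, 124, 2097152), i.e. -1.25 * 2^-3
```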
1.1 Positional / Place-Value Notation of Numbers
● Radix Selection (cont'd from fixed point numbers)
o Computations are performed in circuits
o Binary representation (r = 2) is the best match for the physical
signal levels in most logic
▪ Voltage U: { 0 , 1 } → { V_SS , V_DD }
▪ Current I: { 0 , 1 } → { I_MIN , I_MAX }

● Efficiency: How many bits do we need in a bit-oriented (r = 2) memory
to store a positive number V?

$\#\text{bits} = \lfloor \log_2 V \rfloor + 1$

[Figure: staircase plot of memory in bit (1 to 3) over V (1 to 7)]

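The staircase relation between V and the required memory can be checked numerically. A small sketch, assuming V is a positive integer; `bits_needed` is a hypothetical helper name, and Python's built-in `int.bit_length()` returns the same count.

```python
import math


def bits_needed(v):
    """Number of bits to store the positive integer v: floor(log2(v)) + 1."""
    if v < 1:
        raise ValueError("v must be a positive integer")
    return math.floor(math.log2(v)) + 1


print([bits_needed(v) for v in range(1, 8)])  # [1, 2, 2, 3, 3, 3, 3]
```

For very large integers `v.bit_length()` is the safer choice, since it avoids floating-point logarithms.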
1.2 Signed Number Representations
● Positional notation as discussed above only covers positive numbers
● For negative numbers, different signed number representation (SNR)
options exist
● SNR #1: Sign Magnitude (SM)
o Insert sign bit at a_{n-1} before the magnitude of the number
o Positive number → a_{n-1} = 0 and negative number → a_{n-1} = 1

$A^+ = 0\, a_{n-2} \ldots a_1 a_0 \qquad A^- = (r-1)\, a_{n-2} \ldots a_1 a_0$

$V(A_{SM}) = (-1)^{a_{n-1}} \cdot \sum_{i=0}^{n-2} a_i \cdot r^i$

● A signed integer number A has a range of: $-r^{n-1} < V(A_{SM}) < r^{n-1}$
● + Symmetrical range
● - Double representation of zero, requires different treatment of positive
and negative numbers in arithmetic circuits

1.2 Signed Number Representations
● SNR #2: (r-1)-complement → 1's complement
o A negative number results from complementing each digit a_i
according to: $\bar{a}_i = (r-1) - a_i$
o In a binary representation (r = 2) this procedure equals a bitwise
inversion ("bit flipping"): 01010101 → 10101010

$A^+ = 0\, a_{n-2} \ldots a_1 a_0 \qquad A^- = A_{(r-1)} = (r-1)\, \bar{a}_{n-2} \ldots \bar{a}_1 \bar{a}_0$

$V(A_{r-1}) = -\left\lceil \frac{a_{n-1} - \frac{r-1}{2}}{r-1} \right\rceil \cdot (r^n - 1) + \sum_{i=0}^{n-1} a_i \cdot r^i$

o The ceiling term is 0 for a leading digit 0 (positive number) and 1 for a leading digit r-1 (negative number)

● A signed integer number A has a range of: $-r^{n-1} < V(A_{r-1}) < r^{n-1}$
● Same pros and cons as SM



1.2 Signed Number Representations
● SNR #3: r-complement → 2's complement
o Start with the (r-1)-complement and add 1 to the Least Significant
Digit (LSD)
o The binary format (r = 2) is most commonly used in digital circuits

$A^+ = 0\, a_{n-2} \ldots a_1 a_0 \qquad A^- = A_r = (r-1)\, \bar{a}_{n-2} \ldots \bar{a}_1 \bar{a}_0 + 1$

$V(A_r) = -\left\lceil \frac{a_{n-1} - \frac{r-1}{2}}{r-1} \right\rceil \cdot r^n + \sum_{i=0}^{n-1} a_i \cdot r^i$

● A signed integer number A has a range of: $-r^{n-1} \le V(A_r) < r^{n-1}$
● + Identical treatment of positive and negative numbers in arithmetic
circuits, e.g., adders; unique representation of zero
● - Asymmetrical range

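For the binary case (r = 2) the three value formulas above reduce to a few lines. The sketch below is an illustration under that assumption; the function names and the bit-string input format (most significant bit first) are hypothetical.

```python
def sign_magnitude(bits):
    """V(A_SM) = (-1)^a_{n-1} * sum_{i=0}^{n-2} a_i * 2^i."""
    magnitude = int(bits[1:], 2) if len(bits) > 1 else 0
    return -magnitude if bits[0] == "1" else magnitude


def ones_complement(bits):
    """V(A_{r-1}) = -a_{n-1} * (2^n - 1) + sum_{i=0}^{n-1} a_i * 2^i."""
    n = len(bits)
    return int(bits, 2) - int(bits[0]) * (2 ** n - 1)


def twos_complement(bits):
    """V(A_r) = -a_{n-1} * 2^n + sum_{i=0}^{n-1} a_i * 2^i."""
    n = len(bits)
    return int(bits, 2) - int(bits[0]) * 2 ** n


print(sign_magnitude("1011"))   # -3
print(ones_complement("1010"))  # -5 (bitwise inversion of 0101)
print(twos_complement("1011"))  # -5 (inversion of 0101 plus 1)
print(sign_magnitude("1000"), ones_complement("1111"))  # both forms of "negative zero" -> 0 0
```

The last line shows the double representation of zero in SM and 1's complement, while `twos_complement("1000")` gives -8, the extra value that makes the 2's complement range asymmetrical.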


1.2 Signed Number Representations
● SNR #4: Redundant Representations (RR)
o Allow multiple (redundant) representations for the same number
value V(A)
o Also true for SM and (r-1)-complement due to the double zero
representation, but typically RR means the following:

● RR #1: Signed Digit Representation (SD)

o In SD numbers each digit has its own sign → one extra bit per digit

$a_i \in \{-\alpha, \ldots, -1, 0, 1, \ldots, \beta\}$ with $\left\lceil \frac{r-1}{2} \right\rceil \le \alpha, \beta \le r-1$

o α and β must cover at least half of the interval defined by the radix

$A_{SD} = a_{n-1} a_{n-2} \ldots a_1 a_0 \qquad V(A_{SD}) = \sum_{i=0}^{n-1} a_i \cdot r^i$



1.2 Signed Number Representations
● The SD number system is symmetrical for α = β, else asymmetrical
● Maximum or minimum redundancy for a symmetrical SD number system:
o Maximum redundancy → $\alpha = r - 1$
o Minimum redundancy → $\alpha = \left\lceil \frac{r-1}{2} \right\rceil$
● Examples:

Radix r   Digit values a_i
2         {-1, 0, 1}
3         {-2, -1, 0, 1}
          {-1, 0, 1, 2}
          {-2, -1, 0, 1, 2}
4         {-2, -1, 0, 1, 2} → minimum redundancy
          {-3, -2, -1, 0, 1} → not allowed! α, β bounds violated
          {-3, -2, -1, 0, 1, 2, 3} → maximum redundancy

● Only SD numbers with r = 2 (redundant binary (RB) numbers) are
considered in the following sections → $a_i \in \{-1, 0, 1\}$

1.2 Signed Number Representations
● An SD number with n digits $a_i \in \{-1, 0, 1\}$ has $3^n$ different
representations, but only $2^{n+1} - 1$ different values can be represented
o Example: $-3_{10} = 0\bar{1}\bar{1} = \bar{1}01 = \bar{1}1\bar{1}$

● Question: Which RB representation contains the smallest number of
non-zeros ('1' or '-1')?
o Answer: use arithmetic conversions of non-zero bit-strings

▪ Example: $\ldots 001111 \ldots 111000 \ldots \;\rightarrow\; \ldots 010000 \ldots 00\bar{1}000 \ldots$

$V(A^+) = 2^{i+k-1} + 2^{i+k-2} + \cdots + 2^{i+1} + 2^i = 2^{i+k} - 2^i$

$V(A^-) = -2^{i+k-1} + 2^{i+k-2} + \cdots + 2^{i+1} + 2^i = -2^i$

● Such RB representations are called Canonical Signed Digits (CSD) and the
conversion strategy is CSD-Recoding



1.2 Signed Number Representations
● Definition: A CSD recoded number is an n digit SD number that has a
minimum number of non-zeros ('1' and '-1') and no adjacent non-zero
digits

$\sum_{i=0}^{n-1} |a_i| \overset{!}{=} \min$ with $a_i \cdot a_{i-1} \overset{!}{=} 0$ for $1 \le i \le n-1$

● CSD-Recoding operates as an iterative and sequential algorithm. Step by
step the number is parsed from the least to the most significant digit/bit
("right to left") to detect strings of adjacent non-zeros, which are
converted immediately. The algorithm terminates once the
condition $a_i \cdot a_{i-1} = 0$ is met!

$366_{10}$ = 0001 0110 1110
      = $0001\ 0111\ 00\bar{1}0$
      = $0001\ 100\bar{1}\ 00\bar{1}0$
CSD   = $0010\ \bar{1}00\bar{1}\ 00\bar{1}0$

$-213_{10}$ = $\bar{1}111\ 0010\ 1011$
      = $\bar{1}111\ 0010\ 110\bar{1}$
      = $\bar{1}111\ 0011\ 0\bar{1}0\bar{1}$
      = $\bar{1}111\ 010\bar{1}\ 0\bar{1}0\bar{1}$
CSD   = $000\bar{1}\ 010\bar{1}\ 0\bar{1}0\bar{1}$

1.2 Signed Number Representations
● Lookup-table for CSD-Recoding
o possible -1 or 1 carries from lower positions must be considered
(c_{i+1} → c_i in next step)

Binary Number        CSD recoded SD
a_{i+1}  a_i  c_i    a*_i  c_{i+1}   Comment
0        0    0      0     0         String of zeros
0        1    0      1     0         Singular non-zero
1        0    0      0     0         String of zeros
1        1    0      -1    1         Begin of non-zero string
0        0    1      1     0         End of non-zero string
0        1    1      0     1         String of non-zeros
1        0    1      -1    1         Singular zero
1        1    1      0     1         String of non-zeros

o CSD recoding yields the minimum # of non-zeros:
minimum 0 | average ~ n/3 | maximum (n+1)/2

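The lookup table translates directly into a right-to-left loop. The following is a sketch under the assumption that the input is a non-negative integer that leaves at least one leading zero in the n-digit word (so the final carry is absorbed); `csd_recode` and `sd_value` are hypothetical names.

```python
# (a_{i+1}, a_i, c_i) -> (a*_i, c_{i+1}), copied from the lookup table
CSD_TABLE = {
    (0, 0, 0): (0, 0), (0, 1, 0): (1, 0), (1, 0, 0): (0, 0), (1, 1, 0): (-1, 1),
    (0, 0, 1): (1, 0), (0, 1, 1): (0, 1), (1, 0, 1): (-1, 1), (1, 1, 1): (0, 1),
}


def csd_recode(value, n):
    """Recode a non-negative integer into n CSD digits (most significant first)."""
    if not 0 <= value < 2 ** (n - 1):
        raise ValueError("value needs a leading zero within n digits")
    bits = [(value >> i) & 1 for i in range(n + 1)]  # LSB first, bits[n] == 0
    digits, carry = [], 0
    for i in range(n):  # sequential scan from LSD to MSD
        digit, carry = CSD_TABLE[(bits[i + 1], bits[i], carry)]
        digits.append(digit)
    return digits[::-1]


def sd_value(digits):
    """Value of a radix-2 signed digit string given most significant digit first."""
    return sum(d * 2 ** k for k, d in enumerate(reversed(digits)))


print(csd_recode(366, 12))  # [0, 0, 1, 0, -1, 0, 0, -1, 0, 0, -1, 0]
```

The printed digits correspond to the CSD form 0010 -100-1 00-10 of 366 from the worked example.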


1.2 Signed Number Representations
● The number-dependent, variable timing and the sequential nature of CSD
recoding prohibit its efficient application at run-time. But it is excellent
for the recoding of constant values or coefficients at design-time.
o Each eliminated non-zero saves hardware and speeds up specific
arithmetic circuits (e.g. multipliers)
● Alternatively, the parallel algorithms Booth and modified Booth work
faster for non-zero recoding at run-time.
o However, the Booth algorithm does not find the minimal form in each
case (see example)
o Isolated non-zeros "010" are not considered by this version

Booth recoding table (a_{-1} = 0):

a_i  a_{i-1}  a*_i  Comment
0    0        0     String of zeros
1    1        0     String of non-zeros
1    0        -1    Begin of non-zero string
0    1        1     End of non-zero string

Example:
Binary Number:     0 0 1 0 1 0 1 0 1 (0)
Booth recoded SD:  0 1 -1 1 -1 1 -1 1 -1

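Each row of the Booth table equals a*_i = a_{i-1} - a_i, so every output digit depends on only two neighbouring input bits and all digits can be produced in parallel. A minimal sketch, with the hypothetical name `booth_recode` and most-significant-bit-first lists as an assumed layout:

```python
def booth_recode(bits):
    """Booth recoding a*_i = a_{i-1} - a_i with a_{-1} = 0 (bits given MSB first)."""
    lsb_first = bits[::-1]
    previous = 0  # a_{-1}
    recoded = []
    for a in lsb_first:
        recoded.append(previous - a)
        previous = a
    return recoded[::-1]


def sd_value(digits):
    """Value of a radix-2 signed digit string given most significant digit first."""
    return sum(d * 2 ** k for k, d in enumerate(reversed(digits)))


print(booth_recode([0, 0, 1, 0, 1, 0, 1, 0, 1]))  # [0, 1, -1, 1, -1, 1, -1, 1, -1]
```

The input 001010101 (85) has four non-zeros, while its Booth recoded form has eight, which illustrates why isolated non-zeros are the weak spot of this version.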
1.2 Signed Number Representations
● Modified Booth improves on the standard Booth algorithm by overlapped
bit scanning of 3 bit strings
o By considering isolated non-zeros "010" the maximum number of
non-zeros after conversion is n/2 (for even numbers of n)

Modified Booth recoded SD ( i = 1,3,5,… )

Binary Number               r=2              r=4
a_i  a_{i-1}  a_{i-2}       a*_i  a*_{i-1}   digit   Comment
0    0        0             0     0          0       String of zeros
0    1        0             0     1          1       Single non-zero
1    0        0             -1    0          -2      Begin of non-zero string
1    1        0             0     -1         -1      Begin of non-zero string
0    0        1             0     1          1       End of non-zero string
0    1        1             1     0          2       End of non-zero string
1    0        1             0     -1         -1      Single zero
1    1        1             0     0          0       String of non-zeros



1.2 Signed Number Representations
● Example: Modified Booth Recoding for a 12 digit number
o Result is no CSD, but acceptable!

366_10 = 0 0 0 1 0 1 1 0 1 1 1 0 (0)      n = 12

r=2 → 0 0 0 1 1 0 0 -1 0 0 -1 0
r=4 → 0 1 2 -1 0 -2

● Comparison of recoding methods

Method          Algorithm    # of non-zeros              Comment
                             Min   Average   Max
CSD             Sequential   0     ~ n/3     (n+1)/2     Yields minimum # of non-zeros → for constant values at design-time
Booth           Parallel     0     ?~ n/3    n           1's Complement to Signed Digit
Modified Booth  Parallel     0     ?~ n/3    (n+1)/2     for run-time recoding (multiplier) → potential to save half of the chip area

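The modified Booth table can be condensed into the digit formula -2·a_i + a_{i-1} + a_{i-2} for i = 1, 3, 5, … The sketch below applies it to the 12 digit example; the function names and the MSB-first list layout are assumptions for illustration.

```python
# radix-4 digit -> pair of radix-2 signed digits (a*_i, a*_{i-1}), as in the table
RADIX2_PAIR = {2: (1, 0), 1: (0, 1), 0: (0, 0), -1: (0, -1), -2: (-1, 0)}


def modified_booth_recode(bits):
    """Radix-4 modified Booth digits (MSD first) of an even-length bit list (MSB first)."""
    if len(bits) % 2:
        raise ValueError("number of bits must be even")
    lsb = bits[::-1]
    digits = []
    for i in range(1, len(lsb), 2):  # overlapped scanning of (a_i, a_{i-1}, a_{i-2})
        a_im2 = lsb[i - 2] if i >= 2 else 0  # a_{-1} = 0
        digits.append(-2 * lsb[i] + lsb[i - 1] + a_im2)
    return digits[::-1]


def to_radix2(radix4_digits):
    """Expand radix-4 digits into the equivalent radix-2 signed digit string."""
    return [d for digit in radix4_digits for d in RADIX2_PAIR[digit]]


bits_366 = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0]
print(modified_booth_recode(bits_366))             # [0, 1, 2, -1, 0, -2]
print(to_radix2(modified_booth_recode(bits_366)))  # [0, 0, 0, 1, 1, 0, 0, -1, 0, 0, -1, 0]
```

The radix-2 result contains the adjacent non-zeros "1 1", so it is not a CSD form, in line with the remark in the example.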


1.2 Signed Number Representations
● Conversion from 2's complement to SD numbers (A_r → A_SD)

$A_r = a_{n-1} a_{n-2} \ldots a_1 a_0 \;\rightarrow\; A_{SD} = (-a_{n-1})\, a_{n-2} \ldots a_1 a_0$

o Fast: can be done in parallel within one gate delay

o $5_{10} = 0101_2 \rightarrow 0101_{SD}$
o $-5_{10} = 1011_2 \rightarrow \bar{1}011_{SD}$

● Conversion from SD numbers to 2's complement (A_SD → A_r)

o Split the SD number into a positive and a negative part: A_SD → D+ and D-
o $-13_{10} = 0\bar{1}01\bar{1}1_{SD}$ → D+ = 000101 and D- = 010010
o 2's complement number → A_r = D+ - D-
o Slow: requires one n-bit addition

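Both conversion directions can be tried out with a few lines. This is a sketch with hypothetical function names; digits are plain Python integers in MSB-first lists, and the subtraction D+ - D- stands in for the n-bit addition mentioned above.

```python
def twos_complement_to_sd(bits):
    """A_r -> A_SD: only the sign digit changes, from a_{n-1} to -a_{n-1}."""
    return [-bits[0], *bits[1:]]


def sd_to_value(digits):
    """A_SD -> A_r: split into D+ and D-, then compute D+ - D-."""
    d_plus = int("".join("1" if d == 1 else "0" for d in digits), 2)
    d_minus = int("".join("1" if d == -1 else "0" for d in digits), 2)
    return d_plus - d_minus, d_plus, d_minus


print(twos_complement_to_sd([1, 0, 1, 1]))  # [-1, 0, 1, 1]
print(sd_to_value([0, -1, 0, 1, -1, 1]))    # (-13, 5, 18): D+ = 000101, D- = 010010
```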
1.2 Signed Number Representations
● SD numbers in binary hardware
o Three possible values per digit $a_i \in \{-1, 0, 1\}$ require two bits for
each digit → Hardware costs (wires, registers, ALUs) double
o Two bits per digit allow four different encodings, but only two of them are
typically used → Sign Value & Negative Positive

          Sign Value (SV)     Negative Positive (NP)
a_i       S    V              N    P
-1        1    1              1    0
0         0    0              0    0
1         0    1              0    1

comment   SV: intuitive because of its Sign Magnitude representation
          NP: easier A_SD→A_r conversion, as D+ = P_{n-1}…P_0 and D- = N_{n-1}…N_0



1.2 Signed Number Representations
● RR #2: Carry-Save Representation (CS)
o Carry-Save numbers originate from hardware structures of full
adders (FAs) and half adders (HAs)

[Figure: chain of full adders with inputs a_i, b_i, carries c_0 … c_n and sum outputs s_{n-1} … s_0, shown above a row of independent half adders with inputs a_i, b_i and outputs s'_i, c'_{i+1} plus carry input c_0]

o Digit a_i represents a tuple: $a_i = (s'_i,\, c'_{i+1})$ with value $2 \cdot c'_{i+1} + s'_i$
o CS numbers are stored as a combination of a carry vector and an intermediate
sum vector

1.2 Signed Number Representations
● Additions with CS numbers only have a critical path of one half adder,
but require 2 bits per digit for storage and communication (wires)

$A_{CS} = a_{n-1} a_{n-2} \ldots a_1 a_0 \qquad V(A_{CS}) = \sum_{i=0}^{n-1} a_i \cdot r^i$

● $V(A_{CS}) = (c'_n\, c'_{n-1}\, c'_{n-2} \ldots c'_1\, c_0) + (s'_{n-1}\, s'_{n-2} \ldots s'_1\, s'_0)$

● In general, there is no difference between CS and SD numbers

o CS numbers result from the outputs of a half adder (HA)
o SD numbers have their origin in the theory of number representations

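The idea of keeping a number as a pair of sum and carry vectors can be simulated on Python integers. Note the assumption in this sketch: it adds a third operand to a CS pair, which needs full-adder (3:2) logic per bit position rather than the half-adder row of the figure. The names `carry_save_add` and `accumulate` are hypothetical.

```python
def carry_save_add(x, y, z):
    """Bitwise 3:2 compression of three non-negative integers.

    Every bit position works independently (no carry propagation);
    the carry vector is shifted left because carries have doubled weight.
    """
    sum_vector = x ^ y ^ z
    carry_vector = ((x & y) | (x & z) | (y & z)) << 1
    return sum_vector, carry_vector


def accumulate(values):
    """Add many operands carry-free and merge the two vectors only once at the end."""
    sum_vector, carry_vector = 0, 0
    for value in values:
        sum_vector, carry_vector = carry_save_add(sum_vector, carry_vector, value)
    return sum_vector + carry_vector  # single carry-propagating addition


print(carry_save_add(5, 3, 6))     # (0, 14): 0 + 14 == 5 + 3 + 6
print(accumulate([3, 5, 7, 11]))   # 26
```

Only the final merge is a carry-propagating addition; all intermediate steps have a delay that does not depend on the word length.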


1.2 Signed Number Representations
● Why should RRs be applied, or when is it worthwhile to use them?

Pros
- Carry-free and thus faster addition / subtraction (see adder section)
- Arithmetic algorithms based on adders (nearly all) benefit from this

Cons
- More resources
- Comparison operations (≥, ≤, <, =, >) are slow due to A_SD→A_r conversion
- A_SD→A_r conversion is slow due to the adder

Typical processing flow:
A_r → A_SD                   T ~ O(1)
Operation 1 … Operation k    T_op,i ≠ f(n)   (carry-free operations!)
A_SD → A_r                   T_add ~ O(log2(n))

1.3 Rounding
● Rounding trims numbers into formats with fewer digits
o Examples
▪ Two n bit numbers are multiplied and the result is a number with
2n bits, but hardware only captures m < 2n bits
▪ Rounding after a right shift of an integer by one digit

● Rounding methods can be classified by the following criteria:

o Accuracy of the final results (or information loss by rounding)
o Numerical error characteristics of the rounding method
o Cost/effort/delay to perform the rounding

● Assume
o Given: $A = a_{n-1} a_{n-2} \ldots a_1 a_0 \,.\, a_{-1} \ldots a_{-d}$ → cut d bits
o Rounded: $B = b_{n-1} b_{n-2} \ldots b_1 b_0 = A + \varepsilon \Rightarrow \varepsilon = B - A$
o Goal: Minimize the rounding error ε



1.3 Rounding
● Rounding Method #1: Truncation
o Step 1: d least significant bits are cut off from A
o Rounding result → $B_{trunc} = a_{n-1} a_{n-2} \ldots a_1 a_0$
o Minimum error → $\varepsilon_{min} = 0.00000_2$
o Maximum error → $\varepsilon_{max} = -(1 - 2^{-d}) = -0.11111_2$
o Average error → $\varepsilon_{avg} = \frac{\varepsilon_{max} - \varepsilon_{min}}{2} = -\frac{1}{2} + \frac{1}{2^{d+1}} = -0.10000\bar{1}_2$ (the $\bar{1}$ sits at position -(d+1))
o Asymmetrical bias

[Figure: staircase plot of the truncated value over A, both axes 1 to 5]

1.3 Rounding
● Rounding Method #2: Round-to-Nearest
o Step 1: Addition of 0.5_10 to A ⇒ $A' = A + 0.5_{10} = A + 0.1_2$
o Step 2: d least significant bits are cut off from A' to fit B
o Can often be incorporated effortlessly into the previous operation
o Resulting effect is an alternating rounding to higher and lower numbers
o Rounding result → $B_{round} = a'_{n-1} a'_{n-2} \ldots a'_1 a'_0$
o Minimum error → $\varepsilon_{min} = 0.00000$ (for A = 0.0 → B = 0.0)
o Maximum error → $\varepsilon_{max} = 2^{-1} = 0.1_2$ (for A = 0.1 → B = 1.0)
o Average error → $\varepsilon_{avg} = \frac{\varepsilon_{max} - \varepsilon_{min}}{2} = \frac{2^{-1}}{2} = 2^{-2} = 0.01_2$
o Smaller asymmetrical bias (due to always rounding up A = 0.1)

[Figure: staircase plot of the rounded value B over A, both axes 1 to 5]

1.3 Rounding
● Rounding Method #3: Round-to-Nearest-Even
o Idea: alternate rounding up and down to the nearest even number
o Step 1: Addition of 0.5_10 to A ⇒ $A' = A + 0.5_{10} = A + 0.1_2$
o Step 2: If the d least significant bits of A' are zero → cut them off from A' to fit B and
set $a'_0$ to zero, otherwise proceed with Round-to-Nearest
o Yields an average bias of zero!

$B_{round,even} = \begin{cases} B_{round} & \text{if } a'_{-1} \ldots a'_{-d} \ne 0.000\ldots \\ a'_{n-1} a'_{n-2} \ldots a'_1\, 0 & \text{else} \end{cases}$

o → bias = 0
o Symmetrical error and bias-free, mandatory in IEEE Floating Point

[Figure: staircase plot of the rounded value over A, both axes 1 to 5]

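The bias of the three rounding methods can be compared by averaging ε = B - A over all inputs with d fractional bits. The sketch below uses exact fractions; the function names and the choice of four integer parts (two even, two odd) as sample range are assumptions made for illustration.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def truncate(a):
    """Method #1: cut off the fractional bits."""
    return math.floor(a)


def round_to_nearest(a):
    """Method #2: add 0.5, then cut off the fractional bits."""
    return math.floor(a + HALF)


def round_to_nearest_even(a):
    """Method #3: as method #2, but clear the LSB if the cut-off bits of A' are all zero."""
    a_prime = a + HALF
    b = math.floor(a_prime)
    if a_prime == b:  # tie: A was exactly halfway between two integers
        b -= b % 2    # set a'_0 to zero -> even result
    return b


def mean_error(rounder, d, integer_range=4):
    """Average of B - A over all A in [0, integer_range) with d fractional bits."""
    step = Fraction(1, 2 ** d)
    samples = [k * step for k in range(integer_range * 2 ** d)]
    return sum(rounder(a) - a for a in samples) / len(samples)


for method in (truncate, round_to_nearest, round_to_nearest_even):
    print(method.__name__, mean_error(method, d=3))
# truncate -7/16, round_to_nearest 1/16, round_to_nearest_even 0
```

For d = 3 truncation shows the bias -1/2 + 1/2^(d+1) = -7/16, round-to-nearest keeps a small positive bias, and round-to-nearest-even is bias-free on this sample range.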
1.4 Overflow
● Overflow occurs if numbers exceed the available word length in datapaths

[Figure: the bit patterns 000...0, 011...1, 100...0 and 111...1 placed on a value axis marked -2^{n-1}, 0, 2^{n-1}, 2^n for unsigned, 2's complement, 1's complement and sign magnitude numbers]



1.4 Overflow
● Overflow in 2's complement numbers
o Range → $-2^{n-1} \le V(A_r) < 2^{n-1}$
o Overflow in addition of two numbers
▪ Reason: Carry out from the sign digit is discarded
▪ Case 1: Two positive summands A and B → negative sum S
▪ $\bar{a}_{n-1} \wedge \bar{b}_{n-1} \wedge s_{n-1} \Rightarrow c_{in} = 1,\ c_{out} = 0$
▪ Case 2: Two negative summands A and B → positive sum S
▪ $a_{n-1} \wedge b_{n-1} \wedge \bar{s}_{n-1} \Rightarrow c_{in} = 0,\ c_{out} = 1$
▪ In general, overflow occurs for $c_{in} \ne c_{out}$ at the sign digit (for add & sub)

● In non-redundant number systems overflows are definitely detectable

● Possible actions after overflow detection
o Emergency stop
o Error handling
o Saturation to maximum (01111) or minimum (10000) number

[Figure: full adder at the sign digit with inputs a_{n-1}, b_{n-1}, c_in and outputs c_out, s_{n-1}, followed by saturation logic that produces the overflow flag and s*_{n-1}]

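The rule "overflow if c_in ≠ c_out at the sign digit" and the saturation option can be modelled on n-bit words. A sketch with hypothetical function names; operands are ordinary Python integers that are assumed to fit into n bits.

```python
def add_twos_complement(a, b, n):
    """Add two n-bit 2's complement numbers; return (wrapped sum, overflow flag)."""
    mask = (1 << n) - 1
    low_mask = (1 << (n - 1)) - 1
    a_u, b_u = a & mask, b & mask  # n-bit patterns of the operands
    c_in = ((a_u & low_mask) + (b_u & low_mask)) >> (n - 1)  # carry into the sign digit
    total = a_u + b_u
    c_out = total >> n                                       # carry out of the sign digit
    s_u = total & mask                                       # c_out is discarded
    s = s_u - (1 << n) if s_u >> (n - 1) else s_u
    return s, c_in != c_out


def saturating_add(a, b, n):
    """Replace an overflowed sum by the maximum (011..1) or minimum (100..0) number."""
    s, overflow = add_twos_complement(a, b, n)
    if not overflow:
        return s
    return (1 << (n - 1)) - 1 if a >= 0 else -(1 << (n - 1))


print(add_twos_complement(7, 9, 5))    # (-16, True): two positive summands, negative sum
print(add_twos_complement(-8, -9, 5))  # (15, True): two negative summands, positive sum
print(saturating_add(7, 9, 5), saturating_add(-8, -9, 5))  # 15 -16
```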
1.4 Overflow
● Overflow in Carry-Save representations
o In redundant number systems two types of overflow exist
▪ True and pseudo overflow
o Example: 0.5_10 + (-0.5_10) + 0 = 0 !!!

-2^0   2^-1   2^-2
 0      1      0      0.5_10
 1      1      0      -0.5_10
 0      0      0      0
 0      1      0      carry vector = -1_10 (the carry generated in column 2^-1 has the weight of column -2^0, i.e. 1.00)
 1      0      0      sum vector = -1_10

o The wrong intermediate result -2_10 in CS representation would yield the
correct value 0 if converted to 2's complement via vector merging
addition (VMA) of carry and sum vector
o Test: 1.00 + 1.00 = 10.00 (leading digit dropped) → Result = 0.00 = 0_10



1.4 Overflow
o Wrong results are possible if other operations are executed on the
intermediate carry and sum vector
o Example: (0.5_10 + (-0.5_10) + 0) ∙ 0.5_10 = 0
▪ Multiplication with 0.5 equals a right shift with sign extension of the carry and
sum vector
▪ Carry: 1.00 : 2 → 1.10 = -0.5_10
▪ Sum: 1.00 : 2 → 1.10 = -0.5_10
▪ VMA: 11.00 = -1_10 ≠ 0_10
o The error becomes obvious after conversion to a non-redundant number.
However, the correct result 0 would fit into the given word length

o Those pseudo overflows are detectable and correctable as
follows



1.4 Overflow
o Pseudo overflow correction for CS numbers:
▪ Given: $c_1\, c_0 \,.\, c_{-1} c_{-2} \ldots$
         $s_0 \,.\, s_{-1} s_{-2} \ldots s_{-(m-1)}$
▪ Modify to: $c'_0 \,.\, c_{-1} c_{-2} \ldots$
             $s'_0 \,.\, s_{-1} s_{-2} \ldots$
using $c'_0 = c_1$ and $s'_0 = \begin{cases} \bar{s}_0 & \text{if } c_1 \ne c_0 \\ s_0 & \text{else} \end{cases}$ ⇒ $s'_0 = s_0 \oplus c_1 \oplus c_0$
▪ XOR gates can be easily integrated as part of the MSD/MSB adder
circuit at low hardware overhead without speed penalty!
o The method works as long as the converted 2's complement result fits
into the given word length
o Example with pseudo overflow correction:
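The 0.5 + (-0.5) + 0 example can be replayed in a small simulation. This is only a sketch under stated assumptions: words are 3 bit wide (weights -2^0, 2^-1, 2^-2, so integers count in units of 2^-2), the carry vector keeps one extra bit to expose c_1, and all function names are hypothetical.

```python
N = 3                 # word length: weights -2^0, 2^-1, 2^-2
MASK = (1 << N) - 1
MSB = 1 << (N - 1)


def to_signed(word):
    """Interpret an N-bit pattern as a 2's complement number (in units of 2^-2)."""
    return word - (1 << N) if word & MSB else word


def cs_add(x, y, z):
    """Carry-save addition; the carry vector has N+1 bits so that bit N holds c_1."""
    sum_vector = (x ^ y ^ z) & MASK
    carry_full = ((x & y) | (x & z) | (y & z)) << 1
    return sum_vector, carry_full


def correct(sum_vector, carry_full):
    """Pseudo overflow correction: c'_0 = c_1 and s'_0 = s_0 xor c_1 xor c_0."""
    c1 = (carry_full >> N) & 1
    c0 = (carry_full >> (N - 1)) & 1
    s0 = (sum_vector >> (N - 1)) & 1
    new_carry = (carry_full & (MSB - 1)) | (c1 << (N - 1))
    new_sum = (sum_vector & (MSB - 1)) | ((s0 ^ c1 ^ c0) << (N - 1))
    return new_sum, new_carry


def halve(word):
    """Right shift by one digit with sign extension (multiplication with 0.5)."""
    return (word >> 1) | (word & MSB)


def vma(sum_vector, carry_vector):
    """Vector merging addition with the carry out of the word dropped."""
    return to_signed((sum_vector + carry_vector) & MASK)


s, c_full = cs_add(0b010, 0b110, 0b000)       # 0.5 + (-0.5) + 0
print(vma(halve(s), halve(c_full & MASK)))    # -4, i.e. -1.0: pseudo overflow
s_corr, c_corr = correct(s, c_full)
print(vma(halve(s_corr), halve(c_corr)))      # 0: correct result after correction
```

The correction leaves the VMA result of the unshifted vectors unchanged, because only the two leading bits are rewritten and their XOR stays the same.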

1.4 Overflow
● In general, a reduction of leading digits of CS numbers can be
achieved as follows, provided that the CS number fits into the corresponding
2's complement number according to the condition $-1 \le C + S \le 1 - 2^{-(n-1)}$:
▪ Given: $c_n \ldots c_1 c_0 \,.\, c_{-1} c_{-2} \ldots$
         $s_n \ldots s_1 s_0 \,.\, s_{-1} s_{-2} \ldots$
▪ Modify to: $c'_0 \,.\, c_{-1} c_{-2} \ldots$
             $s'_0 \,.\, s_{-1} s_{-2} \ldots$
using $s'_0 = \begin{cases} \bar{s}_0 & \text{if } s_1 \ne c_1 \\ s_0 & \text{else} \end{cases}$ and $c'_0 = \begin{cases} \bar{c}_0 & \text{if } s_1 \ne c_1 \\ c_0 & \text{else} \end{cases}$

● Pseudo overflow correction needs fewer digits and less chip area than
uncorrected formats



1.4 Overflow
● Overflow in SD numbers
o Similar to the CS case → pseudo and real overflows
o Thereby, the overflow behavior depends on the MSD sum bit $s_{n-1}$
and the intermediate carry bit $d_n$
o Analysis for a possible correction of $s_{n-1}$ as follows (N denotes the digit -1, X a don't care):

d_n   s_{n-1}   Overflow Type   s'_{n-1}
1     N         pseudo          1
1     0         potential
1     1         real
N     N         real
N     0         potential
N     1         pseudo          N
0     X         none            s_{n-1}

o Pseudo overflow is correctable at the MSD without performance impact
o Real overflow must be avoided through modification at the
system or algorithm level
o Potential overflow would require an inspection of all lower digits
→ Hardware costs increase
o Potential overflow is avoidable via range limitation to $< 2^{n-2}$



1.4 Overflow
● General options/mechanisms for handling real overflow
o Analysis to identify minimum/maximum intermediate and
final values
o Corner case simulation of the system to check for sufficient word
lengths for any occurring values
o Thus estimate a lower bound on the word length
o For insufficient word lengths or if too expensive:
▪ Reduce accuracy → fewer bits after the radix point
▪ Test whether the application allows saturation
▪ Detect real overflow and handle it



1.5 Basic Operations
● Wrap-up of some basic operations on data and numbers

Operation   Number type             Direction   Result
Shift       unsigned                left        $a_{n-2} \ldots a_1 a_0\, 0$
                                    right       $0\, a_{n-1} a_{n-2} \ldots a_1$
            signed 2's complement   left        $a_{n-1} a_{n-3} \ldots a_0\, 0$
                                    right       $a_{n-1} a_{n-1} \ldots a_1$
Rotate                              left        $a_{n-2} \ldots a_1 a_0\, a_{n-1}$
                                    right       $a_0\, a_{n-1} a_{n-2} \ldots a_1$
Extend      unsigned                left        $0\, a_{n-1} a_{n-2} \ldots a_1 a_0$
                                    right       $a_{n-1} a_{n-2} \ldots a_1 a_0\, 0$
            signed 2's complement   left        $a_{n-1} a_{n-1} a_{n-2} \ldots a_1 a_0$
                                    right       $a_{n-1} a_{n-2} \ldots a_1 a_0\, 0$
Saturate    unsigned                            $a_{n-1} \ldots a_{n-1} a_{n-1}$
            signed 2's complement               $a_{n-1} \bar{a}_{n-1} \ldots \bar{a}_{n-1}$

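The shift, rotate and extend rows of the table can be mirrored with integer bit operations on n-bit words. A sketch with hypothetical function names; each word is assumed to be a non-negative integer holding the n-bit pattern.

```python
def shift_left_unsigned(a, n):
    """a_{n-2} ... a_0 0: drop the MSB, append a zero."""
    return (a << 1) & ((1 << n) - 1)


def shift_right_unsigned(a, n):
    """0 a_{n-1} ... a_1: drop the LSB, prepend a zero."""
    return a >> 1


def shift_left_signed(a, n):
    """a_{n-1} a_{n-3} ... a_0 0: keep the sign bit, drop a_{n-2}."""
    msb = 1 << (n - 1)
    return (a & msb) | ((a << 1) & (msb - 1))


def shift_right_signed(a, n):
    """a_{n-1} a_{n-1} ... a_1: replicate the sign bit."""
    return (a >> 1) | (a & (1 << (n - 1)))


def rotate_left(a, n):
    """a_{n-2} ... a_0 a_{n-1}."""
    return ((a << 1) & ((1 << n) - 1)) | (a >> (n - 1))


def rotate_right(a, n):
    """a_0 a_{n-1} ... a_1."""
    return (a >> 1) | ((a & 1) << (n - 1))


def sign_extend(a, n, m):
    """Extend an n-bit 2's complement word to m bits by replicating a_{n-1}."""
    if a >> (n - 1):
        return a | (((1 << (m - n)) - 1) << n)
    return a


word = 0b1011
print(format(shift_left_signed(word, 4), "04b"))   # 1110
print(format(shift_right_signed(word, 4), "04b"))  # 1101
print(format(rotate_left(word, 4), "04b"))         # 0111
print(format(sign_extend(word, 4, 8), "08b"))      # 11111011
```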


1.6 Cost/Performance Estimation Basics
● Some "rule of thumb" estimations for delay and area of typical functions and
algorithm structures in arithmetic circuits
o Naming conventions:
▪ A → Area
▪ T → Cycle time/delay
▪ L → Latency
▪ # → Number of cycles
o Basic assumptions for gates:
▪ Inverter, Buffer → A = 0, T = 0 (negligible)
▪ Simple 2-input gate → A = 1, T = 1 (AND, NAND, OR, NOR)
▪ Special 2-input gate → A = 2, T = 2 (XOR, XNOR)
▪ Complex m-input gate → $A = m - 1$, $T = \log_2(m)$ (gate tree)
▪ Wiring costs as well as wiring area are not considered (high abstraction)
o Basic assumptions for circuit functions:
▪ Up to n inputs → $a_i = \{a_{n-1}, a_{n-2}, \ldots, a_1, a_0\}$
▪ Up to n outputs → $z_i = \{z_{n-1}, z_{n-2}, \ldots, z_1, z_0\}$
▪ Blue dots represent functions that generate outputs
$z_i = f(a_{n-1}, a_{n-2}, \ldots, a_1, a_0)$



1.6 Cost/Performance Estimation Basics
o Non-recursive functions
▪ $z_i = f(a_i, x)$ with $i = 0, 1, \ldots, n-1$ and $x$ = constant
▪ output z_i only depends on input a_i
▪ can be implemented as a fully parallel hardware structure
▪ $A = O(n)$ and $T = O(1)$

[Figure: inputs a_{n-1} … a_0, each feeding one independent function block with output z_{n-1} … z_0]



1.6 Cost/Performance Estimation Basics
o Recursive functions with single output
▪ Output depends on all inputs → $z_i = f(a_{n-1}, a_{n-2}, \ldots, a_1, a_0)$
▪ Case 1: f non-associative → $A = O(n)$ and $T = O(n)$ (serial structure)
▪ Case 2: f associative → $A = O(n)$ and $T = O(\log_2(n))$ (tree structure)

[Figure: Case 1 (non-associative): serial chain over inputs a_{n-1} … a_0 with output z_{n-1}; Case 2 (associative): tree over inputs a_3 … a_0 with output z_3]

1.6 Cost/Performance Estimation Basics
o Recursive functions with multiple outputs
▪ Prefix problem → $z_i = f(a_i, z_{i-1})$
▪ Case 1: f non-associative → $A = O(n)$ and $T = O(n)$ (serial)
▪ Case 2: f associative → $A = O(n^2)$ and $T = O(\log_2(n))$ (multi tree / serial)
▪ Case 3: f associative → $A = O(n \cdot \log_2(n))$ and $T = O(\log_2(n))$ (shared)

[Figure: Case 1 (non-associative): serial chain over inputs a_{n-1} … a_0 with outputs z_{n-1} … z_0; Case 2 (associative): one tree per output, all trees in parallel, inputs a_3 … a_0, outputs z_3 … z_0; Case 3 (associative): trees with shared intermediate results, inputs a_3 … a_0, outputs z_3 … z_0]

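The difference between the serial structure (Case 1) and a shared parallel structure (Case 3) can be illustrated by counting operations and levels. The sketch below is one possible shared scheme, a distance-doubling prefix computation chosen here for brevity and not necessarily the structure drawn on the slide; the function names are hypothetical, and the operation count stands in for area while the level count stands in for delay.

```python
import operator


def serial_prefix(values, f):
    """Case 1 style: z_i = f(z_{i-1}, a_i), one operation after the other."""
    out = [values[0]]
    for value in values[1:]:
        out.append(f(out[-1], value))
    operations = depth = len(values) - 1
    return out, operations, depth


def parallel_prefix(values, f):
    """Case 3 style: associative f, intermediate results shared between outputs."""
    z = list(values)
    n = len(z)
    operations = depth = 0
    distance = 1
    while distance < n:
        # all combinations of one level are independent and can run in parallel
        z = [z[i] if i < distance else f(z[i - distance], z[i]) for i in range(n)]
        operations += n - distance
        depth += 1
        distance *= 2
    return z, operations, depth


values = [1, 2, 3, 4, 5, 6, 7, 8]
print(serial_prefix(values, operator.add))    # ([1, 3, 6, 10, 15, 21, 28, 36], 7, 7)
print(parallel_prefix(values, operator.add))  # ([1, 3, 6, 10, 15, 21, 28, 36], 17, 3)
```

For n = 8 the shared structure needs 17 operations in 3 levels instead of 7 operations in 7 levels, which matches the trade-off of more area for logarithmic delay.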
