
in FPGAs
ECE 645: Lecture 3

Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design
Chapter 5, Basic Addition and Counting, Sections 5.1-5.5, pp. 75-85.

Spartan-3 Generation FPGA User Guide
Chapter 9, Using Carry and Arithmetic Logic
http://www.xilinx.com/support/documentation/spartan-3_user_guides.htm

Half-adder (HA)

Inputs x and y; outputs c (carry, weight 2) and s (sum, weight 1):

$x + y = (c\,s)_2$

x y | c s
0 0 | 0 0
0 1 | 0 1
1 0 | 0 1
1 1 | 1 0
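The half-adder relation x + y = (c s)_2 can be checked with a few lines of Python. This is only an illustrative sketch; the function name `half_adder` is an assumption and does not come from the slides.

```python
def half_adder(x: int, y: int) -> tuple[int, int]:
    """Return (c, s) for single-bit inputs so that x + y == 2*c + s."""
    c = x & y  # carry bit, weight 2
    s = x ^ y  # sum bit, weight 1
    return c, s


# Walk through the whole truth table and confirm the defining relation.
for x in (0, 1):
    for y in (0, 1):
        c, s = half_adder(x, y)
        assert x + y == 2 * c + s
        print(x, y, c, s)
```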
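Alternative implementations (1)

a) $c = xy$, $s = \bar{x}y + x\bar{y}$
b) $c = \overline{\bar{x} + \bar{y}}$

Alternative implementations (2)

c) $\bar{c} = \overline{xy}$, $s = x\bar{c} + y\bar{c} = \overline{\overline{x\bar{c}} \cdot \overline{y\bar{c}}}$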
Full-adder (FA)

Inputs x, y, and c_in; outputs c_out (carry, weight 2) and s (sum, weight 1):

$x + y + c_{in} = (c_{out}\,s)_2$

x y c_in | c_out s
0 0 0    | 0     0
0 0 1    | 0     1
0 1 0    | 0     1
0 1 1    | 1     0
1 0 0    | 0     1
1 0 1    | 1     0
1 1 0    | 1     0
1 1 1    | 1     1

Alternative implementations (1)
a)
$c_{out} = xy + c_{in}(x \oplus y)$

Alternative implementations (2)
b)
$s = x y c_{in} + \bar{x}\bar{y} c_{in} + \bar{x} y \bar{c}_{in} + x \bar{y} \bar{c}_{in}$
$c_{out} = xy + x c_{in} + y c_{in}$
Alternative implementations (3)
c)
x y | c_out   s
0 0 | 0       c_in
0 1 | c_in    NOT c_in
1 0 | c_in    NOT c_in
1 1 | 1       c_in
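The three full-adder variants (a), (b), and (c) should all yield the same (c_out, s) pair. The sketch below compares each one against plain integer addition over all eight input combinations. All function names are assumptions made for illustration, and the expressions follow the usual reading of the slide formulas with complements restored.

```python
from itertools import product


def fa_reference(x, y, cin):
    """Golden model: integer addition split into (c_out, s)."""
    total = x + y + cin
    return total >> 1, total & 1


def fa_two_half_adders(x, y, cin):
    """Variant (a): c_out = xy + c_in (x xor y)."""
    p = x ^ y  # sum output of the first half-adder
    g = x & y  # carry output of the first half-adder
    return g | (cin & p), p ^ cin


def fa_two_level(x, y, cin):
    """Variant (b): two-level AND-OR expressions."""
    nx, ny, nc = 1 - x, 1 - y, 1 - cin
    s = (x & y & cin) | (nx & ny & cin) | (nx & y & nc) | (x & ny & nc)
    cout = (x & y) | (x & cin) | (y & cin)
    return cout, s


def fa_mux_table(x, y, cin):
    """Variant (c): x and y select a function of c_in for each output."""
    table = {
        (0, 0): (0, cin),
        (0, 1): (cin, 1 - cin),
        (1, 0): (cin, 1 - cin),
        (1, 1): (1, cin),
    }
    return table[(x, y)]


for bits in product((0, 1), repeat=3):
    expected = fa_reference(*bits)
    assert fa_two_half_adders(*bits) == expected
    assert fa_two_level(*bits) == expected
    assert fa_mux_table(*bits) == expected
```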

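Alternative implementations (4)
Implementation used to generate fast carry logic in Xilinx FPGAs

[Figure: x and y (labeled A2, A1) drive an XOR gate that produces p; a 0/1 multiplexer controlled by p selects C_out = g when p = 0 and C_out = C_in when p = 1; the sum output is S.]

x y | c_out
0 0 | y
0 1 | c_in
1 0 | c_in
1 1 | y

$p = x \oplus y$, $g = y$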
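The multiplexer-based carry can be checked against ordinary addition. In this minimal sketch the name `mux_carry_full_adder` is an assumption, and the sum is assumed to be formed as p XOR c_in (the figure only labels the output S).

```python
from itertools import product


def mux_carry_full_adder(x, y, cin):
    """Full adder whose carry comes from a 2-to-1 multiplexer."""
    p = x ^ y                # propagate signal, drives the mux select
    g = y                    # generate value, used only when x == y
    cout = cin if p else g   # mux: p = 1 passes c_in, p = 0 passes g
    s = p ^ cin              # assumed second XOR producing the sum bit
    return cout, s


for x, y, cin in product((0, 1), repeat=3):
    cout, s = mux_carry_full_adder(x, y, cin)
    assert 2 * cout + s == x + y + cin
```

When x and y differ, the carry-out equals the carry-in; when they are equal, the carry-out equals either operand, which is why g = y is sufficient.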

Latency of a k-bit ripple-carry adder

$$T = T_{FA}(x, y \to c_{out}) + (k-2) \cdot T_{FA}(c_{in} \to c_{out}) + T_{FA}(c_{in} \to s)$$

$\text{Latency} \approx k \cdot T_{FA}$

$\text{Latency} \propto k$
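The sketch below models a ripple-carry adder as a chain of full adders and evaluates the latency formula. The function names and the delay values used in the test are assumptions chosen for illustration, and the latency expression assumes k >= 2.

```python
def full_adder(x, y, cin):
    """Return (c_out, s) for one bit position."""
    total = x + y + cin
    return total >> 1, total & 1


def to_bits(value, k):
    """k-bit representation, least significant bit first."""
    return [(value >> i) & 1 for i in range(k)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))


def ripple_carry_add(x_bits, y_bits, cin=0):
    """Add two LSB-first bit lists; the carry ripples from bit 0 upward."""
    carry = cin
    s_bits = []
    for x, y in zip(x_bits, y_bits):
        carry, s = full_adder(x, y, carry)
        s_bits.append(s)
    return s_bits, carry


def ripple_latency(k, t_xy_to_cout, t_cin_to_cout, t_cin_to_s):
    """Worst-case delay: first stage, k-2 middle stages, last-stage sum."""
    return t_xy_to_cout + (k - 2) * t_cin_to_cout + t_cin_to_s


s_bits, c_out = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s_bits), c_out)
```

With equal delays on all three full-adder paths, `ripple_latency` returns exactly k times that delay, matching the approximation Latency ≈ k · T_FA.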
Overflow for signed numbers (1)

Indication of overflow:
Positive + Positive = Negative
Negative + Negative = Positive

Formulas:
$$\text{Overflow}_{\text{2's complement}} = x_{k-1}\, y_{k-1}\, \bar{s}_{k-1} + \bar{x}_{k-1}\, \bar{y}_{k-1}\, s_{k-1} = c_k \oplus c_{k-1}$$
Overflow for signed numbers (2)

x_{k-1} y_{k-1} c_{k-1} | c_k s_{k-1} | overflow | c_k XOR c_{k-1}
0       0       0       | 0   0       | 0        | 0
0       0       1       | 0   1       | 1        | 1
0       1       0       | 0   1       | 0        | 0
0       1       1       | 1   0       | 0        | 0
1       0       0       | 0   1       | 0        | 0
1       0       1       | 1   0       | 0        | 0
1       1       0       | 1   0       | 1        | 1
1       1       1       | 1   1       | 0        | 0
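Both overflow formulas can be compared on whole k-bit operands. The sketch below treats x and y as k-bit two's-complement bit patterns and computes the sign-based formula and the carry-based formula side by side; the function name and argument layout are assumptions.

```python
def add_with_overflow(x, y, k):
    """Add two k-bit two's-complement patterns (0 <= x, y < 2**k).

    Returns (s, overflow_from_signs, overflow_from_carries).
    """
    mask = (1 << k) - 1
    s = (x + y) & mask

    # Sign bits of the operands and of the k-bit sum.
    xs = (x >> (k - 1)) & 1
    ys = (y >> (k - 1)) & 1
    ss = (s >> (k - 1)) & 1
    overflow_signs = (xs & ys & (1 - ss)) | ((1 - xs) & (1 - ys) & ss)

    # c_k is the carry out of the sign position, c_{k-1} the carry into it.
    c_k = ((x + y) >> k) & 1
    low_mask = (1 << (k - 1)) - 1
    c_k_minus_1 = (((x & low_mask) + (y & low_mask)) >> (k - 1)) & 1
    overflow_carries = c_k ^ c_k_minus_1

    return s, overflow_signs, overflow_carries


# 7 + 1 in 4 bits: positive + positive gives a negative pattern.
print(add_with_overflow(0b0111, 0b0001, 4))
```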
Xilinx FPGA Devices

Technology  | Low-cost  | High-performance
120/150 nm  |           | Virtex 2, 2 Pro
90 nm       | Spartan 3 | Virtex 4
65 nm       |           | Virtex 5
45 nm       | Spartan 6 |
40 nm       |           | Virtex 6

Altera FPGA Devices

Technology | Low-cost    | Mid-range | High-performance
130 nm     | Cyclone     |           | Stratix
90 nm      | Cyclone II  |           | Stratix II
65 nm      | Cyclone III | Arria I   | Stratix III
40 nm      | Cyclone IV  | Arria II  | Stratix IV
23 ECE 448 – FPGA and ASIC Design with VHDL
General structure of an FPGA
[Figure: programmable logic blocks and programmable interconnect.]
Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043
Xilinx Spartan 3 FPGAs
[Figure: configurable logic blocks (CLBs); each CLB contains four slices, and each slice contains two logic cells.]
Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043
CLB Structure
CLB Slice Structure
• Each slice contains two sets of the following:
  • Four-input LUT
    • Any 4-input logic function,
    • or 16-bit x 1 sync RAM (SLICEM only)
    • or 16-bit shift register (SLICEM only)
  • Carry & Control
    • Fast arithmetic logic
    • Multiplier logic
    • Multiplexer logic
  • Storage element
    • Latch or flip-flop
    • Set and reset
    • True or inverted inputs
    • Sync. or async. control
LUT (Look-Up Table) Functionality
• Look-up tables are the primary elements for logic implementation
• Each LUT can implement any function of 4 inputs

[Figure: a LUT with inputs x1, x2, x3, x4 and output y, shown with two example truth tables; the second corresponds to a two-input gate with inputs x1, x2.]

x1 x2 x3 x4 | y (example 1) | y (example 2)
0  0  0  0  | 0             | 1
0  0  0  1  | 1             | 1
0  0  1  0  | 0             | 1
0  0  1  1  | 0             | 1
0  1  0  0  | 0             | 1
0  1  0  1  | 1             | 1
0  1  1  0  | 0             | 1
0  1  1  1  | 1             | 1
1  0  0  0  | 0             | 1
1  0  0  1  | 1             | 1
1  0  1  0  | 0             | 1
1  0  1  1  | 0             | 1
1  1  0  0  | 1             | 0
1  1  0  1  | 1             | 0
1  1  1  0  | 0             | 0
1  1  1  1  | 0             | 0
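A 4-input LUT can be modeled as a 16-entry table indexed by its inputs. The sketch below is a behavioral illustration only, not a description of how the device is configured; `make_lut` is a hypothetical helper, and the input ordering (x1 as the most significant index bit) is an assumption matching the row order of the truth tables.

```python
from itertools import product


def make_lut(truth_column):
    """Build a 4-input LUT from 16 output values.

    truth_column[i] is y for the input row whose binary value is
    x1 x2 x3 x4, with x1 as the most significant bit (assumed ordering).
    """
    if len(truth_column) != 16:
        raise ValueError("a 4-input LUT needs exactly 16 entries")
    contents = list(truth_column)

    def lut(x1, x2, x3, x4):
        return contents[(x1 << 3) | (x2 << 2) | (x3 << 1) | x4]

    return lut


# Twelve ones followed by four zeros: y is 0 only when x1 = x2 = 1,
# so the LUT behaves as a NAND of x1 and x2 and ignores x3 and x4.
nand_x1_x2 = make_lut([1] * 12 + [0] * 4)

for x1, x2, x3, x4 in product((0, 1), repeat=4):
    assert nand_x1_x2(x1, x2, x3, x4) == 1 - (x1 & x2)
```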
Carry & Control Logic
[Figure: a SLICE containing two look-up tables (inputs G4-G1 and F4-F1, output O), two carry & control logic blocks, and two storage elements (D, Q, CK, EC, S, R). Other labeled signals: CIN, COUT, F5IN, BY, SR, CLK, CE; outputs Y, YB, X, XB.]
Carry & Control Logic in Xilinx FPGAs

x y | COUT
0 0 | y
0 1 | CIN
1 0 | CIN
1 1 | y

Propagate = x XOR y
Generate = y
Carry & Control Logic in Spartan 3 FPGAs
[Figure legend: LUT; hardwired (fast) logic]

Simplified View of Spartan-3 FPGA Carry and Arithmetic Logic in One Logic Cell

Simplified View of Carry Logic in One Spartan 3 Slice

Critical Path for an
Xilinx Spartan 3 FPGAs

Number and Length of Carry Chains for Spartan 3 FPGAs
Bottom Operand Input to Carry Out Delay: T_OPCYF = 0.9 ns for Spartan 3
Carry Propagation Delay: t_BYP = 0.2 ns for Spartan 3
Carry Input to Top Sum Combinational Output Delay: T_CINY = 1.2 ns for Spartan 3

Critical Path Delays and Maximum Clock Frequencies
(taking into account surrounding registers)
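The three Spartan 3 delays can be combined into a rough estimate of an adder's carry-chain delay. The sketch below rests on loudly stated assumptions that are not taken from the slides: each slice holds two adder bits, the first slice contributes T_OPCYF, the last slice contributes T_CINY, and every slice in between contributes one t_BYP. Routing delay and the register overheads mentioned above are ignored.

```python
# Delay values quoted for Spartan 3 (nanoseconds).
T_OPCYF = 0.9  # bottom operand input to carry out
T_BYP = 0.2    # carry propagation through one slice (assumed per slice)
T_CINY = 1.2   # carry input to top sum combinational output


def carry_path_delay_ns(k, bits_per_slice=2):
    """Rough carry-path delay of a k-bit adder under the assumptions above."""
    slices = -(-k // bits_per_slice)       # ceiling division
    middle_slices = max(slices - 2, 0)     # slices between first and last
    return T_OPCYF + middle_slices * T_BYP + T_CINY


for width in (8, 16, 32, 64):
    print(width, round(carry_path_delay_ns(width), 2))
```

Under this model the delay grows by one t_BYP for every two extra bits, so the hardwired carry chain dominates the critical path only for wide adders.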

Major Differences between Xilinx Families

                              | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6
Look-Up Tables                | 4-input             | 6-input
Number of CLB slices per CLB  | 4                   | 2
Number of LUTs per CLB slice  | 2                   | 4
stages per CLB slice          | 2                   | 4
Altera Cyclone III Logic Element (LE) – Normal Mode
Altera Cyclone III Logic Element (LE) – Arithmetic Mode
Altera Stratix III, Stratix IV Adaptive Logic Modules (ALM) – Normal Mode
Altera Stratix III, Stratix IV Adaptive Logic Modules (ALM) – Arithmetic Mode
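Bit-serial
[Figure: an adder cell with inputs x_i, y_i and output s_i; the carry c_{i+1} is stored in a register clocked by clk and reused in the next cycle; start loads the initial carry c_0.]

Digit-serial
[Figure: the same structure with d-bit wide x_i, y_i, and s_i; signals c_0, start, c_{i+1}, clk.]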
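Serial addition can be simulated by processing one digit per clock cycle and keeping the carry between cycles. In this sketch the function name, the integer-based operand encoding, and the returned cycle count are assumptions; a digit width of d = 1 gives the bit-serial case.

```python
def digit_serial_add(x, y, k, d=1, c0=0):
    """Add two k-bit numbers d bits per clock cycle, least significant first.

    Returns (sum modulo 2**k, final carry, number of clock cycles).
    """
    if k % d != 0:
        raise ValueError("k must be a multiple of the digit width d")
    digit_mask = (1 << d) - 1
    carry = c0          # loaded into the carry register when start is asserted
    result = 0
    cycles = k // d
    for cycle in range(cycles):            # one loop iteration = one clk cycle
        shift = cycle * d
        x_digit = (x >> shift) & digit_mask
        y_digit = (y >> shift) & digit_mask
        total = x_digit + y_digit + carry
        result |= (total & digit_mask) << shift
        carry = total >> d                 # stored for the next cycle
    return result, carry, cycles


print(digit_serial_add(11, 6, 4, d=1))  # bit-serial: 4 cycles
print(digit_serial_add(11, 6, 4, d=2))  # digit-serial with d = 2: 2 cycles
```

The trade-off is visible in the cycle count: a wider digit needs fewer cycles but a wider adder cell.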

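  x_{k-1} x_{k-2} . . . x_1 x_0      (variable)
+ y_{k-1} y_{k-2} . . . y_1 y_0      (constant)
  s_{k-1} s_{k-2} . . . s_1 s_0

When the lowest 1 of the constant is at position h (bits h-1 down to 0 are 0):

  x_{k-1} x_{k-2} . . . x_{h+1}  x_h      x_{h-1} . . . x_0      (variable)
+ y_{k-1} y_{k-2} . . . y_{h+1}  1        0 . . . 0              (constant)
  s_{k-1} s_{k-2} . . . s_{h+1}  NOT x_h  x_{h-1} . . . x_0

[Figure: a chain of HA/MHA cells for positions h+1 through k-1, with inputs x_{h+1} . . . x_{k-1}, outputs s_{h+1} . . . s_{k-1}, and final carry c_k. The carry into position h+1 is x_h; bits x_{h-1} . . . x_0 pass through unchanged.]

Position i uses an HA if y_i = 0 and an MHA if y_i = 1.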
Modified half-adder (MHA)

Inputs x and y; outputs c (weight 2) and s (weight 1):

$x + y + 1 = (c\,s)_2$

x y | c s
0 0 | 0 1
0 1 | 1 0
1 0 | 1 0
1 1 | 1 1
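Incrementer
[Figure: a chain of HA cells for positions 1 through k-1, with inputs x_1 . . . x_{k-1}, outputs s_1 . . . s_{k-1}, and final carry c_k. The carry into position 1 is x_0, and s_0 = NOT x_0.]

Decrementer
[Figure: the same chain built from MHA cells, with inputs x_1 . . . x_{k-1}, outputs s_1 . . . s_{k-1}, and final carry c_k. The carry into position 1 is x_0, and s_0 = NOT x_0.]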
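Addition of a constant with HA and MHA cells can be sketched as follows, with the incrementer (constant 1) and the decrementer (constant of all ones) as special cases. The helper names are assumptions, and the code assumes that the constant is non-zero and that both operands are given as LSB-first bit lists of equal length.

```python
def ha(x, c):
    """Half-adder: x + c = (c_out s)_2."""
    total = x + c
    return total >> 1, total & 1


def mha(x, c):
    """Modified half-adder: x + c + 1 = (c_out s)_2."""
    total = x + c + 1
    return total >> 1, total & 1


def to_bits(value, k):
    return [(value >> i) & 1 for i in range(k)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


def add_constant(x_bits, y_bits):
    """Add the constant y to the variable x using one HA or MHA per bit."""
    h = y_bits.index(1)                      # position of the lowest 1 in y
    s_bits = x_bits[:h] + [1 - x_bits[h]]    # low bits pass through, bit h flips
    carry = x_bits[h]                        # carry into position h+1
    for x, y in zip(x_bits[h + 1:], y_bits[h + 1:]):
        cell = mha if y else ha              # the constant bit picks the cell
        carry, s = cell(x, carry)
        s_bits.append(s)
    return s_bits, carry


k = 4
inc, _ = add_constant(to_bits(7, k), to_bits(1, k))           # incrementer
dec, _ = add_constant(to_bits(8, k), to_bits(2**k - 1, k))    # decrementer
print(from_bits(inc), from_bits(dec))
```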
Possible solutions to the carry propagation problem
1. Detect the end of propagation rather than wait for the worst-case time
2. Speed up propagation via
   • carry skip
   • carry select, etc.
3. Limit carry propagation to within a small number of bits
4. Eliminate carry propagation by using a redundant number representation
Analysis of carry propagation

Probability of carry generation = $P(x_i y_i = 11) = \frac{1}{4}$
Probability of carry propagation = $P(x_i y_i = 01 \text{ or } 10) = \frac{1}{2}$
Probability of carry annihilation = $P(x_i y_i = 00 \text{ or } 11) = \frac{1}{2}$

Position:  j          j-1  . . .  i+1   i
x:         1          0    . . .  1     1
y:         1          1    . . .  0     1
           11 or 00   01 or 10 (positions j-1 . . . i+1)

Probability of carry propagating from position i to position j:

$$\underbrace{\left(\frac{1}{2}\right)^{j-1-i}}_{\text{probability of propagation}} \cdot \underbrace{\frac{1}{2}}_{\text{probability of annihilation}} = \left(\frac{1}{2}\right)^{j-i}$$
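Expected length of the carry chain that starts at position i (1)

$$\text{Expected length}(i, k) = \sum_{j=i+1}^{k-1} (j-i) \left(\frac{1}{2}\right)^{j-i} + (k-i) \left(\frac{1}{2}\right)^{k-1-i}$$

In the sum, $(j-i)$ is the length of the carry chain and $\left(\frac{1}{2}\right)^{j-i}$ is the probability of the given length. In the last term, $(k-i)$ is the distance till the end and $\left(\frac{1}{2}\right)^{k-1-i}$ is the probability of propagation till the end.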
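The expected-length expression can be evaluated exactly with rational arithmetic. The sketch below sums the terms as written; the function name is an assumption. Comparing the result with 2 - 2^{-(k-1-i)} is a sanity check derived here from the sum itself, not a formula quoted from the slide.

```python
from fractions import Fraction


def expected_chain_length(i, k):
    """Expected length of a carry chain starting at position i of a k-bit adder."""
    half = Fraction(1, 2)
    # Chains that stop at some position j < k: length (j - i), probability (1/2)**(j - i).
    stopped = sum((j - i) * half ** (j - i) for j in range(i + 1, k))
    # Chains that run to the end: distance (k - i), probability (1/2)**(k - 1 - i).
    to_the_end = (k - i) * half ** (k - 1 - i)
    return stopped + to_the_end


for k in (4, 8, 32):
    print(k, float(expected_chain_length(0, k)))
```

The values approach 2 quickly as k - i grows, so a carry chain that starts far from the most significant end is expected to span about two bit positions.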