Reconfigurable Computing Es Zg554 / Mel ZG 554 Session 4: BITS Pilani

Reconfigurable Computing
ES ZG554 / MEL ZG 554

Session 4 Pawan Sharma
BITS Pilani ps@pilani.bits-pilani.ac.in
Pilani Campus 24/01/2021
Last Lecture
– Some examples on SPLD

– CPLD
2
BITS Pilani, Pilani Campus
Today’s Lecture
I/O block of CPLD

Introduction to FPGA
3
Input/Output Block Architecture
• Provides interface between I/O pins and internal logic
• Provides many “analog” controls in addition to “logic” ones like output enables.
• Three different analog controls are provided:
• Slew-rate control. The rise and fall time of the output signals can be set to be
fast or slow. The fast setting provides the fastest possible propagation delay,
while the slow setting helps to control transmission-line ringing (the ringing
effect is due to the impedance discontinuities along the transmission path.
When a signal is incident on a signal line, the internal reflection and
transmission at the discontinuities generate signal components which
successively arrive at the output and thus a ringing effect occurs. This ringing
effect may cause fault logic switching and/or degrade the slew rate of the
transition edges, which introduces time skew) and system noise at the
expense of a small additional delay.
• Pull-up resistor. When enabled, the pull-up resistor prevents output pins from
floating as the CPLD is powered up. This is useful if the outputs are used to
drive active-low enable inputs of other logic that is not supposed to be
enabled during power up.
• User-programmable ground. Defines an unused I/O pin to be a ground pin.
Option directs the fitter to tie unused pins to ground internally, so that you do
not have to tie these pins to ground on your board. Also, useful in high-speed,
high- slew-rate applications. Extra ground pins are needed to handle the high
dynamic currents that flow when multiple outputs switch simultaneously.
• Provides compatibility with both 5-V and 3.3-V external devices.
• The input buffer and the internal logic run from a 5-V power supply (VCCINT).
• Depending on the operating voltage of external devices, the output driver uses
either a 5-V or a 3.3-V supply (VCCIO).

Switch Matrix
• Theoretically, a CPLD’s programmable interconnect should allow any internal PLD output or
external input pin to be connected to any internal PLD input. Likewise, it should allow any
internal PLD output to connect to any external output pin.
• The programmable interconnect is usually based on either array-based interconnect or
multiplexer-based interconnect:
• Array-based interconnect allows any signal within the programmable interconnect to
connect to any logic block within the CPLD. This is achieved by allowing horizontal and
vertical routing within the programmable interconnect and allowing the crossover points to
be connected or unconnected (the same idea as with the PLA and PAL), depending on the
CPLD configuration.
• Multiplexer-based interconnect uses digital multiplexers connected to each of the macrocell
inputs within the logic blocks. Specific signals within the programmable interconnect are
connected to specific inputs of the multiplexers. It would not be practical to connect all
internal signals within the programmable interconnect to the inputs of all multiplexers due
to size and speed of operation considerations.

Switch Matrix
XC 95108
• 108 internal macrocell outputs and 108 external-pin inputs – 216 signals to
switch matrix
• 6 FBs with 36 inputs each requires switch matrix to have 216 outputs, each
of which is a 216-input MUX driving one input of FBs AND array
• Building such a switch matrix using array based approach would require 216
columns for each input and 216 rows for each output and a pass transistor
for each cross point to control whether an input connects to the output or
not – not suitable design choice as large number of transistors connected to
each row or column makes for high capacitance, which makes for slow
speed. Therefore, CPLD manufacturers look for ways to reduce the size of
the switch matrix
• not every switch-matrix input must be connectable to every output. We only
need every input to be connectable to some input of each FB, since any FB
input can connect to any AND gate within the FB’s AND array.
• A MUX based approach maybe a better idea.
• Again, the requirement is not quite that simple
• Suppose that the switch-matrix output for FB input ‘𝑖𝑖’ (0 ≤ 𝑖𝑖 ≤ 35) is
created by a simple 8-input multiplexer whose inputs are switch-matrix
inputs 8i through 8i + 7.
• There are many interconnection patterns that cannot be accommodated by
such a switch matrix.
• For example, connecting switch-matrix inputs 0-35 to the same FB would not
be possible. As soon as switch-matrix input 0 is connected to FB input 0,
switch-matrix inputs 1–7 are blocked.
MUX based
Array based
Switch Matrix
• Thus, we need to redefine switch-matrix connectivity.

• For each FB, any combination of switch-matrix inputs must be
connectable to some combination of FB’s inputs.
• A typical CPLD switch matrix is a compromise between the
minimal multiplexer scheme and a full, nonblocking crosspoint
array.
• Finding complete set of connections is a NP-complete problem
• For different CPLD architectures, with varying connectivity
schemes, programming times may vary
• Thus, the design of a CPLD’s switch matrix is a compromise
between chip performance (speed, area, cost) and design tool
capabilities.

Field-Programmable Gate Arrays
Xilinx FPGAs are based on Configurable Logic Blocks (CLBs). More generally called
Programmable logic cells

Programmable Logic Cells
All FPGAs contain a basic programmable logic cell replicated in a
regular array across the chip
– configurable logic block, logic element, logic module, logic unit, logic array block, …
– many other names
There are three different types of basic logic cells:
– multiplexer based
– look-up table based
– programmable array logic (PAL-like)

Logic Cell Considerations
How are functions implemented?
– fixed functions (manipulate inputs only)
– programmable functionality (interconnect components)
Coarse-grained logic cells:

– support complex functions, need fewer blocks, but they are bigger so less of them on chip
Fine-grained logic cells:

– support simple functions, need more blocks, but they are smaller so more of them on chip

Logic Cell Considerations
When designing (or selecting) the type of logic cell
for an FPGA, some basic questions are important:
How many inputs?
How many functions?
– all functions of n inputs or eliminate some combinations?
– what inputs go to what parts of the logic cell?
Any specialized logic?
– adder, etc.
What register features?

FPGA Architectures
How can we implement any circuit in an FPGA?
– First, focus on combinational logic
– Example: Half adder
• Combinational logic represented by truth table
• What kind of hardware can implement a truth table?

Look-up-tables (LUTs)
Implement truth table in small memories (LUTs)
– Usually SRAM
Logic inputs connect to

address inputs, logic
output is memory output

Alternatively, could have used a 2-input, 2-output LUT
– Outputs commonly use same inputs

Full adder
– Combinational logic can be implemented in a LUT with same number of
inputs and outputs
• 3-input, 2-ouput LUT

Why aren’t FPGAs just a big LUT?
– Size of truth table grows exponentially based on # of inputs
• 3 inputs = 8 rows, 4 inputs = 16 rows, 5 inputs = 32 rows, etc.
– Same number of rows in truth table and LUT
– LUTs grow exponentially based on # of inputs
Number of SRAM bits in a LUT = 2i * o
– i = # of inputs, o = # of outputs
– Example: 64 input combinational logic with 1 output would require 264
SRAM bits
• 1.84 x 1019
Clearly, not feasible to use large LUTs
– So, how do FPGAs implement logic with many inputs?

Fortunately, we can map circuits onto multiple LUTs
– Divide circuit into smaller circuits that fit in LUTs (same # of inputs and
outputs)
– Example: 3-input, 2-output LUTs

What if circuit doesn’t map perfectly?
– More inputs in LUT than in circuit
• Truth table handles this problem

• Unused inputs are ignored
– More outputs in LUT than in circuit
• Extra outputs simply not used

– Space is wasted, so should use multiple outputs whenever possible

Important Point
– The number of gates in a circuit has no effect on the mapping
into a LUT
• All that matters is the number of inputs and outputs
• Unfortunately, it isn’t common to see large circuits with a few inputs
1 gate 1,000,000 gates
Both of these circuits can be implemented in a

single 3-input, 1-output LUT
Sequential Logic
Problem: How to handle sequential logic
– Truth tables don’t work
Possible solution:
– Add a flip-flop to the output of LUT
3-in, 1-out 3-in, 2-out

LUT LUT
etc.
FF FF FF

Sequential Logic
Example: 8-bit register using 3-input, 2-output LUTs
– Input: x, Output: y
x(7) x(6) x(5) x(4) x(3) x(2) x(1) x(0)
3-in, 2-out 3-in, 2-out 3-in, 2-out 3-in, 2-out

LUT LUT LUT LUT
FF FF FF FF FF FF FF FF
y(7) y(6) y(5) y(4) y(3) y(2) y(1) y(0)
– What does LUT need to do to implement register?

Sequential Logic
Example, cont.
– LUT simply passes inputs to appropriate output

Sequential Logic
Isn’t it a waste to use LUTs for registers?
YES! (when it can be used for something else)
– Commonly used for pipelined circuits
• Example: Pipelined adder
+ + 3-in, 2-out 3-in, 2-out

LUT LUT ....
Register Register
FF FF FF FF
+
Register Adder and output register combined – not
a separate LUT for each
Sequential Logic
Existing FPGAs don’t have a flip flop connected to LUT outputs
Why not?
– Flip flop has to be used!
• Impossible to have pure combinational logic

– Adds latency to circuit
Actual Solution:
– Configurable Logic Blocks (CLBs)

Configurable Logic Blocks (CLBs)
CLBs: the basic FPGA functional unit
– First issue: How to make flip-flop optional?
• Simplest way: use a mux

– Circuit can now use output from LUT or from FF
– Where does select come from?
3-in, 1-out
LUT
CLB
FF
2x1

CLBs usually contain more than 1 LUT
– Why?
• Efficient way of handling common I/O between adjacent LUTs
• Saves routing resources
2x1

CLB LUT LUT
FF FF FF FF
2x1 2x1 2x1 2x1

Example: Ripple-carry adder
– Each LUT implements 1 full adder
– Use efficient connections between LUTs for carry signals
A(1) B(1) A(0) B(0) Cin(0)
Cin(1)
2x1

CLB LUT LUT
FF FF FF FF
2x1 2x1 2x1 2x1

Cout(0)
Cout(1) S(1) S(0)

CLBs often have specialized connections between adjacent
CLBs
– Further improves carry chains
– Avoids routing resources
Some commercial CLBs even more complex
– Xilinx Virtex 4 CLB consists of 4 “slices”
• 1 slice = 2 LUTs + 2 FFs + other stuff
• 1 Virtex 4 CLB = 8 LUTs
– Altera devices has LABs (Logic Array Blocks)
• Consist of 16 LEs (logic elements) which each have 4 input LUTs

What Else?
Basic building block is CLB
– Can implement combinational+sequential logic
– All circuits consist of combinational and sequential logic
So what else is needed?

Reconfigurable Computing Es Zg554 / Mel ZG 554 Session 4: BITS Pilani

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Reconfigurable Computing Es Zg554 / Mel ZG 554 Session 4: BITS Pilani

Uploaded by

Copyright:

Available Formats

Reconfigurable Computing

ES ZG554 / MEL ZG 554

– Some examples on SPLD

I/O block of CPLD

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

• Thus, we need to redefine switch-matrix connectivity.

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

Coarse-grained logic cells:

Fine-grained logic cells:

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

Logic inputs connect to

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

• Truth table handles this problem

• Extra outputs simply not used

BITS Pilani, Pilani Campus

1 gate 1,000,000 gates

Both of these circuits can be implemented in a

3-in, 1-out 3-in, 2-out

BITS Pilani, Pilani Campus

x(7) x(6) x(5) x(4) x(3) x(2) x(1) x(0)

3-in, 2-out 3-in, 2-out 3-in, 2-out 3-in, 2-out

y(7) y(6) y(5) y(4) y(3) y(2) y(1) y(0)

– What does LUT need to do to implement register?

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

+ + 3-in, 2-out 3-in, 2-out

• Impossible to have pure combinational logic

BITS Pilani, Pilani Campus

• Simplest way: use a mux

BITS Pilani, Pilani Campus

3-in, 2-out 3-in, 2-out

2x1 2x1 2x1 2x1

BITS Pilani, Pilani Campus

3-in, 2-out 3-in, 2-out

2x1 2x1 2x1 2x1

Cout(1) S(1) S(0)

BITS Pilani, Pilani Campus

BITS Pilani, Pilani Campus

So what else is needed?

BITS Pilani, Pilani Campus

You might also like