You are on page 1of 29

Reconfigurable Computing

ES ZG554 / MEL ZG 554


Session 4 Pawan Sharma
BITS Pilani ps@pilani.bits-pilani.ac.in
Pilani Campus 24/01/2021
Last Lecture

– Some examples on SPLD


– CPLD

2
BITS Pilani, Pilani Campus
Today’s Lecture

I/O block of CPLD


Introduction to FPGA

3
BITS Pilani, Pilani Campus
Input/Output Block Architecture
• Provides interface between I/O pins and internal logic
• Provides many “analog” controls in addition to “logic” ones like output enables.
• Three different analog controls are provided:
• Slew-rate control. The rise and fall time of the output signals can be set to be
fast or slow. The fast setting provides the fastest possible propagation delay,
while the slow setting helps to control transmission-line ringing (the ringing
effect is due to the impedance discontinuities along the transmission path.
When a signal is incident on a signal line, the internal reflection and
transmission at the discontinuities generate signal components which
successively arrive at the output and thus a ringing effect occurs. This ringing
effect may cause fault logic switching and/or degrade the slew rate of the
transition edges, which introduces time skew) and system noise at the
expense of a small additional delay.
• Pull-up resistor. When enabled, the pull-up resistor prevents output pins from
floating as the CPLD is powered up. This is useful if the outputs are used to
drive active-low enable inputs of other logic that is not supposed to be
enabled during power up.
• User-programmable ground. Defines an unused I/O pin to be a ground pin.
Option directs the fitter to tie unused pins to ground internally, so that you do
not have to tie these pins to ground on your board. Also, useful in high-speed,
high- slew-rate applications. Extra ground pins are needed to handle the high
dynamic currents that flow when multiple outputs switch simultaneously.
• Provides compatibility with both 5-V and 3.3-V external devices.
• The input buffer and the internal logic run from a 5-V power supply (VCCINT).
• Depending on the operating voltage of external devices, the output driver uses
either a 5-V or a 3.3-V supply (VCCIO).

BITS Pilani, Pilani Campus


Switch Matrix
• Theoretically, a CPLD’s programmable interconnect should allow any internal PLD output or
external input pin to be connected to any internal PLD input. Likewise, it should allow any
internal PLD output to connect to any external out- put pin.
• The programmable interconnect is usually based on either array-based interconnect or
multiplexer-based interconnect:
• Array-based interconnect allows any signal within the programmable interconnect to
connect to any logic block within the CPLD. This is achieved by allowing horizontal and
vertical routing within the programmable interconnect and allowing the crossover points to
be connected or unconnected (the same idea as with the PLA and PAL), depending on the
CPLD configuration.
• Multiplexer-based interconnect uses digital multiplexers connected to each of the macrocell
inputs within the logic blocks. Specific signals within the programmable interconnect are
connected to specific inputs of the multiplexers. It would not be practical to connect all
internal signals within the programmable interconnect to the inputs of all multiplexers due
to size and speed of operation considerations.

BITS Pilani, Pilani Campus


Switch Matrix
XC 95108
• 108 internal macrocell outputs and 108 external-pin inputs – 216 signals to
switch matrix
• 6 FBs with 36 inputs each requires switch matrix to have 216 outputs, each
of which is a 216-input MUX driving one input of FBs AND array
• Building such a switch matrix using array based approach would require 216
columns for each input and 216 rows for each output and a pass transistor
for each cross point to control whether an input connects to the output or
not – not suitable design choice as large number of transistors connected to
each row or column makes for high capacitance, which makes for slow
speed. Therefore, CPLD manufacturers look for ways to reduce the size of
the switch matrix
• not every switch-matrix input must be connectable to every output. We only
need every input to be connectable to some input of each FB, since any FB
input can connect to any AND gate within the FB’s AND array.
• A MUX based approach maybe a better idea.
• Again, the requirement is not quite that simple
• Suppose that the switch-matrix output for FB input ‘𝑖𝑖’ (0 ≤ 𝑖𝑖 ≤ 35) is
created by a simple 8-input multiplexer whose inputs are switch-matrix
inputs 8i through 8i + 7.
• There are many interconnection patterns that cannot be accommodated by
such a switch matrix.
• For example, connecting switch-matrix inputs 0-35 to the same FB would not
be possible. As soon as switch-matrix input 0 is connected to FB input 0,
switch-matrix inputs 1–7 are blocked.
MUX based
Array based
BITS Pilani, Pilani Campus
Switch Matrix

• Thus, we need to redefine switch-matrix connectivity.


• For each FB, any combination of switch-matrix inputs must be
connectable to some combination of FB’s inputs.
• A typical CPLD switch matrix is a compromise between the
minimal multiplexer scheme and a full, nonblocking crosspoint
array.
• Finding complete set of connections is a NP-complete problem
• For different CPLD architectures, with varying connectivity
schemes, programming times may vary
• Thus, the design of a CPLD’s switch matrix is a compromise
between chip performance (speed, area, cost) and design tool
capabilities.

BITS Pilani, Pilani Campus


Field-Programmable Gate Arrays
Xilinx FPGAs are based on Configurable Logic Blocks (CLBs). More generally called
Programmable logic cells

BITS Pilani, Pilani Campus


Programmable Logic Cells
All FPGAs contain a basic programmable logic cell replicated in a
regular array across the chip
– configurable logic block, logic element, logic module, logic unit, logic array block, …
– many other names
There are three different types of basic logic cells:
– multiplexer based
– look-up table based
– programmable array logic (PAL-like)

BITS Pilani, Pilani Campus


Logic Cell Considerations
How are functions implemented?
– fixed functions (manipulate inputs only)
– programmable functionality (interconnect components)

Coarse-grained logic cells:


– support complex functions, need fewer blocks, but they are bigger so less of them on chip

Fine-grained logic cells:


– support simple functions, need more blocks, but they are smaller so more of them on chip

BITS Pilani, Pilani Campus


Logic Cell Considerations
When designing (or selecting) the type of logic cell
for an FPGA, some basic questions are important:
How many inputs?
How many functions?
– all functions of n inputs or eliminate some combinations?
– what inputs go to what parts of the logic cell?
Any specialized logic?
– adder, etc.
What register features?

BITS Pilani, Pilani Campus


FPGA Architectures
How can we implement any circuit in an FPGA?
– First, focus on combinational logic
– Example: Half adder
• Combinational logic represented by truth table
• What kind of hardware can implement a truth table?

BITS Pilani, Pilani Campus


Look-up-tables (LUTs)
Implement truth table in small memories (LUTs)
– Usually SRAM

Logic inputs connect to


address inputs, logic
output is memory output

BITS Pilani, Pilani Campus


Look-up-tables (LUTs)
Alternatively, could have used a 2-input, 2-output LUT
– Outputs commonly use same inputs

BITS Pilani, Pilani Campus


Look-up-tables (LUTs)
Full adder
– Combinational logic can be implemented in a LUT with same number of
inputs and outputs
• 3-input, 2-ouput LUT

BITS Pilani, Pilani Campus


Look-up-tables (LUTs)
Why aren’t FPGAs just a big LUT?
– Size of truth table grows exponentially based on # of inputs
• 3 inputs = 8 rows, 4 inputs = 16 rows, 5 inputs = 32 rows, etc.
– Same number of rows in truth table and LUT
– LUTs grow exponentially based on # of inputs
Number of SRAM bits in a LUT = 2i * o
– i = # of inputs, o = # of outputs
– Example: 64 input combinational logic with 1 output would require 264
SRAM bits
• 1.84 x 1019
Clearly, not feasible to use large LUTs
– So, how do FPGAs implement logic with many inputs?

BITS Pilani, Pilani Campus


Look-up-tables (LUTs)
Fortunately, we can map circuits onto multiple LUTs
– Divide circuit into smaller circuits that fit in LUTs (same # of inputs and
outputs)
– Example: 3-input, 2-output LUTs

BITS Pilani, Pilani Campus


Look-up-tables (LUTs)
What if circuit doesn’t map perfectly?
– More inputs in LUT than in circuit

• Truth table handles this problem


• Unused inputs are ignored
– More outputs in LUT than in circuit

• Extra outputs simply not used


– Space is wasted, so should use multiple outputs whenever possible

BITS Pilani, Pilani Campus


Look-up-tables (LUTs)
Important Point
– The number of gates in a circuit has no effect on the mapping
into a LUT
• All that matters is the number of inputs and outputs
• Unfortunately, it isn’t common to see large circuits with a few inputs

1 gate 1,000,000 gates

Both of these circuits can be implemented in a


single 3-input, 1-output LUT
BITS Pilani, Pilani Campus
Sequential Logic
Problem: How to handle sequential logic
– Truth tables don’t work

Possible solution:
– Add a flip-flop to the output of LUT

3-in, 1-out 3-in, 2-out


LUT LUT
etc.

FF FF FF

BITS Pilani, Pilani Campus


Sequential Logic
Example: 8-bit register using 3-input, 2-output LUTs
– Input: x, Output: y

x(7) x(6) x(5) x(4) x(3) x(2) x(1) x(0)

3-in, 2-out 3-in, 2-out 3-in, 2-out 3-in, 2-out


LUT LUT LUT LUT

FF FF FF FF FF FF FF FF

y(7) y(6) y(5) y(4) y(3) y(2) y(1) y(0)

– What does LUT need to do to implement register?

BITS Pilani, Pilani Campus


Sequential Logic
Example, cont.
– LUT simply passes inputs to appropriate output

BITS Pilani, Pilani Campus


Sequential Logic
Isn’t it a waste to use LUTs for registers?
YES! (when it can be used for something else)
– Commonly used for pipelined circuits
• Example: Pipelined adder

+ + 3-in, 2-out 3-in, 2-out


LUT LUT ....
Register Register
FF FF FF FF
+
Register Adder and output register combined – not
a separate LUT for each
BITS Pilani, Pilani Campus
Sequential Logic
Existing FPGAs don’t have a flip flop connected to LUT outputs
Why not?
– Flip flop has to be used!

• Impossible to have pure combinational logic


– Adds latency to circuit

Actual Solution:
– Configurable Logic Blocks (CLBs)

BITS Pilani, Pilani Campus


Configurable Logic Blocks (CLBs)
CLBs: the basic FPGA functional unit
– First issue: How to make flip-flop optional?

• Simplest way: use a mux


– Circuit can now use output from LUT or from FF
– Where does select come from?

3-in, 1-out
LUT
CLB
FF

2x1

BITS Pilani, Pilani Campus


Configurable Logic Blocks (CLBs)
CLBs usually contain more than 1 LUT
– Why?
• Efficient way of handling common I/O between adjacent LUTs
• Saves routing resources

2x1

3-in, 2-out 3-in, 2-out


CLB LUT LUT

FF FF FF FF

2x1 2x1 2x1 2x1

BITS Pilani, Pilani Campus


Configurable Logic Blocks (CLBs)
Example: Ripple-carry adder
– Each LUT implements 1 full adder
– Use efficient connections between LUTs for carry signals
A(1) B(1) A(0) B(0) Cin(0)

Cin(1)
2x1

3-in, 2-out 3-in, 2-out


CLB LUT LUT

FF FF FF FF

2x1 2x1 2x1 2x1


Cout(0)

Cout(1) S(1) S(0)

BITS Pilani, Pilani Campus


Configurable Logic Blocks (CLBs)
CLBs often have specialized connections between adjacent
CLBs
– Further improves carry chains
– Avoids routing resources
Some commercial CLBs even more complex
– Xilinx Virtex 4 CLB consists of 4 “slices”
• 1 slice = 2 LUTs + 2 FFs + other stuff
• 1 Virtex 4 CLB = 8 LUTs
– Altera devices has LABs (Logic Array Blocks)
• Consist of 16 LEs (logic elements) which each have 4 input LUTs

BITS Pilani, Pilani Campus


What Else?
Basic building block is CLB
– Can implement combinational+sequential logic
– All circuits consist of combinational and sequential logic

So what else is needed?

BITS Pilani, Pilani Campus

You might also like