Field Programmable Gate Arrays

Index
References: https://www.youtube.com/@IntelFPGA/playlists (Intel Official playlists)
References: FPGA Tutorials

Introduction
Reference: youtube.com/watch?v=gUsHwi4M4xE&list=PLvOlSehNtuHuWNxksxL54Shr1G5JnBGr6&index=1

FPGAs are integrated circuits that can be configured to implement almost any digital function that
fits within the device. Unlike prefabricated circuits, which have a fixed function, an FPGA has no
inherent function: it is programmed to behave the way it needs to. This flexibility allows FPGAs to
implement logic circuits such as signal processors, driver circuits, or even microcontrollers.

Internals of FPGAs
Reference: https://www.rapidwright.io/docs/FPGA_Architecture.html

The FPGA consists of the following:

CLB: Configurable Logic Blocks. These blocks can be configured to behave like any gate and also
provide some additional functions. The CLBs are surrounded by interconnecting wires, which connect
the CLBs to one another and so allow us to implement any digital circuit.

IOB: Input Output Blocks. IOBs act as the peripherals of the FPGA for inputs and outputs. We can
program the FPGA so that the CLBs behave like a specific type of logic circuit, use the
interconnect lines to connect the CLBs, and then configure the IOBs to act as the input or output
ports. Other peripherals include clocks etc.

The FPGA also contains the following components. (Their hierarchy and its relation to the CLBs are
described in the Xilinx architecture section later in the doc.)

• Lookup Tables (LUT): These are a combination of a register (a small memory) and a multiplexer.
Usually, the mux has six select lines and the register holds 64 bits. We can store any
combination of 64 bits in the register. The six select lines, which are the LUT inputs, choose
which one of the 64 bits is driven to the output. By carefully choosing the values stored in the
register, we can make the combination of mux and register (the lookup table) behave like any
gate or basic digital function: the LUT can represent any 6-input truth table.
• State Elements (like flip-flops): The data output by the LUT mux is usually either used as a
signal to another LUT in the same clock cycle or stored to be used in a later clock cycle.
Flip-flops are present to store these bits. They are fed by dedicated clock lines to minimize
clock skew.
• Carry Chains: These are additional dedicated circuits provided with a group of LUTs. For
example, 10 LUTs might together share a common carry chain. They are used to build adders,
subtractors, comparators, and other circuits that do basic mathematical operations. When built
from normal LUTs alone, these operations can take up much area on the FPGA fabric.
• DSP Blocks: Similar to the carry chains, there are also dedicated circuits, shared by multiple
LUTs, that carry out DSP operations such as multiplication and multiply-accumulate. These
operations can take up much area in the FPGA fabric when implemented on LUTs, so the dedicated
blocks save resources and reduce delay.
• Block RAMs: FPGAs are also provided with block RAMs that can store data for more extended
periods. These are larger storage elements, typically tens of kilobits each (for example, 18 Kb
or 36 Kb on Xilinx 7-series devices). Again, these are scattered across the FPGA fabric and
used by multiple blocks.
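
The LUT description above can be sketched in a few lines of Python. This is only an illustrative software model, not how any vendor tool represents a LUT: the function names `make_lut6` and `lut6_read` are made up for this example, and input 0 is assumed to be the least significant select line.

```python
def make_lut6(truth_fn):
    """Build the 64-bit contents of a 6-input LUT from a Boolean function."""
    init = 0
    for address in range(64):
        # Bit i of the address is the value on LUT input i (assumed ordering).
        inputs = [(address >> i) & 1 for i in range(6)]
        if truth_fn(*inputs):
            init |= 1 << address
    return init


def lut6_read(init, inputs):
    """Act as the 64:1 mux: the six inputs select one of the stored bits."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> address) & 1


# Configure one LUT as a 6-input AND gate and another as a 2-input XOR
# (the XOR simply ignores inputs 2 to 5).
and6 = make_lut6(lambda a, b, c, d, e, f: a & b & c & d & e & f)
xor2 = make_lut6(lambda a, b, c, d, e, f: a ^ b)

print(hex(and6))                             # 0x8000000000000000
print(hex(xor2))                             # 0x6666666666666666
print(lut6_read(xor2, [1, 0, 0, 0, 0, 0]))   # 1
```

The same 64-bit storage implements either gate; only the stored bits differ, which is the sense in which a LUT can represent any 6-input truth table.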

FPGA Design Flow


Reference: https://www.youtube.com/watch?v=yG3dBx8kToM
Reference: https://hardwarebee.com/understanding-fpga-programming-and-design-flow/

The steps involved in creating a new project in Vivado, which are essentially the steps required
for reprogramming an FPGA, are:

• Selecting Project Type: An RTL type project is selected, since the FPGA will be programmed
using a hardware description language (Verilog). This stage also allows existing projects to be
imported, including projects from other sources.

• Adding the corresponding Verilog files: At this stage, the Verilog files that describe the
design are added to the project. These are called source files, and they are usually HDL files
in Verilog or VHDL. The entire design is divided into multiple modules, and these modules can
be reused wherever they are needed.

• Creating constraint files: The constraints file tells the tool which input or output of the
design is connected to which pin (IOB) of the FPGA. This is an XDC file. It also gives the
clock frequency and other such details.

• Selection of the board: This step lets us select the target board for the design.

• Adding additional sources/test benches: Here, we can add additional sources in case we need
them. We can also add simulation sources (i.e., test benches). These sources are executed only
during simulation and are not programmed onto the FPGA.

• Behavioral simulation: This step runs the simulation sources (test benches) to check that the
design sources behave as intended. It produces timing diagrams (waveforms) for us to inspect.
Once we have verified the timing diagrams, we can move to the next step.

Netlist: A netlist is the gate-level description of an RTL design. When we describe a digital
circuit in an RTL language (like Verilog or VHDL), the description has to be transformed into a
gate-level circuit. This conversion process is called synthesis, and its final output is called
the gate-level netlist.

• Run Synthesis: At this stage, the constraints described above are applied to the design
sources. The constraints carry detailed information regarding clock frequencies, IOB pin
configurations, register requirements, and other physical constraints. Using these constraints
and the modules in the design sources, the synthesis process generates a gate-level netlist for
each module of the design. The tools also try to map parts of the netlist onto cells from
pre-existing libraries so as to optimize and reduce the logic.
At the end of this process, we get the gate-level netlists of the given design and information
regarding CLB usage, flip-flop usage, register usage, etc.

• Run Implementation: It includes 3 steps:

o Translate: Here all the netlists and the constraints files are combined into one single file
which completely describes the entire design. This includes assigning the design's ports to
the FPGA pins, switches, and LEDs of the FPGA board. The translation is done according to
the chosen board.
o Map: Here the entire netlist is divided into smaller blocks. The FPGA is also divided into
smaller blocks. This process maps the entire design onto the FPGA blocks.
o Placement and Routing: This step places the mapped netlist onto specific blocks of the FPGA,
which involves populating the block RAMs and the LUTs of the Configurable Logic Blocks
(CLBs). This is followed by routing: connecting the CLBs to any carry chains, DSPs, or
registers that might be needed, and connecting the different blocks of the FPGA to one
another.

• Static Timing Analysis: This is done to check whether the design meets the timing constraints.
It includes finding the critical path (the data path with the maximum delay) and comparing its
delay against the clock period. It also checks whether the design meets the setup time and
hold time requirements detailed in the timing constraints. In case of any violation, we may
have to resynthesize or re-implement the design.

• Generate Bitstream: This step produces the bitstream, which contains all the information
required to program the FPGA. The bitstream is then downloaded to the hardware to program the
FPGA.

• In-circuit Verification: This involves the actual testing on the FPGA hardware. The Vivado
software also allows users to set register values manually to test the circuit.
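
The static timing analysis step in the flow above can be illustrated with a small sketch. It is a simplified model under stated assumptions, not Vivado's actual analysis: the timing graph, the delays (in nanoseconds), and the names `EDGES`, `longest_delay`, and `setup_slack` are all hypothetical, and effects such as clock skew and hold checks are left out.

```python
from functools import lru_cache

# Hypothetical timing graph: edge delays in nanoseconds on the paths from a
# launch flip-flop "ff_a" to a capture flip-flop "ff_b".
EDGES = {
    "ff_a": [("lut1", 0.4), ("lut2", 0.6)],
    "lut1": [("lut3", 0.5)],
    "lut2": [("lut3", 0.9)],
    "lut3": [("ff_b", 0.3)],
    "ff_b": [],
}


@lru_cache(maxsize=None)
def longest_delay(node, end):
    """Maximum accumulated delay from node to end in a directed acyclic graph."""
    if node == end:
        return 0.0
    delays = [d + longest_delay(nxt, end) for nxt, d in EDGES[node]]
    return max(delays) if delays else float("-inf")


def setup_slack(clock_period, setup_time, data_delay):
    """Positive slack means the setup requirement is met."""
    return clock_period - setup_time - data_delay


critical = longest_delay("ff_a", "ff_b")
slack = setup_slack(clock_period=2.0, setup_time=0.1, data_delay=critical)
print(round(critical, 2), round(slack, 2))  # 1.8 0.1
```

With a 2.0 ns clock the critical path (through lut2 and lut3) leaves a small positive slack; shortening the clock period to 1.5 ns would make the slack negative, which is the kind of violation that sends the design back to synthesis or implementation.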

Xilinx FPGA Architecture


Reference: https://docs.xilinx.com/r/en-US/ug912-vivado-properties/CELL

Reference: https://www.rapidwright.io/docs/Xilinx_Architecture.html

Cells of the Netlist:


The netlist consists of leaf cells and hierarchical cells.

Leaf Cells: These are the most basic logic elements of the netlist; the netlist cannot contain
any simpler logic element. The leaf cells in the netlist can represent basic AND/OR operations
or more complex elements such as flip-flops, adders, or multipliers. Leaf cells usually have
pins to connect the cell to the other parts of the netlist.

Hierarchical cells: These are combinations of multiple leaf cells that form a macro logic. These
cells have ports/pins to connect them to the remaining parts of the netlist.
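
As a rough illustration of the two kinds of cell, the sketch below models leaf cells as primitive functions and a hierarchical cell as a set of ports plus instances of leaf cells. The data layout and the names `LEAF_CELLS`, `HALF_ADDER`, and `evaluate` are assumptions made for this example; they do not reflect how Vivado stores a netlist.

```python
# Leaf cells: basic operations that are not broken down any further.
LEAF_CELLS = {
    "AND2": lambda a, b: a & b,
    "XOR2": lambda a, b: a ^ b,
}

# A hierarchical cell: ports plus instances of other cells.
# Each instance is (cell type, input nets, output net). Hypothetical example.
HALF_ADDER = {
    "inputs": ("a", "b"),
    "outputs": ("sum", "carry"),
    "instances": [
        ("XOR2", ("a", "b"), "sum"),
        ("AND2", ("a", "b"), "carry"),
    ],
}


def evaluate(cell, **port_values):
    """Evaluate a hierarchical cell whose instances are in topological order."""
    nets = {name: port_values[name] for name in cell["inputs"]}
    for cell_type, in_nets, out_net in cell["instances"]:
        nets[out_net] = LEAF_CELLS[cell_type](*(nets[n] for n in in_nets))
    return {name: nets[name] for name in cell["outputs"]}


print(evaluate(HALF_ADDER, a=1, b=1))  # {'sum': 0, 'carry': 1}
```

The nets play the role of the wires between pins, and the half adder exposes only its ports to the rest of the design.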

The hierarchy followed in Xilinx FPGAs is:

• BEL (Basic Element of Logic): These are the smallest logic elements in the FPGA. The elements
inside the CLBs (Configurable Logic Blocks), such as LUTs and flip-flops, are BELs. There are
two main types of BEL, together with the site PIPs that pass through them:
o Logic BELs: These BELs are mostly used for computation. They may be carry chains, DSP
blocks, LUTs, or flip-flops. In the placement phase of FPGA programming, the leaf cells of
the synthesized design are placed on the logic BELs of the FPGA.
o Routing BELs: These BELs connect multiple logic BELs to one another. They are used to make
interconnections between the BELs using the interconnecting wires.
o Site PIP (Programmable Interconnect Point): Each BEL has input and output pins. Sometimes
a particular input pin of a BEL needs to be connected directly to its output pin (for
example, when the BEL has no function, so the input signal is simply relayed to the
output). Special connections in each BEL connect the input directly to the output; these
connections are called site PIPs. For a site PIP in a logic BEL to connect an input
directly to the output, the logic BEL must not implement any function. The BEL then
performs no logic of its own, and it is called a “route-through.”

• Site: A site consists of BELs (logic and routing), site pins, and site wires.

A tile (described later) has a site grid. A site grid is a collection of multiple sites of the
same or different site types. Each site type can have its own XY coordinate grid, independent
of the surrounding site types. A site is named siteType_X#Y#, where the X and Y numbers are
the site's coordinates in the XY grid of the site type given by siteType.
This can be better explained using the below diagram (1).
(An exception to the above rule is the SLICEL and SLICEM site types.)

• Tile: The entire FPGA consists of tiles. Like the sites, the tiles are of different types. Not
all tiles have site grids, but some of them do. The tile grid also consists of tiles of the
same or different types. Each tile type can have XY coordinates independent of the surrounding
tile types, and individual tiles are named tileType_X#Y#.

BELs and sites have pins (I/O), but tiles have no pins. Instead, a tile exposes the pins of
the sites and other components that it contains, so components of other tiles can connect to
the components of this tile through the exposed pins. This includes the PIP pins mentioned
above.
(Usually, all tiles in the same column have the same type, and those in the same row are
usually different.)
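
The siteType_X#Y# and tileType_X#Y# naming convention described above can be parsed mechanically. The sketch below is a minimal illustration; the helper `parse_name` and the two example names are assumptions chosen for this example, not identifiers taken from a particular device.

```python
import re

# Pattern for names of the form <type>_X<number>Y<number>, as described above.
NAME_RE = re.compile(r"^(?P<type>.+)_X(?P<x>\d+)Y(?P<y>\d+)$")


def parse_name(name):
    """Split a site or tile name into (type, x, y)."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a <type>_X#Y# name: {name!r}")
    return match.group("type"), int(match.group("x")), int(match.group("y"))


print(parse_name("SLICE_X12Y34"))  # ('SLICE', 12, 34)
print(parse_name("CLEL_R_X10Y5"))  # ('CLEL_R', 10, 5)
```

Because each type has its own coordinate grid, the (x, y) pair is only meaningful together with the type prefix.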

FPGA Routing
The routing resources in the FPGA are of several types.

• Pins: These are the user-visible pins which connect a component to other components. In the
FPGA, only sites and BELs have user-visible pins. The pins connect a site with other sites and
a BEL with other BELs.

• Site Wires: These wires exist within a site. They connect a site pin to a BEL pin within the
site, or connect two BEL pins (both of which exist within the site).

• Nodes: Since tiles do not have pins like the sites/BELs, nodes are the wires which connect
site pins to other site pins of the same or different tiles. The same node may connect
multiple site pins of multiple sites. Usually one of the sites is a placement site and the
others are switchbox sites. The entire node is electrically equivalent.

• PIPs: They are mentioned above.

Clocking in FPGAs
Reference: https://docs.xilinx.com/r/en-US/ug949-vivado-design-methodology/Clock-Routing-Root-and-Distribution

The steps to produce clocks of different frequencies in an ASIC chip are:

• Voltage oscillations are usually produced using a crystal oscillator (XTAL).
• The oscillations are then taken by a clock generator, which generates a clock signal of a
particular frequency (usually a high frequency).
• This clock is then sent to the individual PLL (phase-locked loop) circuits, which derive
lower frequencies from the input frequency.
• This lower frequency is then sent to clock buffers, which output the same frequency but fan
out one input clock into multiple clocks. This prevents the clock source from being
overloaded when multiple devices use the clock at the same time. The outputs of the clock
buffers can be sent to other buffers for further fan-out.
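
The frequency arithmetic in the steps above can be sketched as follows. The relation f_out = f_in × multiply / (divide × output_divide) is the usual way a PLL's output frequency is expressed, but the parameter names and the 100 MHz input used here are assumptions for illustration, not values from the referenced guide.

```python
from fractions import Fraction


def pll_output_mhz(f_in_mhz, multiply, divide, output_divide):
    """f_out = f_in * multiply / (divide * output_divide), computed exactly."""
    return Fraction(f_in_mhz) * multiply / (divide * output_divide)


def fan_out(clock_mhz, copies):
    """A clock buffer keeps the frequency and only replicates the signal."""
    return [clock_mhz] * copies


# Hypothetical settings: derive a 50 MHz clock from a 100 MHz input.
f_out = pll_output_mhz(100, multiply=10, divide=1, output_divide=20)
print(float(f_out))                          # 50.0
print([float(f) for f in fan_out(f_out, 4)])  # [50.0, 50.0, 50.0, 50.0]
```

Using `Fraction` keeps the division exact, which avoids rounding surprises when the ratios are not round numbers.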

The clock generation mechanism described above is usually the same across all ASICs. For FPGAs
(specifically, UltraScale devices), the clocking hierarchy continues as follows:

• Fabric Sub Region (FSR), or clock region: Multiple tiles combine to form a fabric
sub-region. A single clock source feeds the entire FSR, so all the components in the same
FSR that use it run at the same frequency. Vivado chooses a clock root that receives the
clock from the FSR clock buffer over the routing interconnects. This root then distributes
the clock, over the distribution interconnects, to all the other components in the FSR
which need the clock.

For reference, each FSR has 24 horizontal and 24 vertical routing interconnects, and 24
horizontal and 24 vertical distribution interconnects.

• Super Logic Regions (SLR): These are combinations of multiple FSRs stacked one upon the
other. Devices with more than one SLR are built using a silicon interposer.

• Device: This is the entire device.
