
Reconfigurable Computing

CS G553

Dr. A. Amalin Prince


BITS - Pilani K K Birla Goa Campus
Department of Electrical and Electronics Engineering

Lecture – 22
Reconfigurable Computing Device: Altera Stratix II and Xilinx
Virtex-5 and 7

FPGA Market Share 2013

VIRTEX VS STRATIX

We already have some idea of the Virtex-5 architecture, so this lecture adds some Stratix II details,
followed by Virtex-7 details.

STRATIX II Logic Fabric

ALM Flexibility

ALM Flexibility

The ALM Advantage

Comparing the Stratix II ALM and the Virtex-5 LUT-Flipflop Pair

The ALM Advantage

ALM vs. Virtex-5 LUT Flexibility

The ALM Advantage

Implementing 5- and 3-Input Functions in Stratix II ALM and Virtex-5 LUT-Flipflop Pair

Outline

Introduction to 7-Series FPGA


Logic Resources
Memory and DSP48 Resources
I/O Resources
XADC
Clocking Resources
Zynq SoC
Summary

7-Series Architecture Alignment

Common elements enable easy IP reuse for quick design portability across all 7-series families
o Design scalability from low-cost to high-performance
o Expanded ecosystem support
o Quickest time to market

Artix-7 Architecture Overview

Configurable Logic Block (CLB) in 7-Series FPGAs

Primary resource for design in Xilinx FPGAs
o Combinatorial functions
o Flip-flops
CLB contains two slices
Connected to switch matrix for routing to other FPGA resources
o Carry chain runs vertically in a column from one slice to the one above

Two Types of CLB Slices

Two types of CLB slices
o SLICEM: Full slice
• LUT can be used for logic and memory/SRL
• Has wide multiplexers and carry chain
o SLICEL: Logic and arithmetic only
• LUT can only be used for logic (not memory)
• Has wide multiplexers and carry chain

Slice Resource

Four six-input Look-Up Tables (LUTs)
Multiplexers
Carry chains
SRL
o Cascade path is not shown
Four flip-flops/latches
o Four additional flip-flops
The implementation tool will pack multiple slices in the same CLB if certain rules are followed

6-Input LUT with Dual Output

Each LUT can be used as two 5-input LUTs with common inputs
o Minimal speed impact relative to a 6-input LUT
o One or two outputs
Any combinatorial function of six variables or two functions of five variables
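
The dual-output behaviour described above can be sketched in Python. This is a minimal behavioural model, not vendor code: the function names `lut6_2`, `pack_two_lut5`, and `truth_table` are hypothetical, and the model assumes the usual arrangement in which O5 reads the lower half of the 64-bit truth table and O6 reads the full table, so holding the sixth input high makes O6 return the upper half.

```python
from typing import Callable, Tuple


def lut6_2(init: int, inputs: int) -> Tuple[int, int]:
    """Behavioural model of a 6-input LUT with two outputs.

    init   : 64-bit truth table, bit k is the output for input pattern k
    inputs : 6-bit input pattern, bit 0 = I0 ... bit 5 = I5
    Returns (o6, o5).
    """
    o6 = (init >> (inputs & 0x3F)) & 1  # lookup over all six inputs
    o5 = (init >> (inputs & 0x1F)) & 1  # lower 32 entries only, I5 ignored
    return o6, o5


def pack_two_lut5(f_lower: int, f_upper: int) -> int:
    """Pack two 32-bit truth tables into one 64-bit table (assumed layout)."""
    return ((f_upper & 0xFFFFFFFF) << 32) | (f_lower & 0xFFFFFFFF)


def truth_table(func: Callable[..., int], n_inputs: int) -> int:
    """Build a truth table by evaluating func on every input pattern."""
    table = 0
    for pattern in range(1 << n_inputs):
        bits = [(pattern >> i) & 1 for i in range(n_inputs)]
        if func(*bits):
            table |= 1 << pattern
    return table


# Two different functions of the same five variables share one LUT.
xor5 = truth_table(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e, 5)
and5 = truth_table(lambda a, b, c, d, e: a & b & c & d & e, 5)
init = pack_two_lut5(f_lower=and5, f_upper=xor5)

# With I5 held at 1, O6 gives the XOR and O5 gives the AND.
o6, o5 = lut6_2(init, 0b100000 | 0b10110)
print(o6, o5)
```

When a single function of six variables is needed instead, the whole 64-bit table holds that one function and only O6 is used.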

Wide Multiplexers
Each F7MUX combines the outputs of two LUTs
o Can implement an arbitrary 7-input function
o Can implement an 8-1 multiplexer
The F8MUX combines the outputs of the two F7MUXes
o Can implement an arbitrary 8-input function
o Can implement a 16-1 multiplexer
MUX is controlled by the BX/CX/DX slice input
MUX output can drive out combinatorially or to the flip-flop/latch
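
Why two 6-input LUTs plus one multiplexer are enough for any 7-input function follows from Shannon expansion: fix the seventh input at 0 and at 1, store each resulting 6-input function in its own LUT, and let the seventh input select between them. The sketch below illustrates this in Python; the helper names (`lut6`, `mux2`, `split_7_input`, `f7`) are hypothetical and the model ignores timing and routing.

```python
def lut6(init, inputs):
    """Look up a 6-bit input pattern in a 64-bit truth table."""
    return (init >> (inputs & 0x3F)) & 1


def mux2(sel, d0, d1):
    """Two-input multiplexer: returns d1 when sel is 1, else d0."""
    return d1 if sel else d0


def split_7_input(func):
    """Shannon expansion of a 7-input function on its last input.

    Returns two 64-bit truth tables: one for last input = 0, one for = 1.
    """
    table0 = table1 = 0
    for pattern in range(64):
        bits = [(pattern >> i) & 1 for i in range(6)]
        if func(*bits, 0):
            table0 |= 1 << pattern
        if func(*bits, 1):
            table1 |= 1 << pattern
    return table0, table1


def f7(table0, table1, inputs7):
    """Two LUTs feeding one multiplexer, selected by input bit 6."""
    sel = (inputs7 >> 6) & 1
    return mux2(sel, lut6(table0, inputs7), lut6(table1, inputs7))


def majority7(*bits):
    """Example 7-input function: 1 when at least four inputs are 1."""
    return int(sum(bits) >= 4)


t0, t1 = split_7_input(majority7)
print(f7(t0, t1, 0b1011010))  # four ones among seven inputs
```

Applying the same split once more, with a second multiplexer level choosing between two such pairs, gives the 8-input case. The 8-1 multiplexer uses the same structure: each LUT holds a 4-1 multiplexer (four data inputs and two select inputs), and the multiplexer select acts as the third select bit.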

Carry Chain

Carry chain can implement fast arithmetic addition and subtraction
o Carry out is propagated vertically through the four LUTs in a slice
o The carry chain propagates from one slice to the slice in the same column in the CLB above
Carry look-ahead
o Combinatorial carry look-ahead over the four LUTs in a slice
o Implements faster carry cascading from slice to slice
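
A behavioural sketch of the carry chain may help. It assumes the common arrangement in which each LUT computes the propagate signal (a XOR b), a dedicated multiplexer picks the next carry, and a dedicated XOR forms the sum bit. The names `carry4`, `add`, and `subtract` are hypothetical. The loop below ripples the carry bit by bit; the look-ahead hardware produces the same values, only faster.

```python
def carry4(a_bits, b_bits, carry_in):
    """Four bit positions of carry logic, as in one slice.

    a_bits, b_bits : lists of four bits, least significant first
    Returns (sum_bits, carry_out).
    """
    sums = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        propagate = a ^ b                  # computed in the LUT
        sums.append(propagate ^ carry)     # dedicated XOR gate
        carry = carry if propagate else a  # dedicated carry multiplexer
    return sums, carry


def add(a, b, width=8, carry_in=0):
    """Add two unsigned numbers by stacking 4-bit slices in a column."""
    if width % 4 != 0:
        raise ValueError("width must be a multiple of 4")
    result = 0
    carry = carry_in
    for base in range(0, width, 4):
        a_bits = [(a >> (base + i)) & 1 for i in range(4)]
        b_bits = [(b >> (base + i)) & 1 for i in range(4)]
        sums, carry = carry4(a_bits, b_bits, carry)  # carry moves one slice up
        for i, s in enumerate(sums):
            result |= s << (base + i)
    return result, carry


def subtract(a, b, width=8):
    """a - b as a + (not b) + 1, using the same chain."""
    mask = (1 << width) - 1
    result, _ = add(a, ~b & mask, width, carry_in=1)
    return result


print(add(200, 100))      # 8-bit sum and carry out
print(subtract(100, 58))  # 8-bit difference
```

Subtraction reuses the adder unchanged: the second operand is inverted and the initial carry-in is set to 1.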

Slice Flip-Flops and Flip-Flop/Latches

Each slice has four flip-flop/latches (FF/L)
o Can be configured as either flip-flops or latches
o The D input can come from the O6 LUT output, the carry chain, the wide multiplexer, or the AX/BX/CX/DX slice input
Each slice also has four flip-flops (FF)
o D input can come from the O5 LUT output or the AX/BX/CX/DX slice input
• These don't have access to the carry chain or the wide multiplexers
If any of the FF/L are configured as latches, the four FFs are not available

7-Series Block RAM and FIFO

All members of the 7-series families have the same Block RAM/FIFO
Fully synchronous operation
o All operations are synchronous; all outputs are latched
Optional internal pipeline register for higher frequency operation
Two independent ports access common data
o Individual address, clock, write enable, clock enable
o Independent data widths for each port

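
The dual-port behaviour listed above can be illustrated with a small Python model. It is only a sketch under stated assumptions: the class and method names (`Port`, `DualPortRam`, `clock`) are hypothetical, a narrow word at address k is assumed to occupy bits k*width upward in the shared storage, and the output is assumed to hold its value during a write. Real devices offer several write modes and specific width combinations that this model does not cover.

```python
class Port:
    """One port of the memory, with its own data width and output latch."""

    def __init__(self, bits, width):
        self.bits = bits      # storage shared with the other port
        self.width = width    # data width of this port, in bits
        self.dout = 0         # latched output, changes only on a clock edge

    def clock(self, addr, en=True, we=False, din=0):
        """Model one rising edge of this port's own clock."""
        if not en:            # clock enable low: nothing happens
            return self.dout
        base = addr * self.width
        if addr < 0 or base + self.width > len(self.bits):
            raise IndexError("address out of range for this port width")
        if we:                # write; output held (assumption of this sketch)
            for i in range(self.width):
                self.bits[base + i] = (din >> i) & 1
        else:                 # read; the result is latched into dout
            self.dout = sum(self.bits[base + i] << i
                            for i in range(self.width))
        return self.dout


class DualPortRam:
    """Two independent ports that access common data."""

    def __init__(self, size_bits, width_a, width_b):
        bits = [0] * size_bits
        self.port_a = Port(bits, width_a)
        self.port_b = Port(bits, width_b)


# Port A writes one 32-bit word; port B reads it back as four bytes.
ram = DualPortRam(size_bits=1024, width_a=32, width_b=8)
ram.port_a.clock(addr=0, we=True, din=0x11223344)
print([hex(ram.port_b.clock(addr=k)) for k in range(4)])
```

Because each port is clocked by its own method call, the two ports can be stepped in any order, which stands in for the individual clocks of the real block.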
7-Series DSP48E1 Slice

Why FPGA for Signal Processing? Communication?

7 Series Capability

DSP Performance through the DSP48E1 Slice
Virtex-6, Artix-7, Kintex-7, Virtex-7

Pre-Adder

Greater Flexibility with Fully Independent
Multipliers

25x18 Multiplier

Efficient Rounding Modes using Pattern
Matching

One Accumulator for each Multiplier

7-Series FPGA I/O

Wide range of voltages
o 1.2V to 3.3V operation
Wide I/O standards support
o Single ended and differential
o Referenced voltage inputs
o 3-state capability
Very high performance
o Up to 1600 Mbps LVDS
o Up to 1866 Mbps single-ended for DDR3
Easy memory interfacing
o Hardware support for QDRII+ and DDR3
Digitally controlled impedance
Power reduction features

XADC and AMS

XADC is a high-quality and flexible analog interface new to the 7-series
o Dual 12-bit 1 Msps ADCs, on-chip sensors, 17 flexible analog inputs, and track & holds with programmable signal conditioning
o 1V input range
o 16-bit resolution conversion
o Built-in digital gain and offset calibration
Analog Mixed Signal (AMS)
o Using the FPGA programmable logic to customize the XADC and replace other external analog functions; for example, linearization, calibration, filtering, and DC balancing to improve data conversion resolution
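
As a worked illustration of the numbers above, the Python sketch below converts a 12-bit code to volts over a 1 V range and shows how simple filtering in the logic (here, summing sixteen samples) yields a 16-bit word with finer steps. It assumes a unipolar, straight-binary transfer function; the function names are hypothetical, and the sketch makes no claim about how the on-chip conversion is actually implemented.

```python
ADC_BITS = 12        # converter resolution, from the slide
FULL_SCALE_V = 1.0   # 1 V input range, from the slide


def code_to_volts(code):
    """Convert an unsigned 12-bit code to volts (unipolar range assumed)."""
    if not 0 <= code < (1 << ADC_BITS):
        raise ValueError("code out of range for a 12-bit converter")
    return code * FULL_SCALE_V / (1 << ADC_BITS)


def accumulate_16(samples):
    """Sum sixteen 12-bit samples into one 16-bit word (a simple filter)."""
    if len(samples) != 16:
        raise ValueError("expected exactly 16 samples")
    total = sum(samples)
    if total > 0xFFFF:
        raise ValueError("samples exceed the 12-bit range")
    return total


def word16_to_volts(word):
    """Convert a 16-bit accumulated word to volts over the same range."""
    return word * FULL_SCALE_V / (1 << 16)


print(code_to_volts(1))      # size of one 12-bit step in volts
print(code_to_volts(2048))   # mid-scale

# A signal sitting between two codes toggles between them; the
# accumulated word resolves a level between the two 12-bit steps.
noisy = [2047, 2048] * 8
print(word16_to_volts(accumulate_16(noisy)))
```

One 12-bit step over 1 V is about 244 microvolts; the accumulated 16-bit word has steps sixteen times smaller, which is the sense in which filtering in the fabric can improve conversion resolution.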

7-Series FPGAs Clock Management
Global clock buffers
o High fanout clock distribution buffer
Low-skew clock distribution
o Regional clock routing
Clock regions
o Each clock region is 50 CLBs high and spans half the device
Clock management tile (CMT)
o One Mixed-Mode Clock Manager (MMCM) and one Phase-Locked Loop (PLL) in each CMT
o Performs frequency synthesis, clock de-skew, and jitter filtering
o High input frequency range
Simple design creation through the Clocking Wizard
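
Frequency synthesis in a PLL-based block follows a simple relation: the input clock is multiplied by M and divided by D to give the internal oscillator frequency, which is then divided by O for each output. The Python sketch below works through that arithmetic. The parameter names `mult`, `div`, and `out_div` are placeholders chosen for this example, and the device-specific limits on the oscillator frequency and on the allowed values are deliberately not checked.

```python
from fractions import Fraction


def synthesized_frequency(f_in_hz, mult, div, out_div):
    """Return (f_vco, f_out) in hertz for one synthesized output.

    f_vco = f_in * mult / div
    f_out = f_vco / out_div
    """
    if min(mult, div, out_div) <= 0:
        raise ValueError("multiplier and dividers must be positive")
    f_vco = Fraction(f_in_hz) * Fraction(mult) / Fraction(div)
    f_out = f_vco / Fraction(out_div)
    return f_vco, f_out


# Example: derive a 125 MHz clock from a 100 MHz reference.
f_vco, f_out = synthesized_frequency(100_000_000, mult=10, div=1, out_div=8)
print(float(f_vco) / 1e6, "MHz internal oscillator")
print(float(f_out) / 1e6, "MHz output")
```

Exact fractions are used so that ratios such as 100 MHz × 33 / 4 do not pick up rounding error; in practice the Clocking Wizard searches for legal values of these parameters.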

Zynq-7000 Family Highlights
Complete ARM®-based processing system
o Application Processor Unit (APU)
• Dual ARM Cortex™-A9 processors
• Caches and support blocks
o Fully integrated memory controllers
o I/O peripherals
Tightly integrated programmable logic
o Used to extend the processing system
o Scalable density and performance
Flexible array of I/O
o Wide range of external multi-standard I/O
o High-performance integrated serial transceivers
o Analog-to-digital converter inputs

The PS and the PL

The Zynq-7000 AP SoC architecture consists of two major sections
o PS: Processing system
• Dual ARM Cortex-A9 processor based
• Multiple peripherals
• Hard silicon core
o PL: Programmable logic
• Shares the same 7-series programmable logic as
– Artix™-based devices: Z-7010 and Z-7020 (high-range I/O banks only)
– Kintex™-based devices: Z-7030 and Z-7100 (mix of high-range and high-performance I/O banks)

INTEL® AGILEX™ FPGAS AND SOCS
The Intel® Agilex™ FPGA family leverages heterogeneous 3D system-in-package (SiP) technology to
integrate Intel's first FPGA fabric built on 10nm process technology and 2nd Gen Intel®
Hyperflex™ FPGA Architecture to deliver up to 40% higher performance or up to 40% lower
power for applications in Data Center, Networking, and Edge compute. Intel® Agilex™ SoC
FPGAs also integrate the quad-core Arm* Cortex-A53 processor to provide high system integration.

Xilinx ACAP

Built on 7nm FinFET technology, Versal ACAP is a fully software-programmable, heterogeneous compute platform that combines Scalar
Engines, Adaptable Engines, and Intelligent Engines to achieve dramatic performance improvements of up to 20X over
today's fastest FPGA implementations and over 100X over today's fastest CPU implementations, for Data Center,
wired network, 5G wireless, and automotive driver assist applications.

Xilinx ACAP

Types of Compute Engines

Xilinx ACAP

Heterogeneous Integration of Three Types of Programmable Engines

Xilinx ACAP

Xilinx Versal ACAP Functional Diagram

Device size
Usually measured in the number of transistors used in the device

This is not so helpful for reconfigurable devices, since the number of transistors is not the number of usable resources on the chip. For example, FPGAs are among the most complex chips (more complex than Pentium processors), but their capacity is smaller than that of their ASIC counterparts.

The capacity of an FPGA is usually measured in terms of the number of gate equivalents needed to implement a design.

A gate equivalent is a unit of measure: 1 gate equivalent = one 2-input NAND gate

A one-million-gate FPGA is able to implement the equivalent of a circuit containing 1 million 2-input NAND gates
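
The gate-equivalent arithmetic can be made concrete with a short Python sketch. It assumes one particular counting convention: every gate is costed by the number of 2-input NAND gates needed to build it from NANDs alone. The table `NAND2_COST` and the function `gate_equivalents` are hypothetical names for this example, and vendor gate counts are not necessarily derived this way.

```python
# Number of 2-input NAND gates needed to build each gate from NANDs only.
NAND2_COST = {
    "NAND": 1,
    "NOT": 1,   # NAND with both inputs tied together
    "AND": 2,   # NAND followed by NOT
    "OR": 3,    # two NOTs feeding a NAND
    "NOR": 4,   # OR followed by NOT
    "XOR": 4,   # classic four-NAND construction
    "XNOR": 5,  # XOR followed by NOT
}


def gate_equivalents(netlist):
    """Total gate equivalents for a netlist given as {gate_type: count}."""
    return sum(NAND2_COST[gate] * count for gate, count in netlist.items())


# A textbook full adder: two XORs, two ANDs, one OR.
full_adder = {"XOR": 2, "AND": 2, "OR": 1}
fa_cost = gate_equivalents(full_adder)
adder32_cost = 32 * fa_cost  # 32-bit ripple-carry adder

print(fa_cost)                    # gate equivalents per full adder
print(adder32_cost)               # gate equivalents per 32-bit adder
print(1_000_000 // adder32_cost)  # adders that fit in one million gates
```

Under this convention a full adder costs 15 gate equivalents and a 32-bit adder 480, so a one-million-gate budget corresponds to roughly two thousand such adders. How much of that budget a real design can use depends on how well it maps onto the device's LUTs and routing.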

The End

Questions?

Thank you for your attention
