Outline

Field Programmable Gate Arrays
  Historical perspective
  Programming technologies
  Architectures
    PALs, PLDs, and CPLDs
    FPGAs
      Programmable logic
      Interconnect network
      I/O buffers
      Specialized cores
  Programming interfaces
C. Stroud 8/06 FPGAs 1

History
Programmable Logic Arrays ~ 1970
  Implement any set of sum-of-products logic equations
  Incorporated in VLSI devices
Programmable Logic Devices ~ 1980
  MMI Programmable Array Logic (PAL)
    16L8: combinational logic only
    16R8: sequential logic only
  AMD 22V10 and Lattice 16V8
  Complex PLDs: arrays of PLDs with routing network
Field Programmable Gate Arrays ~ 1985
  Xilinx Logic Cell Array (LCA)
CPLD & FPGA architectures became similar ~ 2000
  Incorporation of RAMs and other specialized cores
  Programmable system-on-chip

Programming Technologies
PLAs were mask programmable
PALs used fuses for programming
Early PLDs & CPLDs used floating gate technology
  Erasable Programmable Read Only Memory (EPROM)
    Ultra-violet erasable (UVEPROM)
    Electrically erasable (EEPROM)
  Flash memory came later and was used for CPLDs
FPGAs used RAM for programming
Later trends
  Fuses were replaced with anti-fuses
    Better reliability
  Large CPLDs went to RAM-based programming

Programming Technologies
RAM
  Volatile: must configure after power-up
  In-System Re-programmable (ISR)
  Run-Time Reconfiguration (RTR)
    Dynamic reconfiguration while system is operating
Floating gate technologies
  Non-volatile but re-usable
    UV EPROM, EEPROM, and flash memory
  In-System Programmable (ISP)
    EEPROM and flash memory
  In-System Re-programmable (ISR)
    Flash memory
Fuse/anti-fuse
  Non-volatile but not re-usable
  One Time Programmable (OTP)

PALs
16L8: combinational logic
10 to 16 inputs, each with true and complement signal
2 to 8 outputs, each with
  7 product terms that can AND any of up to 16 inputs or their complements
  Tri-state control product term for inverting output buffer
When output is in tri-state, I/O pin can be used as input
  High impedance output with no signal driven
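
The 16L8 output structure described above can be sketched in a few lines of Python. This is a minimal behavioral model, not a description of the actual device fuse map: the literal encoding `(input_name, inverted)` and the function names are assumptions made for illustration.

```python
MAX_PRODUCT_TERMS = 7  # per output, as stated for the 16L8

def product_term(literals, inputs):
    """AND of the selected literals; each literal is (input_name, inverted)."""
    # An empty literal list evaluates to 1 in this model (nothing is ANDed in).
    return int(all(inputs[name] ^ inverted for name, inverted in literals))

def pal_output(product_terms, enable_term, inputs):
    """Return the pin value, or None when the tri-state buffer is disabled."""
    if len(product_terms) > MAX_PRODUCT_TERMS:
        raise ValueError("a 16L8 output has only 7 product terms")
    if not product_term(enable_term, inputs):
        return None  # high impedance: the pin is free to act as an input
    sum_of_products = int(any(product_term(pt, inputs) for pt in product_terms))
    return 1 - sum_of_products  # inverting output buffer

# Example: sum of products a*b' + a'*b, output enabled by input oe.
xor_terms = [[("a", 0), ("b", 1)], [("a", 1), ("b", 0)]]
enable = [("oe", 0)]
print(pal_output(xor_terms, enable, {"a": 1, "b": 0, "oe": 1}))
print(pal_output(xor_terms, enable, {"a": 1, "b": 0, "oe": 0}))
```

Because the buffer inverts, the pin carries the complement of the sum of products whenever the tri-state control product term is 1.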


PALs
16R8: sequential logic
8 inputs, each with true & complement
8 outputs, each with D flip-flop
  With feedback for FSMs
8 product terms that can AND any of:
  8 inputs or their complements
  8 feedbacks or their complements from D flip-flops
One clock for all FFs
One tri-state control for all outputs

PLDs
22V10 replaced all PALs
Combinational and/or sequential logic
  Macrocell program bits C0, C1
Up to 22 inputs w/complement
Up to 10 outputs, each with
  Macrocell
  8-16 product terms
  Tri-state control product term
Global
  preset & clear PTs
  clock

PLDs
16V8
Up to 16 inputs (bit & bit-bar)
Up to 8 outputs, each with
  8 product terms (PTs), or 7 with tri-state control (PT)
  Macrocell similar to 22V10
    More programming options
Ability to select adjacent pin
  Allows embedded registers

CPLDs
Cypress Semiconductor 374 CPLD architecture
84-pin package w/~6 Vcc and 8 Gnd pins
36 inputs to AND-plane w/84 PTs and partially programmable OR-plane

CPLDs
An array of PLDs
Global routing resources for connections
  PLDs to other PLDs
  PLDs to/from I/O pins
Example: Cypress 39K
  Each Logic Block (LB) similar to a 22V10
  Each cluster of 8 LBs has two 8K RAMs & one 4K dual-port RAM/FIFO
  Programmable Interconnect Modules (PIMs) provide interconnections
  Array of up to 24 clusters with global routing
[Figure: Cypress 39K block diagram. Clusters of LBs around a PIM, each with 8192-bit RAMs and a 4096-bit dual-port RAM/FIFO; I/O blocks around the perimeter; 4 PLLs & clock mux; GCLK[3:0] and CNTL[3:0] signals.]

Ranges of Resources
FPGA Resource                                     Small FPGA   Large FPGA
Logic              PLBs per FPGA                  256          25,920
                   LUTs and flip-flops per PLB    1            8
Routing            Wire segments per PLB          45           406
                   PIPs per PLB                   139          3,462
Specialized Cores  Bits per memory core           128          36,864
                   Memory cores per FPGA          16           576
                   DSP cores                      0            512
Other              Input/output cells             62           1,200
                   Configuration memory bits      42,104       79,704,832

Basic PLB Architecture
Look-up Table (LUT) implements truth table
Memory elements:
  Flip-flop/latch
  Some FPGAs: LUTs can also implement small RAMs
Carry & control logic implements fast adders/subtractors
[Figure: PLB block diagram. Input[1:4] feeds the LUT/RAM; carry in and carry out pass through the Carry & Control Logic; the 3 Control inputs (clock, enable, set/reset) drive the Flip-flop/Latch; outputs are Output and Q output.]

A Simple PLB
Two 3-input LUTs
  Can implement any 4-input combinational logic function
1 flip-flop
  Programmable:
    Active levels
    Clock edge
    Set/reset
22 configuration memory bits
  8 per LUT
    C0-7
    S0-7
  6 controls
    CB0-5
[Figure: simple PLB schematic. LUT C (8x1, bits C0-C7) and LUT S (8x1, bits S0-S7) are addressed by D2-0 (addresses 000 to 111); multiplexers Smux, CEmux, SRmux, and SOmux and the flip-flop (FF) are controlled by configuration bits CB0-CB5; inputs are D3, D2-0, Clock Enable, Set/Reset, and Clock; outputs are out, Cout, and Sout. CB = Configuration Memory Bit.]
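
The claim that two 3-input LUTs can implement any 4-input function follows from splitting the truth table on the fourth input. The sketch below assumes D3 selects between the two LUT outputs while D2-0 addresses both LUTs; the function names and the parity example are illustrative only.

```python
from itertools import product

def split_into_luts(f):
    """Build two 8x1 LUT contents from a 4-input function f(d3, d2, d1, d0)."""
    addresses = list(product((0, 1), repeat=3))  # (d2, d1, d0) in counting order
    lut_when_d3_is_0 = [f(0, d2, d1, d0) for d2, d1, d0 in addresses]
    lut_when_d3_is_1 = [f(1, d2, d1, d0) for d2, d1, d0 in addresses]
    return lut_when_d3_is_0, lut_when_d3_is_1

def plb_output(luts, d3, d2, d1, d0):
    """D2-0 addresses both LUTs; D3 picks which LUT drives the output."""
    address = (d2 << 2) | (d1 << 1) | d0
    return luts[d3][address]

def parity4(d3, d2, d1, d0):
    """Example 4-input function: odd parity."""
    return d3 ^ d2 ^ d1 ^ d0

luts = split_into_luts(parity4)
print(luts[0])  # LUT contents used when D3 = 0
print(luts[1])  # LUT contents used when D3 = 1
```

The two lists hold 16 bits in total, which matches the 8 configuration bits per LUT listed above.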

Combinational Logic Functions
Gates are combined to create complex circuits
Multiplexer example
  If S = 0, Z = A
  If S = 1, Z = B
  Very common digital circuit
  Heavily used in FPGAs
    S input controlled by configuration memory bit
  We'll see it again
Truth table
  S A B | Z
  0 0 0 | 0
  0 0 1 | 0
  0 1 0 | 1
  0 1 1 | 1
  1 0 0 | 0
  1 0 1 | 1
  1 1 0 | 0
  1 1 1 | 1
[Figure: gate-level schematic and logic symbol of the 2-to-1 multiplexer with data inputs A (0) and B (1), select S, and output Z.]
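
As a quick check of the multiplexer example, the following sketch builds the 2-to-1 multiplexer from AND, OR, and NOT operations and prints its truth table in S, A, B order. The function names are chosen here for illustration.

```python
from itertools import product

def mux2(s, a, b):
    """2-to-1 multiplexer from gates: Z = (NOT S AND A) OR (S AND B)."""
    return ((1 - s) & a) | (s & b)

def truth_table():
    """Rows of (S, A, B, Z) in counting order 000 to 111."""
    return [(s, a, b, mux2(s, a, b)) for s, a, b in product((0, 1), repeat=3)]

for s, a, b, z in truth_table():
    print(s, a, b, "|", z)
```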

Look-up Tables
Recall multiplexer example
Configuration memory holds outputs for truth table
Internal signals connect to control signals of multiplexers to select value of truth table for any given input value
Truth table
  S A B | Z
  0 0 0 | 0
  0 0 1 | 0
  0 1 0 | 1
  0 1 1 | 1
  1 0 0 | 0
  1 0 1 | 1
  1 1 0 | 0
  1 1 1 | 1
[Figure: 8x1 look-up table. Eight configuration memory cells hold the Z column of the multiplexer truth table (0 0 1 1 0 1 0 1); a tree of 2-to-1 multiplexers controlled by S, A, and B selects one cell as output Z.]
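
A LUT of this kind can be modeled as a list of configuration bits read through a tree of 2-to-1 multiplexers. In the sketch below, the assignment of B, A, and S to the three multiplexer levels is an assumption; it is chosen so that the cell index equals the binary value SAB.

```python
def mux2(select, in0, in1):
    """2-to-1 multiplexer: in0 when select is 0, in1 when select is 1."""
    return in1 if select else in0

def lut3(config_bits, s, a, b):
    """8x1 LUT: three levels of 2-to-1 muxes pick one of 8 configuration bits."""
    level1 = [mux2(b, config_bits[i], config_bits[i + 1]) for i in range(0, 8, 2)]
    level2 = [mux2(a, level1[i], level1[i + 1]) for i in range(0, 4, 2)]
    return mux2(s, level2[0], level2[1])

# Z column of the multiplexer truth table, in SAB order 000 to 111.
MUX_CONFIG = [0, 0, 1, 1, 0, 1, 0, 1]

for s in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            print(s, a, b, "|", lut3(MUX_CONFIG, s, a, b))
```

Loading a different list of 8 bits turns the same hardware into any other 3-input function, which is the point of the look-up table.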

Look-up Table Based RAMs
Normal LUT mode performs read operations
Address decoder with write enable generates clock signals to latches for write operations
Small RAMs, but can be combined for larger RAMs
[Figure: LUT-based RAM. An Address Decoder driven by In0, In1, In2, and Write Enable generates clocks ck0-ck7 for the eight latches; Data In feeds the latches; the multiplexer tree addressed by In0-In2 reads out the selected bit.]
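
The read and write paths of a LUT-based RAM can be sketched as below. This is a behavioral model under stated assumptions: In2 is taken as the most significant address bit, and the 16x1 composition using one extra address bit is only one possible way of combining small RAMs.

```python
class LutRam8x1:
    """8x1 RAM: eight latches, read through the LUT path, written via a decoder."""

    def __init__(self):
        self.latches = [0] * 8

    @staticmethod
    def _address(in2, in1, in0):
        return (in2 << 2) | (in1 << 1) | in0

    def read(self, in2, in1, in0):
        """Normal LUT mode: the inputs select one latch."""
        return self.latches[self._address(in2, in1, in0)]

    def write(self, in2, in1, in0, data_in, write_enable):
        """Address decoder gated by write enable produces ck0..ck7."""
        addr = self._address(in2, in1, in0)
        clocks = [int(bool(write_enable) and i == addr) for i in range(8)]
        for i, ck in enumerate(clocks):
            if ck:
                self.latches[i] = data_in
        return clocks

class LutRam16x1:
    """Two 8x1 RAMs combined; the extra address bit in3 selects the bank."""

    def __init__(self):
        self.banks = [LutRam8x1(), LutRam8x1()]

    def read(self, in3, in2, in1, in0):
        return self.banks[in3].read(in2, in1, in0)

    def write(self, in3, in2, in1, in0, data_in, write_enable):
        return self.banks[in3].write(in2, in1, in0, data_in, write_enable)

ram = LutRam8x1()
print(ram.write(1, 0, 1, data_in=1, write_enable=1))
print(ram.read(1, 0, 1))
```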

Interconnect Network
Wire segments of varying length
  xN = N PLBs in length
    1, 2, 4, and 6 are most common
  xH = half the array in length
  xL = length of full array
Programmable Interconnect Points (PIPs)
  Also known as Configurable Interconnect Points (CIPs)
  Transmission gate connects to 2 wire segments
  Controlled by configuration memory bit
    0 = wires disconnected
    1 = wires connected
[Figure: transmission gate between Wire A and Wire B, controlled by a config bit.]

PIPs
Break-point PIP
  Connect or isolate 2 wire segments
Cross-point PIP
  Turn corners
Multiplexer PIP
  Directional and buffered
  Select 1-of-N inputs for output
    Decoded MUX PIP: N config bits select from 2^N inputs
    Non-decoded MUX PIP: 1 config bit per input
Compound cross-point PIP
  Collection of 6 break-point PIPs
    Can route two isolated signal nets
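
The difference between decoded and non-decoded multiplexer PIPs comes down to how many configuration bits are spent per input. The sketch below is illustrative: the bit ordering (most significant bit first) and the handling of zero or multiple selected inputs are assumptions, and the wire names are hypothetical.

```python
def decoded_mux_pip(config_bits, inputs):
    """N config bits form a binary index into 2**N inputs."""
    if len(inputs) != 2 ** len(config_bits):
        raise ValueError("decoded MUX needs 2**N inputs for N config bits")
    index = 0
    for bit in config_bits:  # most significant bit first (assumed ordering)
        index = (index << 1) | bit
    return inputs[index]

def non_decoded_mux_pip(config_bits, inputs):
    """One config bit per input; a single set bit selects that input."""
    if len(config_bits) != len(inputs):
        raise ValueError("non-decoded MUX needs one config bit per input")
    selected = [value for bit, value in zip(config_bits, inputs) if bit]
    # Zero or several set bits would leave the output undriven or in
    # contention; this model reports that case as None.
    return selected[0] if len(selected) == 1 else None

def config_bits_needed(num_inputs, decoded):
    """Configuration bits required to select among num_inputs wires."""
    return (num_inputs - 1).bit_length() if decoded else num_inputs

wires = ["w0", "w1", "w2", "w3"]
print(decoded_mux_pip([1, 0], wires))
print(non_decoded_mux_pip([0, 0, 1, 0], wires))
print(config_bits_needed(8, decoded=True), config_bits_needed(8, decoded=False))
```

The decoded form saves configuration memory as the input count grows, at the cost of decode logic in front of the multiplexer.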

Spartan 3 Routing Resources
Switch matrix
  Over 2,400 PIPs
  Mostly MUX PIPs
PLB consists of 4 slices
x6 wire segments
x2 wire segments
xH & xL wire segments
Over 450 total wire segments in PLB

FPGAs
Recent trend: incorporate specialized cores
RAMs: single-port, dual-port, FIFOs
  128 bits to 36K bits per RAM
  4 to 575 per FPGA
DSPs: 18x18-bit multiplier, 48-bit accumulator, etc.
  Up to 512 per FPGA
Microprocessors and/or microcontrollers
  Up to 2 per FPGA
    Hard core processor
  Support soft core processors
    Synthesized from HDL into programmable resources

FPGA Architectures
4000/Spartan
  NxN array of unit cells
    Unit cell = CLB + routing
  Special routing along center axes
  I/O cells around perimeter
Virtex/Spartan-2
  MxN array of unit cells
  Added block 4K RAMs at edges
Virtex-2/Spartan-3
  Block 18K RAMs in array
  Added 18x18 multipliers with each RAM
  Added PowerPCs in Virtex-2 Pro
Virtex-4/Virtex-5
  Added 48-bit DSP cores w/multipliers
  I/O cells along columns for BGA
[Figure: array floorplan for each family, with blocks labeled PC.]
C. Stroud 9/07 FPGAs


Specialized Cores
[Figure: bar chart of the number of RAMs/multipliers per device (axis 0 to 450).]
  Virtex and Spartan II: 4K-bit RAMs
  Virtex II and Spartan 3: 18K-bit RAMs and 18-bit multipliers
  Devices shown: 2S15 to 2S200, V50 to V1000, 3S50 to 3S5000, 2V40 to 2V8000, 2VP2 to 2VP100

Programmable RAMs
18 Kbit dual-port RAM
Each port independently configurable as
  512 words x 36 bits
    32 data bits + 4 parity bits
  1K words x 18 bits
    16 data bits + 2 parity bits
  2K words x 9 bits
    8 data bits + 1 parity bit
  4K words x 4 bits (no parity)
  8K words x 2 bits (no parity)
  16K words x 1 bit (no parity)
Each port has independently programmable
  clock edge
  active levels for write enable, RAM enable, reset
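
The six port configurations all address the same storage; only the narrow ones give up the parity bits. The short calculation below, written for illustration, multiplies out each configuration listed above.

```python
# (words, bits per word, parity bits per word) for each port configuration
CONFIGURATIONS = [
    (512, 36, 4),
    (1024, 18, 2),
    (2048, 9, 1),
    (4096, 4, 0),
    (8192, 2, 0),
    (16384, 1, 0),
]

def bit_budget(words, width, parity_per_word):
    """Return (data bits, parity bits) stored in one configuration."""
    data_bits = words * (width - parity_per_word)
    parity_bits = words * parity_per_word
    return data_bits, parity_bits

for words, width, parity in CONFIGURATIONS:
    data_bits, parity_bits = bit_budget(words, width, parity)
    print(f"{words:>5} x {width:<2}: {data_bits} data + {parity_bits} parity bits")
```

Every configuration holds 16K data bits; the three widest add 2K parity bits, which accounts for the 18 Kbit total.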

Specialized Cores
[Figure: bar chart of the number of cores per Virtex 4 device (axis 0 to 600), showing 18K bit RAMs and Xtreme DSPs.]
  Devices shown: 4VLX15, 4VLX25, 4VLX40, 4VLX60, 4VLX80, 4VLX100, 4VLX160, 4VLX200, 4VSX25, 4VSX35, 4VSX55, 4VFX12, 4VFX20, 4VFX40, 4VFX60, 4VFX100, 4VFX140

FPGA Configuration Memories
PLB addressable
  Good for partial reconfiguration
  X-Y coordinates of PLB location to be written
    Requires tag to identify which resources will be configured
Frame addressable
  Vertical or horizontal frame
  Access to all PLBs in frame
    Only portion of logic and routing resources accessible in a given frame
  Many frames to configure PLBs
    Major address for column, minor address for frame

Frame Length
[Figure: bar chart of the number of 32-bit words per frame (axis 0 to 350) for devices XC2S15 through XC2S150, XCV50 through XCV3200E, XC2VP2 through XC2VP100, and XC3S50 through XC3S5000.]
Very large frame lengths for large devices
FPGA Verification Course, Day #1

Frames vs. Column Type
[Figure: bar chart of the number of frames (axis 0 to 80) per column type (CLB, IOB/TERM, IOI/DSP, RAM routing, RAM content, center) for Virtex1/Spartan2, Virtex2pro, Spartan3, and Virtex4.]

Virtex-4 Architectures
[Figure: Virtex-4 array layouts, with the PowerPC location marked.]

Tile Map for Virtex-4 LX15
[Figure: tile map showing tile coordinates and XDL coordinates; legend: IOBs, RAMs, CLBs, DSPs, center.]

Configuration Memory
Frame order
  CLBs, IOBs, DSPs, & center column form main portion
  BRAMs come after
Frames span 16 rows (V5=20)
  2.5 words per row (V5=2)
All columns have INT switch box routing
  3,312 PIPs
  first 18.5 frames
Total frames/column
  CLBs = 22 frames
  DSPs = 21 frames
  Center column = 33 frames
  IOBs = 30 frames
    Left & right cols in LX & SX
  BRAMs & GTs = 20 frames
    2 frames at end of row
N = # columns
X = (# rows/16)-1
[Figure: frame ordering diagram with positions labeled 1 to N, N+1 to 2N, XN+1 to XN+N, (X+1)N+1 to (X+1)N+N, (X+2)N+1 to (X+2)N+N, and (2X+1)N+1 to (2X+1)N+N.]

Virtex-5 Architectures
Similar architecture, frame structure and order
  I/O cells not along outside column on right side
  Center column (Xs) not in center of array
    More columns to right side of center column
Similar top/bottom and config row format
41 words (32-bit) per frame
  Hamming bits in middle word of frame
Column layout by part
  Family     Part   #rows   Columns
  LX & LXT   30     80      O 4 R 2 D 8 X 8 R 4 O 4 T C
  LX & LXT   50     120     O 4 R 2 D 8 X 8 R 4 O 4 T C
  LX & LXT   85     120     O 4 R 10 R 2 D 8 X 12 R 10 R 4 O 4 T C
  LX & LXT   110    160     O 4 R 10 R 2 D 8 X 12 R 10 R 4 O 4 T C
  LX & LXT   220    160     O 4 R 22 R 2 D 2 D 2 R 20 X 20 R 6 R 22 R 4 O 4 T C
  LX & LXT   330    240     O 4 R 22 R 2 D 2 D 2 R 20 X 20 R 6 R 22 R 4 O 4 T C
  SXT        35     80      O 4 R 2 D 2 D 2 R 2 D 2 D 2 R 2 X 2 R 2 D 2 D 2 R 4 O 4 T C
  SXT        50     120     O 4 R 2 D 2 D 2 R 2 D 2 D 2 R 2 X 2 R 2 D 2 D 2 R 4 O 4 T C
  SXT        95     160     O 4 R 2 D 2 D 2 R 2 D 2 D 2 R 2 D 2 D 2 R 2 X 2 R 2 D 2 D 2 R 2 D 2 D 2 R 4 O 4 T C
Legend: # = # CLB cols, D = DSPs, R = RAMs, O = I/O cells, X = IO & DCM, T/C = T only
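
The column strings on this slide, such as O 4 R 2 D 8 X 8 R 4 O 4 T C, can be tallied mechanically. The sketch below assumes the legend reading in which each number is a run of CLB columns and each letter is one special column; the dictionary keys and function names are made up for illustration, and only three of the layouts are included.

```python
import re
from collections import Counter

LAYOUTS = {
    ("LX & LXT", 30): "O 4 R 2 D 8 X 8 R 4 O 4 T C",
    ("LX & LXT", 85): "O 4 R 10 R 2 D 8 X 12 R 10 R 4 O 4 T C",
    ("SXT", 35): "O 4 R 2 D 2 D 2 R 2 D 2 D 2 R 2 X 2 R 2 D 2 D 2 R 4 O 4 T C",
}

def clb_columns(layout):
    """Total CLB columns: the sum of every number in the layout string."""
    return sum(int(n) for n in re.findall(r"\d+", layout))

def special_columns(layout):
    """Count of each lettered column type (D, R, O, X, T, C)."""
    return Counter(re.findall(r"[A-Z]", layout))

for (family, part), layout in LAYOUTS.items():
    print(family, part, clb_columns(layout), dict(special_columns(layout)))
```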

Virtex-5 FX30T
5,120 slices
  4 FFs & 6-input LUTs
68 DPRAMs/FIFOs
  36 Kbits
64 DSPs
  24x18 mult & 48-bit ALU
1 PowerPC 440
1 PCI Express
4 Ethernet MACs
8 Gigabit Xceivers

FPGA Configuration Memories
PLB addressable
  Good for partial reconfiguration
  X-Y coordinates of PLB location to be written
    Requires tag to identify which resources will be configured
Frame addressable
  Vertical or horizontal frame
  Access to all PLBs in frame
    Only portion of logic and routing resources accessible in a given frame
  Many frames to configure PLBs
    Major address for column, minor address for frame
Hybrid
  i.e.: Virtex-4, Virtex-5, Virtex-6


Configuration Interfaces
Master FPGA retrieves its own configuration from ROM after power-up
  Serial or Parallel options
Slave FPGA configured by external source (i.e., a µP)
  Serial or Parallel options
  Used for dynamic reconfiguration
  Can also read configuration memory contents
Boundary Scan Interface
  4-wire IEEE standard serial interface for testing
  Write and read access to configuration memory
    Not available in all FPGAs
    Used for dynamic partial reconfiguration
  Interfaces to FPGA core
    Not available in all FPGAs
    Connections between Boundary Scan Interface and internal routing network and PLBs (Xilinx provides 2-4 of these ports)
Other configuration interfaces in some FPGAs
[Figure: configuration daisy chain. A PROM with configuration data sends data out to Din of the FPGA in Master Mode; CCLK (clock) connects the PROM, the master, and two FPGAs in Slave Mode; each FPGA has Din and Dout pins.]

Xilinx Configuration Interface Pins


Spartan 3 Master Modes


Master mode
Configuration sequence during power-up of device
  Typically from
    Serial EPROM: Master Serial
    Parallel EPROM: Master Parallel
      8-bit
      32-bit

Spartan 3 Slave Configuration


Spartan 3 Daisy Chains


Configuration Techniques
Full configuration & readback
  Simple configuration interface
  Internal automatic calculation of frame address
  Long download time for large FPGAs
Partial reconfiguration & readback
  Only change portions of configuration memory with respect to reference design
    Reduces download time for reconfiguration
  Requires more complicated interface
    Command Register (CMR)
    Frame Length Register (FLR)
    Frame Address Register (FAR)
    Frame Data Register
      Input (FDRI) for download
      Output (FDRO) for readback (note separate access)
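
The saving from partial reconfiguration can be illustrated by comparing a modified design against the reference design frame by frame. This is a toy model: frames are lists of words, frame addresses are plain list indices rather than real FAR values, and command overhead is ignored.

```python
def changed_frames(reference, modified):
    """Return {frame_address: frame_words} for frames that differ from the reference."""
    if len(reference) != len(modified):
        raise ValueError("both designs must cover the same frames")
    return {
        address: frame
        for address, (ref_frame, frame) in enumerate(zip(reference, modified))
        if frame != ref_frame
    }

def words_to_download(num_frames, frame_length_words):
    """Frame data words sent for the given number of frames."""
    return num_frames * frame_length_words

FRAME_WORDS = 37  # hypothetical device with 37-word frames
reference = [[0] * FRAME_WORDS for _ in range(10)]
modified = [list(frame) for frame in reference]
modified[3][5] = 0xDEADBEEF  # one word changes in frame 3

partial = changed_frames(reference, modified)
print(sorted(partial))
print(words_to_download(len(modified), FRAME_WORDS), "words for full configuration")
print(words_to_download(len(partial), FRAME_WORDS), "words for partial reconfiguration")
```

Only the frames that differ need to be written, each preceded by its frame address, which is why the interface has to expose the FAR and FDRI registers.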

Configuration Techniques
Compressed configuration
  Requires multiple frame write capability
    Write identical frames of config data to multiple frame addresses
  Extension of partial reconfiguration interface capabilities
  Frame address is much smaller than frame of configuration data
  Reduces download time for initial configuration depending on
    Regularity of system function design
    % utilization of array
      Unused portions written with default configuration data
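
A rough estimate of what multiple frame write buys can be sketched by grouping identical frames. The cost model here is an assumption made for illustration: each distinct frame is sent once, followed by one address word per frame location that receives it, and command overhead is ignored.

```python
from collections import defaultdict

def group_identical_frames(frames):
    """Map each distinct frame content to the list of addresses that use it."""
    groups = defaultdict(list)
    for address, frame in enumerate(frames):
        groups[tuple(frame)].append(address)
    return dict(groups)

def uncompressed_words(frames):
    """Every frame is downloaded in full."""
    return sum(len(frame) for frame in frames)

def compressed_words(frames, address_words=1):
    """Each distinct frame once, plus an address per destination (assumed cost)."""
    groups = group_identical_frames(frames)
    return sum(len(data) + address_words * len(addresses)
               for data, addresses in groups.items())

FRAME_WORDS = 37  # hypothetical frame length
default_frame = [0] * FRAME_WORDS
frames = [list(default_frame) for _ in range(8)]
frames[2][0] = 0x1234  # two frames carry real logic,
frames[5][7] = 0x5678  # the other six stay at default values

print(uncompressed_words(frames))
print(compressed_words(frames))
```

The more of the array that is unused or regular, the more frames share the same contents and the larger the saving.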

Full Configuration Example
Dummy Word 0xFFFFFFFF
Synchronize Word 0xAA995566
CMD Write 0x30008001
  Reset CRC 0x00000007
FLR Write 0x30016001
  FLR = 0x00000024
  Frame length = 37 words
    1,184 bits
    32 bits/word
COR Write 0x30012001
  COR Write 0x00003FE5
IDCODE Write 0x3001C001
  Device ID = 0x0140D093 (3S50)
MASK Write 0x3000C001
  MASK = 0x00000000
CMD Write 0x30008001
  Switch CCLK 0x00000009
FAR Write 0x30002001
  FAR = 0x00000000 (full config)
CMD Write 0x30008001
  Write CFG 0x00000001
FDRI Write 0x30004000
  # words to write 0x50003555
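
The header words in this sequence can be taken apart with a few shifts and masks. The sketch below assumes the type 1 / type 2 packet layout described in Xilinx configuration documentation (type in bits 31:29, opcode in bits 28:27, register address in bits 26:13 and word count in bits 10:0 for type 1, word count in bits 26:0 for type 2). The register numbers in the table are read off the header words on this slide, and the relation frame length = FLR + 1 is inferred from FLR = 0x24 giving 37 words.

```python
# Register addresses as they appear in the header words of the example.
REGISTERS = {0: "CRC", 1: "FAR", 2: "FDRI", 4: "CMD", 6: "MASK",
             9: "COR", 11: "FLR", 14: "IDCODE"}
OPCODES = {0: "noop", 1: "read", 2: "write"}

def decode_header(word):
    """Split a 32-bit packet header into its fields (assumed layout)."""
    packet_type = (word >> 29) & 0x7
    opcode = OPCODES.get((word >> 27) & 0x3, "reserved")
    if packet_type == 1:
        address = (word >> 13) & 0x3FFF
        return {"type": 1, "opcode": opcode,
                "register": REGISTERS.get(address, hex(address)),
                "word_count": word & 0x7FF}
    if packet_type == 2:
        # Type 2 headers carry only a large word count for the previous register.
        return {"type": 2, "opcode": opcode, "register": None,
                "word_count": word & 0x7FFFFFF}
    raise ValueError(f"not a packet header: {word:#010x}")

def frame_length_bits(flr_value, bits_per_word=32):
    """Frame length in bits, assuming frame length in words = FLR + 1."""
    return (flr_value + 1) * bits_per_word

for header in (0x30008001, 0x30016001, 0x30004000, 0x50003555):
    print(f"{header:#010x}", decode_header(header))
print(frame_length_bits(0x24))
```

Under this reading, the FDRI write is a type 1 header with a zero word count followed by a type 2 header giving 0x3555 = 13,653 words, which is 369 frames of 37 words.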

Xilinx ASCII Bitstream
  Created by Bitstream I.32
  Design name: s3mod7.ncd
  Architecture: spartan3
  Part: 3s50tq144
  Date: Tue Sep 04 15:50:09 2007
  Bits:
  (start of actual configuration data)

Partial Reconfiguration Example
Dummy Word 0xFFFFFFFF
Synchronization Word 0xAA995566
CMD Write 0x30008001
  Reset CRC 0x00000007
IDCODE Write 0x3001C001
  Device ID = 0x0140D093 (3S50)
COR Write 0x30012001
  COR Write Packet Data 0x00003FE5
CMD Write 0x30008001
  Shutdown 0x0000000B
CRC Write 0x30000001
  CRC = 0x00002CE9
CMD Write 0x30008001
  AGhigh 0x00000008
CMD Write 0x30008001
  WCFG 0x00000001
FAR Write 0x30002001
  FAR = 0x00080000 (partial config)
Part Reconfig Reg Write 0x3001E001
  Null 0x00000000
FDRI Write 0x300042E4
  # words to write 0x000002E4
ASCII bitstream
  Bits: 26656
  11111111111111111111111111111111
  10101010100110010101010101100110
  00110000000000001000000000000001
  00000000000000000000000000000111
  00110000000000011100000000000001
  00000001010000001101000010010011
  00110000000000010010000000000001
  01000000000000000011111111100101
  00110000000000001000000000000001
  00000000000000000000000000001011
  00110000000000000000000000000001
  00000000000000000010110011101001
  4 NOOPs 0x20000000
  00110000000000001000000000000001
  00000000000000000000000000001000
  00110000000000001000000000000001
  00000000000000000000000000000001
  00110000000000000010000000000001
  00000000000010000000000000000000
  00110000000000011110000000000001
  00000000000000000000000000000000
  16 NOOPs 0x20000000
  00110000000000000100001011100100
  (start of actual configuration data)