You are on page 1of 40

PCI Express Basics

Ravi Budruk
Senior Staff Engineer and Partner
MindShare, Inc.

Copyright 2007, PCI-SIG, All Rights Reserved

PCI Express Introduction


PCI Express architecture is a high performance, IO
interconnect for peripherals in
computing/communication platforms
Evolved from PCI and PCI-X architectures
9 Yet PCI Express architecture is significantly different from its
predecessors PCI and PCI-X

PCI Express is a serial point-to-point interconnect


between two devices
Implements packet based protocol for information
transfer
Scalable performance based on number of signal
Lanes implemented on the PCI Express interconnect

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

PCI Express Terminology


PCI Express Device A

Signal

Link

Wire

Lane

PCI Express Device B


PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

PCI Express Throughput


Link Width

x1

x2

x4

x8

x12

x16

x32

Aggregate
BW
(GBytes/s)

0.5

16

Assumes 2.5 GT/s signaling in each direction


80% BW utilized due to 8b/10b encoding overhead
Aggregate bandwidth implies simultaneous traffic in both directions
Peak bandwidth is higher than any bus available

PCIe 2.0 PHYs may optionally support 5 GT/s signaling, thus


doubling above bandwidth numbers

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

PCI Express Features

Point-to-point connection
Serial bus means fewer pins
Scaleable: x1, x2, x4, x8, x12, x16, x32
Dual Simplex connection
2.5VGT/s transfer/direction/s
Packet based transaction protocol
PCIe
Device
A

Packet

Link (x1, x2, x4, x8, x12, x16 or x32)

PCIe
Device
B

Packet
PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

Differential Signaling
Electrical characteristics of PCI Express signal
9 Differential signaling
Transmitter Differential Peak voltage = 0.4 - 0.6 V
Transmitter Common mode voltage = 0 - 3.6 V
DVcm

V Diffp
D+

Two devices at opposite ends of a Link may support


different DC common mode voltages

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

Example PCI Express


Topology
CPU

Root Complex
PCIe

PCIe

PCIe
PCIe
Switch
Endpoint
PCIe

PCIe
Endpoint

PCIe

PCIe
Endpoint
PCIe

Legacy
Endpoint

Memory

PCIe
Bridge To
PCI/PCI-X

PCI/PCI-X

Legend
PCI Express Device Downstream Port
PCI Express Device Upstream Port
PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

Example PCI Express


Topology Root & Switch
CPU Bus

CPU
Root
RCRB

Bus 0

Root Complex
PCIe

PCIe

PCIe
PCIe
Switch
Endpoint
PCIe

PCIe
Endpoint

Virtual
PCI
Bridge

Virtual
PCI
Bridge

PCIe

PCI Express Links

PCIe
Bridge To
PCI/PCI-X

Switch
PCI/PCI-X

Legend
PCI Express Device Downstream Port
PCI Express Device Upstream Port
PCI-SIG Developers Conference

Virtual
PCI
Bridge

PCIe

PCIe
Endpoint

Legacy
Endpoint

Memory

Copyright 2007, PCI-SIG, All Rights Reserved

Virtual
PCI
Bridge

Virtual
PCI
Bridge

Virtual
PCI
Bridge

Virtual
PCI
Bridge

Example Low Cost


PCI Express System
Processor
FSB
PCI Express
GFX
GFX

HDD

Root Complex

PCI Express

Serial ATA

IO Controller Hub
(ICH)

LPC

PCI-SIG Developers Conference

IEEE
1394
Slot

S
IO

PCI Express
Link

Slots

PCI

USB 2.0

COM1
COM2

DDR
SDRAM

GB
Ethernet

Add-In Add-In Add-In


PCI Express

Copyright 2007, PCI-SIG, All Rights Reserved

Example PCI Express


Server System
Processor

Processor
FSB

PCI Express
GFX
GFX

Root Complex

DDR
SDRAM

Endpoint
InfiniBand
Switch
Out-of-Box

InfiniBand

Switch

10Gb
Ethernet

Fiber
Channel

Switch

Endpoint

Endpoint

RAID Disk array


Add-In

Switch

10Gb
Ethernet
Endpoint

PCI Express
Link

SCSI
Endpoint
PCI

Add-In

Gb
Ethernet
Endpoint

PCI-SIG Developers Conference

PCI Express
to-PCI
Endpoint

S
IO
COM1
COM2

Copyright 2007, PCI-SIG, All Rights Reserved

IEEE
1394

Slots

10

Transaction Types,
Address Spaces

Request are translated to one of four transaction types by


the Transaction Layer:
1.

2.

3.

4.

Memory Read or Memory Write. Used to transfer data from or to a


memory mapped location
The protocol also supports a locked memory read transaction variant.
I/O Read or I/O Write. Used to transfer data from or to an I/O location
These transactions are restricted to supporting legacy endpoint
devices.
Configuration Read or Configuration Write. Used to discover device
capabilities, program features, and check status in the 4KB PCI Express
configuration space.
Messages. Handled like posted writes. Used for event signaling and
general purpose messaging.

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

11

PCI Express TLP Types


Description

Abbreviated
Name

Memory Read Request

MRd

Memory Read Request Locked Access

MRdLk

Memory Write Request

MWr

IO Read Request

IORd

IO Write Request

IOWr

Configuration Read Request Type 0 and Type 1

CfgRd0, CfgRd1

Configuration Write Request Type 0 and Type 1

CfgWr0, CfgWr1

Message Request without Data Payload

Msg

Message Request with Data Payload

MsgD

Completion without Data (used for IO, configuration write completions and read
completion with error completion status)

Cpl

Completion with Data (used for memory, IO and configuration read completions)

CplD

Completion for Locked Memory Read without Data (used for error status)

CplLk

Completion for Locked Memory Read with Data

CplDLk

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

12

Three Methods For


Packet Routing
Each request or completion header is tagged as to its type, and
each of the packet types is routed based on one of three
schemes:
9 Address Routing
9 ID Routing
9 Implicit Routing

Memory and IO requests use address routing.


Completions and Configuration cycles use ID routing.
Message requests have selectable routing based on a 3-bit
code in the message routing sub-field of the header type field.

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

13

Programmed I/O Transaction


Processor

Processor
FSB

MRd
Requester:
-Step 1: Root Complex (requester)
initiates Memory Read Request (MRd)
-Step 4: Root Complex receives CplD

Root Complex

DDR
SDRAM

CplD

MRd
Switch A

Switch C

MRd
CplD
Switch B

Endpoint

PCI-SIG Developers Conference

Endpoint

CplD

MRd
Endpoint

Endpoint

Endpoint

Completer:
-Step 2: Endpoint (completer)
receives MRd
-Step 3: Endpoint returns
Completion with data (CplD)

Copyright 2007, PCI-SIG, All Rights Reserved

14

DMA Transaction
Processor

Processor
FSB

Completer:
-Step 2: Root Complex (completer)
receives MRd
-Step 3: Root Complex returns
Completion with data (CplD)
CplD

Root Complex

DDR
SDRAM

MRd
Switch A

Switch C

CplD
MRd
Switch B

Endpoint

PCI-SIG Developers Conference

Endpoint

MRd

CplD
Endpoint

Endpoint

Endpoint

Requester:
-Step 1: Endpoint (requester)
initiates Memory Read Request (MRd)
-Step 4: Endpoint receives CplD

Copyright 2007, PCI-SIG, All Rights Reserved

15

Peer-to-Peer Transaction
Processor

Processor
FSB

Root Complex
MRd

CplD
Switch A

DDR
SDRAM
CplD

MRd
Switch C

CplD

Switch B

Endpoint

PCI-SIG Developers Conference

Endpoint

Endpoint

MRd

CplD
Endpoint

CplD

MRd

MRd

Endpoint

Completer:
-Step 2: Endpoint (completer)
receives MRd
-Step 3: Endpoint returns
Completion with data (CplD)

Requester:
-Step 1: Endpoint (requester)
initiates Memory Read Request (MRd)
-Step 4: Endpoint receives CplD
Copyright 2007, PCI-SIG, All Rights Reserved

16

PCI Express Device Layers


PCI Express Device A

PCI Express Device B

Device Core

Device Core

PCI Express Core

PCI Express Core

Logic Interface

Logic Interface

TX

RX

TX

RX

Transaction Layer

Transaction Layer

Data Link Layer

Data Link Layer

Physical Layer

Physical Layer
Link

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

17

TLP Origin and Destination


PCI Express Device A

PCI Express Device B

Device Core

Device Core

PCI Express Core

PCI Express Core

Logic Interface

Logic Interface

TX
TLP

RX

TX

Transaction Layer

RX

Transaction Layer

TLP
Received

Transmitted

Data Link Layer

Data Link Layer

Physical Layer

Physical Layer
Link

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

18

TLP Structure
Information in core section of TLP comes
from Software Layer / Device Core
Bit transmit direction

Start Sequence Header


1B

2B

3-4 DW

Data Payload
0-1024 DW

ECRC LCRC End


1DW

1DW

1B

Created by Transaction Layer

Appended by Data Link Layer

Appended by Physical Layer


PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

19

DLLP Origin and Destination


PCI Express Device A

PCI Express Device B

Device Core

Device Core

PCI Express Core

PCI Express Core

Logic Interface

Logic Interface

TX

DLLP

RX

TX

RX

Transaction Layer

Transaction Layer

Data Link Layer

Data Link Layer

Transmitted

DLLP
Received

Physical Layer

Physical Layer
Link

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

20

DLLP Structure
Bit transmit direction

Start

DLLP

CRC

End

1B

4B

2B

1B

Data Link Layer

Appended by Physical Layer

PCI-SIG Developers Conference

ACK / NAK Packets


Flow Control Packets
Power Management Packets
Vendor Defined Packets

Copyright 2007, PCI-SIG, All Rights Reserved

21

Ordered-Set Origin and Destination


PCI Express Device A

PCI Express Device B

Device Core

Device Core

PCI Express Core

PCI Express Core

Logic Interface

Logic Interface

TX

Ordered-Set
Transmitted

RX

TX

RX

Transaction Layer

Transaction Layer

Data Link Layer

Data Link Layer

Physical Layer

Physical Layer

Ordered-Set
Received

Link

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

22

Ordered-Set Structure
COM

Identifier Identifier

Identifier

Training Sequence One (TS1)


9 16 character set: 1 COM, 15 TS1 data characters

Training Sequence Two (TS2)


9 16 character set: 1 COM, 15 TS2 data characters

SKIP
9 4 character set: 1 COM followed by 3 SKP identifiers

Electrical Idle (IDLE)


9 4 characters: 1 COM followed by 3 IDL identifiers

Fast Training Sequence (FTS)


9 4 characters: 1 COM followed by 3 FTS identifiers
PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

23

Quality of Service
Processor
PCI Express
GFX
GFX

Processor

Root Complex

DDR
SDRAM

Endpoint
InfiniBand
Switch
Out-of-Box

InfiniBand

Switch

10Gb
Ethernet

Fiber
Channel

Switch

Endpoint

Endpoint

RAID Disk array


Add-In

Switch

10Gb
Ethernet

PCI Express
to-PCI

Endpoint

Endpoint

SCSI
Endpoint

Slot

PCI Express
Link
PCI-SIG Developers Conference

Video
Camera

PCI
SCSI
Endpoint

S
IO

IEEE
1394

Slots

COM1
COM2

Copyright 2007, PCI-SIG, All Rights Reserved

24

Traffic Classes and Virtual Channels


Quality of Service (QoS) policy through Virtual
Channel and Traffic Class tags

Receiver Device B

TC[2:0]
maps to
VC0 VC0

TC[7:3]
maps to VC1
VC1

VC0

VC1

TC/VC Mapping

Buffers

Link
Arbitration

TC[7:0]

TC/VC Mapping

Transmitter Device A

VC0

Buffers TC[7:0]
VC1

One physical Link,


multiple virtual paths

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

25

Port Arbitration
and VC Arbitration
Link

TC[7:3] to VC1

VC0
VC1

Link
TC[2:0] to VC0

VC0

TC[7:3] to VC1

VC1

PCI-SIG Developers Conference

g
in 0
p
t
ap or
M sP
C s
/V gre
C
T rE
fo

T
f C
o
r E /VC
g M
r
es ap
s pi
P n
o
rt g
0

TC[2:0] to VC0

Link
VC0
Port
Arb
VC1
Port
Arb

VC0
VC
Arb
VC1

VC0
0

VC1

Copyright 2007, PCI-SIG, All Rights Reserved

26

PCI Express Flow Control


Credit-based flow control is point-to-point
based, not end-to-end
Buffer space
available
TLP

Transmitter

VC Buffer

Receiver

Flow Control DLLP (FCx)


Receiver sends Flow Control Packets (FCP) which are a type of DLLP (Data Link Layer Packet)
to provide the transmitter with credits so that it can transmit packets to the receiver

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

27

ACK/NAK Protocol Overview


Transmit

Receiver

Device A

Device B

From
Transaction Layer
Tx

To
Transaction Layer
Rx

Data Link Layer


TLP
Sequence
TLP

LCRC

Replay
Buffer

Data Link Layer


DLLP

DLLP

ACK /
NAK

ACK /
NAK

TLP
Sequence
TLP

De-mux

De-mux

Mux

Tx

LCRC

Mux

Rx

Tx

Error
Check

Rx

DLLP
ACK /
NAK

Link
TLP
Sequence
TLP
PCI-SIG Developers Conference

LCRC

Copyright 2007, PCI-SIG, All Rights Reserved

28

ACK/NAK Protocol: Point-to-Point


1a. Request

2a. Request

4b. ACK

3b. ACK
Switch

Requester

Completer

1b. ACK

2b. ACK

4a. Completion

3a. Completion

ACK returned for good reception of Request or Completion


NAK returned for error reception of Request or Completion

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

29

Interrupt Model: Three Methods

PCI Express supports three interrupt reporting


mechanisms:
1.

2.

Message Signaled Interrupts (MSI)

Legacy endpoints are required to support MSI (or MSI-X) with 32- or
64-bit MSI capability register implementation

Native PCI Express endpoints are required to support MSI with 64-bit MSI
capability register implementation

Message Signaled Interrupts - X (MSI-X)

3.

Legacy and native endpoints are required to support MSI-X (or MSI)
and implement the associated MSI-X capability register

INTx Emulation.

Native and Legacy endpoints are required to support Legacy INTx


Emulation
PCI Express defines in-band messages which emulate the four
physical interrupt signals (INTA-INTD) routed between PCI devices
and the system interrupt controller
Forwarding support required by switches

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

30

Native and Legacy Interrupts

PCIe -

PCIe

PCI-SIG Developers Conference

PCIe

Copyright 2007, PCI-SIG, All Rights Reserved

31

PCI Express Error Handling


All PCI Express devices are required to support some
combination of:
9 Existing software written for generic PCI error handling, and which
takes advantage of the fact that PCI Express has mapped many of
its error conditions to existing PCI error handling mechanisms.
9 Additional PCI Express-specific reporting mechanisms

Errors are classified as correctable and uncorrectable.


Uncorrectable errors are further divided into:
9 Fatal uncorrectable errors
9 Non-fatal uncorrectable errors.

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

32

Correctable Errors
Errors classified as correctable, degrade system
performance, but recovery can occur with no loss of
information
9 Hardware is responsible for recovery from a correctable error and
no software intervention is required.

Even though hardware handles the correction, logging the


frequency of correctable errors may be useful if software is
monitoring link operations.
An example of a correctable error is the detection of a link
CRC (LCRC) error when a TLP is sent, resulting in a Data
Link Layer retry event.

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

33

Uncorrectable Errors

Errors classified as uncorrectable impair the functionality


of the interface and there is no specification mechanism
to correct these errors
The two subgroups are fatal and non-fatal
1.

Fatal Uncorrectable Errors: Errors which render the link


unreliable

2.

First-level strategy for recovery may involve a link reset by the


system
Handling of fatal errors is platform-specific

Non-Fatal Uncorrectable Errors: Uncorrectable errors


associated with a particular transaction, while the link itself is
reliable

Software may limit recovery strategy to the device(s) involved


Transactions between other devices are not affected

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

34

Baseline Error Reporting

Enabling/disabling error reporting


Providing error status
Providing error status for Link Training
Initiating Link Re-training

Registers provide control and status for


9 Correctable errors
9 Non-fatal uncorrectable errors
9 Fatal uncorrectable errors
9 Unsupported request errors
PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

35

Advanced Error Reporting


Finer granularity in defining error type
Ability to define severity of uncorrectable errors
9 Either send ERR_FATAL or ERR_NONFATAL
message for a given error

Support for error logging of error type and TLP


header related to error
Ability to mask reporting of errors
Enable/disable root reporting of errors
Identify source of errors

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

36

PCI Express
Configuration Space

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

37

Summary of Changes for 2.0


Higher speed (5.0 GT/s), supported by:
9 Selectable de-emphasis levels
9 Selectable transmitter voltage range

Dynamic speed and link width changes


9 Power savings, higher bandwidth, reliability

Virtualization support
9 Access Control Services

Other New Features


9 Completion timeout control
9 Function Level Reset
9 Modified Compliance Pattern for testing

PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

38

Thank you for attending the


PCI-SIG Developers Conference 2007.

For more information please go to


www.pcisig.com
PCI-SIG Developers Conference

Copyright 2007, PCI-SIG, All Rights Reserved

39

PCI Express Basics


Ravi Budruk
Senior Staff Engineer and Partner
MindShare, Inc.

Copyright 2007, PCI-SIG, All Rights Reserved

40

You might also like