
PCIe

Transaction Layer
Outline
PCIe Basic
◦ Topology
◦ Configuration Header
◦ Enumeration

Transaction Layer
◦ Transaction Layer Packet(TLP)
◦ TLP Header
◦ TLP Type
◦ Flow control
◦ Virtual channel / Traffic class
◦ Ordering
PCIe Basic
◦ Topology
◦ Configuration space
◦ Enumeration
Topology
PCIe interfaces are connected by a Link.
◦ Link : A point-to-point connection. Exactly two interfaces
can be connected by a Link, and the topology contains no loops.

Component:
◦ Root Complex
◦ Switch
◦ Bridge
◦ EndPoint
Root Complex
Interface between the CPU, the PCIe bus, and memory.

The RC acts on behalf of the CPU to communicate
with the rest of the system.
Switch/Bridge
Switches allow more devices to be attached to a
single PCIe Port.

PCIe-PCI Bridges provide an interface to other
buses, such as PCI or PCI-X.
Endpoint
Endpoints act as initiators and Completers of
transactions on the bus.

Requester/Completer
◦ Requester : initiates requests
◦ Completer: Services requests
Configuration Header-1
Devices and bridges contain registers that
store information about the device and its
status.

This configuration space begins with the
Header.
◦ Type 0 : EP
◦ Type 1 : Switch / Bridge
Configuration Header-2
Configuration software allocates memory
space for each enumerated device.

Software can interact with a device by accessing
those memory locations.

Each port (upstream/downstream) has its own
configuration header.
Configuration Header-3
• PCIe extends the configuration space of each function to
4K Bytes (the space for PCI is 256 Bytes). The added region is
called the “Extended Configuration Space”.

• With the maximum number of BDFs, PCIe configuration space
occupies at most 256MB of memory space.
• 4KB * 256 (Bus) * 32 (Device) * 8 (Function) = 256MB
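The 256MB figure can be checked with a few lines of arithmetic. The sketch below assumes the usual flat layout in which bus, device, function, and register offset are concatenated into one offset inside the window; `ecam_offset` is a hypothetical helper name, not something from these slides.

```python
CFG_SPACE_BYTES = 4096              # extended configuration space per function
MAX_BUS, MAX_DEV, MAX_FUNC = 256, 32, 8

def ecam_offset(bus, dev, func, reg=0):
    """Byte offset of a configuration register inside the 256MB window."""
    if not (0 <= bus < MAX_BUS and 0 <= dev < MAX_DEV and 0 <= func < MAX_FUNC):
        raise ValueError("BDF out of range")
    if not 0 <= reg < CFG_SPACE_BYTES:
        raise ValueError("register offset out of range")
    # 8 bits of bus, 5 bits of device, 3 bits of function, 12 bits of register
    return (bus << 20) | (dev << 15) | (func << 12) | reg

total_bytes = CFG_SPACE_BYTES * MAX_BUS * MAX_DEV * MAX_FUNC
```

The last register of the last function lands on the last byte of the window, which is why the total is exactly 256MB.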
Enumeration
Enumeration software searches the hierarchy for EPs, switches and
bridges, and gives each one an ID (BDF).

Bus number (Maximum: 256)
Device number (Maximum: 32)
Function number (Maximum: 8)

Each bridge records three bus numbers:
◦ Pri = Primary Bus Number
◦ Sec = Secondary Bus Number
◦ Sub = Subordinate Bus Number
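How Pri, Sec, and Sub end up with their values is easiest to see on a toy hierarchy. The sketch below is only an illustration under simplifying assumptions: the topology is given as nested dictionaries (hypothetical structure), every device is single-function, and discovery by configuration reads is skipped. It walks the tree depth-first, the usual order for enumeration.

```python
def enumerate_bus(nodes, bus):
    """Assign BDFs to the nodes on `bus`; return the highest bus number used."""
    highest = bus
    for dev_num, node in enumerate(nodes):
        node["bdf"] = (bus, dev_num, 0)
        if "children" in node:                 # the node is a bridge
            node["pri"] = bus                  # bus the bridge sits on
            node["sec"] = highest + 1          # bus directly below it
            highest = enumerate_bus(node["children"], node["sec"])
            node["sub"] = highest              # last bus anywhere below it
    return highest

# Toy hierarchy: two root ports, the second one leading to a switch.
ep_a, ep_b, ep_c = {"name": "EP A"}, {"name": "EP B"}, {"name": "EP C"}
dp0 = {"name": "switch downstream port 0", "children": [ep_b]}
dp1 = {"name": "switch downstream port 1", "children": [ep_c]}
sw_up = {"name": "switch upstream port", "children": [dp0, dp1]}
rp0 = {"name": "root port 0", "children": [ep_a]}
rp1 = {"name": "root port 1", "children": [sw_up]}

last_bus = enumerate_bus([rp0, rp1], 0)
```

After the walk, root port 1 has Pri=0, Sec=2, Sub=5, so it forwards any ID-routed TLP whose bus number falls in 2..5.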
Transaction Layer
Transaction Layer
◦ Transaction Layer Packet(TLP)
◦ TLP Header
◦ TLP Type
◦ TLP Routing
◦ Flow control
◦ Virtual channel / Traffic class
◦ Ordering
Layering Overview
Transaction Layer
◦ In response to requests from the Software Layer, generates
outbound packets.

Data Link Layer
◦ Is responsible for Link management and performs three
major functions:
◦ TLP error correction
◦ flow control
◦ Link power management.

Physical Layer
◦ The spec divides the Physical Layer discussion into two
portions:
◦ logical part : 8b/10b encoding, scrambling, serializing, etc.
◦ electrical part : driving the differential signal
Transaction Layer Packet(TLP)-1
Transaction Layer Packet(TLP)-2
Types of requests
◦ Indicates the type of request from the Requester.
Ex. An endpoint that wants to write memory issues
a Memory Write request.

Routing
◦ Indicates the target of the request, i.e. where the TLP
should be delivered.

Ordering
◦ When multiple requests reach a switch, decides which
one should pass first.
TLP Header
Format (Fmt)
Type
Traffic Class(TC)
2 or 3DW Could be changed
Attribute(Attr)
Lightweight Notification(LN)
TLP Hint(TH)
TLP Digest(TD)
Poisoned Data(EP)
Address Type(AT)
Length
TLP Header – Format & Type
The Fmt and Type fields define the basic form of the TLP.
A TLP header is 3DW or 4DW, and may be preceded by a prefix.

Fmt[2:0]:
◦ Fmt[2] : If set, this is a TLP Prefix.
◦ Fmt[1] : If set, the TLP carries a data payload.
◦ Fmt[0] : If set, the header is 4DW; otherwise it is 3DW.

Type[4:0]
◦ Encodes the type of TLP chosen by the initiator. Ex. memory read,
configuration write, etc.
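As a sketch of how byte 0 of the header splits into these two fields: it assumes the encoding in the PCIe Base Specification, where Fmt occupies the top three bits and Type the low five, Fmt[0] selects a 4DW header, and Fmt[1] marks a data payload. The function names are made up for illustration.

```python
FMT_NAMES = {
    0b000: "3DW header, no data",
    0b001: "4DW header, no data",
    0b010: "3DW header, with data",
    0b011: "4DW header, with data",
    0b100: "TLP prefix",
}

def split_byte0(byte0):
    """Return (Fmt[2:0], Type[4:0]) from the first header byte."""
    fmt = (byte0 >> 5) & 0x7
    tlp_type = byte0 & 0x1F
    return fmt, tlp_type

def describe_fmt(fmt):
    """Human-readable meaning of a Fmt value; unknown values are reserved."""
    return FMT_NAMES.get(fmt, "reserved")
```

For example, a first byte of 0x40 splits into Fmt=010b and Type=00000b, a memory write with a 3DW header.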
TLP Types-1
TLP types fall roughly into 5 categories:
◦ IO Read/Write
◦ Read/write data from/to a Legacy EP.
◦ Memory Read/Write
◦ Read/write data from/to main memory.
◦ Configuration Read/Write
◦ Read/write the configuration registers of EPs.
◦ Type0 for EPs, Type1 for bridges.
◦ Message
◦ The RC uses Message TLPs to control or read the status of an EP/Switch.
◦ This TLP type takes the place of the sideband signals of legacy buses.
◦ Completion
◦ Indicates that the TLP is servicing a Requester's TLP.

TLP names by category:
◦ Memory : MRd, MWr, MRdLk, AtomicOps
◦ IO : IORd, IOWr
◦ Configuration Type0 : CfgRd0, CfgWr0
◦ Configuration Type1 : CfgRd1, CfgWr1
◦ Message : Msg, MsgD
◦ Completion : Cpl, CplD
TLP Types-2
[Figure: TLP type encoding table. Callouts: 4DW or 3DW header, with or without data, AtomicOps, Message type.]
Posted & Non-Posted Requests
Requests are divided into Posted and Non-posted.

Posted requests
◦ The request doesn't need a response (Completion).
◦ Memory Write, Message request.

Non-posted requests
◦ The request needs a response (Completion).
◦ IO Read/Write, Memory Read, Configuration Read/Write.

Request Type : Posted / Non-posted
◦ Memory Write : Posted
◦ Message : Posted
◦ Memory Read, Memory Read Lock, AtomicOps : Non-posted
◦ IO Read, IO Write : Non-posted
◦ Configuration Read, Configuration Write : Non-posted
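The posted / non-posted split can be derived mechanically from Fmt and Type. The sketch below is a partial decoder under stated assumptions: it uses the Type encodings of the PCIe Base Specification for the common TLPs only (AtomicOps and locked completions are left out), and `classify` is a hypothetical name.

```python
# (has_data, Type[4:0]) -> TLP name. Header size (Fmt[0]) does not change the kind.
TLP_KINDS = {
    (False, 0b00000): "MRd",    (True, 0b00000): "MWr",
    (False, 0b00001): "MRdLk",
    (False, 0b00010): "IORd",   (True, 0b00010): "IOWr",
    (False, 0b00100): "CfgRd0", (True, 0b00100): "CfgWr0",
    (False, 0b00101): "CfgRd1", (True, 0b00101): "CfgWr1",
    (False, 0b01010): "Cpl",    (True, 0b01010): "CplD",
}
POSTED = {"MWr", "Msg", "MsgD"}
COMPLETIONS = {"Cpl", "CplD"}

def classify(fmt, tlp_type):
    """Return (name, category) for a Fmt/Type pair."""
    has_data = bool(fmt & 0b010)
    if tlp_type & 0b11000 == 0b10000:      # Type[4:3] = 10b marks a Message
        name = "MsgD" if has_data else "Msg"
    else:
        name = TLP_KINDS.get((has_data, tlp_type), "unknown")
    if name == "unknown":
        return name, "unknown"
    if name in COMPLETIONS:
        return name, "completion"
    return name, "posted" if name in POSTED else "non-posted"
```

Only Memory Write and Messages come out as posted; every other request waits for a Completion.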
TLP Routing
Address routing : The destination of TLP is targeted by address.
◦ Memory request
◦ IO request
◦ Message

ID routing : The destination of TLP is targeted by ID.


◦ Configuration request
◦ Completion
◦ Message

Implicit routing
◦ Message
Address Routing
Address routing is used for
◦ IO
◦ Memory

The address is either 32 bits or
64 bits (for addresses above 4GB).
ID Routing
ID routing is used for
◦ Configuration
◦ Completion

The RC/Switch forwards the TLP to the
proper target by its BDF.
Implicit routing
Implicit routing is used for
◦ Message

Message routing subfield Type[2:0]
◦ 000b : Route to RC
◦ 001b : Route by address
◦ 010b : Route by ID
◦ 011b : Broadcast downstream
◦ 100b : Terminate at receiver
◦ 101b : Gather & route to RC
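A small lookup makes the subfield concrete. This sketch assumes the full Type field is 5 bits wide with the top two bits equal to 10b for Messages, so the routing code is simply the low three bits; `message_routing` is a made-up helper.

```python
MSG_ROUTING = {
    0b000: "route to RC",
    0b001: "route by address",
    0b010: "route by ID",
    0b011: "broadcast downstream",
    0b100: "terminate at receiver",
    0b101: "gather & route to RC",
}

def message_routing(tlp_type):
    """Decode the routing subfield of a Message TLP's Type[4:0] value."""
    if (tlp_type >> 3) != 0b10:
        raise ValueError("not a Message TLP")
    return MSG_ROUTING.get(tlp_type & 0b111, "reserved")
```

Routing codes 000b, 011b, 100b, and 101b need no address or ID in the header, which is why they count as implicit routing.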
TLP Header – other field
Length : Payload size (in DW)
Attr[2:1] : Related to ordering
TC[2:0] : Similar to a priority; a larger value means
higher priority.
TD : If set, the TLP has an ECRC.
EP : If set, the TLP is poisoned.
AT : Address type
LN : Lightweight Notification
T8, T9 : Tag extension bits
Attr[0] : No Snoop
TH : TLP Processing Hint
Traffic Class
During initialization, the device driver and system
software decide which TC values to use for each type of
packet.

The TC value defaults to zero so packets that don’t
need priority service won’t accidentally interfere
with those that do.

Traffic Classes define eight priorities, specified
by a 3-bit TC field within each TLP header (with
ascending priority; TC 0-7).
TLP Hint (TH)
Adding hints about how the system should handle TLPs targeting memory space can improve latency and
traffic congestion.

With TH set
Attribute Field
Attr[2] : ID-Based Ordering
Attr[1] : Relaxed Ordering
Attr[0] : No Snoop
No Snoop : The memory transaction does not
need to be kept coherent with the caches (no snoop is required).
Lightweight Notification(1/2)
The LN protocol provides a notification service for when
cachelines of interest are updated.

LN Requester (LNR) : a client subsystem in an
Endpoint that sends LN Read/Write Requests and
receives LN Messages.

LN Completer (LNC) : a service subsystem in the host
that receives LN Read/Write Requests, and sends LN
Messages when registered cachelines are updated.
Lightweight Notification(2/2)
LN Read Example
1. An LNR sends an LN Read to a Memory Space
range that has an associated LNC,
requesting a copy of a cacheline.
2. The LNC returns the requested line to the LNR
and records that the LNR has requested
notification when that line is updated.
3. Later, the LNC notifies the LNR via an LN
Message when some entity updates the line, so
the LNR can take appropriate action.
TLP Header – Length
One TLP can carry at most 4KB of
payload.

00 0000 0000b represents 1024DW.

To represent a request that transfers no data,
the Length field works together with the DW
BE field: Length is 1DW and the First DW BE is 0000b.
TLP Header – DW BE
DW Byte Enable is an 8-bit field: 4 bits for the
first DW and 4 bits for the last DW.

Because PCIe memory accesses are DW-aligned,
the BE bits indicate which bytes are valid
at the head and tail of the data stream.

Ex. If Byte 0 and Byte 1 of the first DW are not
accessed, 1st DW BE is 1100b. The low 00b means
Byte 0 and Byte 1 are not accessed.
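The byte enables follow directly from the byte address and byte count of a transfer. The sketch below assumes a contiguous run of bytes and uses a hypothetical helper name; bit n of each BE value corresponds to byte n of that DW.

```python
def dw_byte_enables(addr, nbytes):
    """Return (length_dw, first_be, last_be) for a contiguous byte range."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    start = addr % 4                       # first valid byte in the first DW
    end = (addr + nbytes - 1) % 4          # last valid byte in the last DW
    length_dw = (start + nbytes + 3) // 4  # DWs touched by the range
    first_be = (0xF << start) & 0xF        # bytes start..3 enabled
    last_be = 0xF >> (3 - end)             # bytes 0..end enabled
    if length_dw == 1:
        # A single DW carries both limits in First DW BE; Last DW BE is 0000b.
        first_be &= last_be
        last_be = 0
    return length_dw, first_be, last_be
```

Calling `dw_byte_enables(0x1002, 2)` reproduces the slide's example: one DW with First DW BE = 1100b.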
Address Type (AT)
The Address Type (AT) field indicates the type of address
present in the request header.

00b : Address is untranslated.
01b : Address needs to be translated into a physical address.
10b : Address has been translated into a physical address.
Tag
The Tag is generated by the Requester, and it must be
unique among all outstanding Requests from that Requester
that require a Completion.

The Tag and the Requester ID together form the Transaction ID.
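One way to picture the uniqueness rule is a pool of free tags that a Requester draws from and returns to when the Completion arrives. This is a minimal sketch, not how any particular device implements it; `TagPool` and `requester_id` are hypothetical names, and the default of 32 tags corresponds to a 5-bit Tag.

```python
def requester_id(bus, dev, func):
    """Pack a BDF into the 16-bit Requester ID (8 + 5 + 3 bits)."""
    return (bus << 8) | (dev << 3) | func

class TagPool:
    """Hands out tags so that no two outstanding requests share one."""

    def __init__(self, num_tags=32):
        self.free = list(range(num_tags))
        self.outstanding = {}              # tag -> request awaiting completion

    def allocate(self, request):
        """Return a free tag, or None if the requester has to wait."""
        if not self.free:
            return None
        tag = self.free.pop(0)
        self.outstanding[tag] = request
        return tag

    def complete(self, tag):
        """Match a Completion to its request and recycle the tag."""
        request = self.outstanding.pop(tag)
        self.free.append(tag)
        return request
```

The pair `(requester_id(...), tag)` plays the role of the Transaction ID that a Completion carries back.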
IO request
IO Requests are made for Legacy devices.
TLP Type field 00010b = IO request.
Fmt[1] indicates whether the TLP carries data.
An IO request header is always 3DW.
An IO request's TC is always 000b.
Length for an IO request is always 1DW.
Last DW BE must be all 0.
Memory Request
Type can be:
◦ 00000b : Memory Read/Write
◦ 00001b : Memory Read Locked

Length indicates the data size of this
transfer.
◦ 10’h1 = 1DW
◦ 10’h2 = 2DW
◦ 10’h3ff = 1023DW
◦ 10’h0 = 1024DW(4KB)

The address is DW-aligned.
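The wrap-around at 1024DW is the only subtle part of the Length field. A minimal sketch of the encoding, with hypothetical function names:

```python
def encode_length(dw_count):
    """Encode a payload size of 1..1024 DW into the 10-bit Length field."""
    if not 1 <= dw_count <= 1024:
        raise ValueError("a TLP carries between 1 and 1024 DW")
    return dw_count & 0x3FF                # 1024 wraps to 0

def decode_length(field):
    """Decode the 10-bit Length field back to a DW count."""
    if not 0 <= field <= 0x3FF:
        raise ValueError("Length is a 10-bit field")
    return field or 1024                   # 0 stands for 1024 DW
```

A receiver therefore never sees a Length that means "zero DW"; a request that moves no data is expressed through the byte enables.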


Configuration Requests
Only the RC can initiate a Configuration Request.
Configuration Requests are routed by ID.
A bridge converts a Type 1 request to a Type 0 request when
the target is on its secondary bus.
TC must be 000b.
By default only Tag[4:0] is used (32 outstanding transactions).
If the Extended Tag bit is set, 256 are supported.
Ext Reg Number & Register Number fields:
Used for addressing the configuration space.
Completions
A Completion responds to a non-posted Request and is a 3DW TLP.
The Completion copies these fields from the Request into
its own header:
◦ Requester ID
◦ Tag
◦ TC
◦ Attribute bits

The Completion Status field defines 4 statuses:
◦ 000b : Successful Completion (SC)
◦ 001b : Unsupported Request (UR)
◦ 010b : Configuration Request Retry Status (CRS)
◦ 100b : Completer Abort (CA)

Byte Count : Bytes remaining to satisfy a read request.


Message Requests
Message Requests replace the sideband
signals used in PCI.

All Message Requests use a 4DW header.

Message routing subfield Type[2:0]

◦ 000b : Route to RC
◦ 001b : Route by address
◦ 010b : Route by ID
◦ 011b : Broadcast downstream
◦ 100b : terminate at receiver
◦ 101b : Gather & route to RC
Message Code
This spec defines the following groups of Messages:
◦ INTx Interrupt Signaling
◦ Power Management
◦ Error Signaling
◦ Locked Transaction Support
◦ Slot Power Limit Support
◦ Vendor-Defined Messages
◦ Latency Tolerance Reporting (LTR) Messages
◦ Optimized Buffer Flush/Fill (OBFF) Messages
◦ Device Readiness Status (DRS) Messages
◦ Function Readiness Status (FRS) Messages
◦ Precision Time Measurement (PTM) Messages
Flow Control-1
Virtual Channels are hardware buffers that act
as queues for outgoing packets.

Flow Control checks that the buffer on the other side of
the Link is able to accept the TLP.

Flow Control mechanisms can improve
transmission efficiency if multiple Virtual
Channels (VCs) are used.
Flow Control-2
Each VC Flow Control buffer at the receiver is managed
per category.
◦ There are 6 types of buffer for each VC: a Header buffer and a Data buffer for each category.

Three categories :
◦ Posted Transactions
◦ Non-Posted Transactions
◦ Completions

The credit is the unit of buffer space for a VC.
◦ Different TLP types have different credit sizes.
◦ Ex. 1 unit for a posted request header is 5DW, but for a completion
header it is 4DW.
Minimum Flow Control
Posted Request header (PH)
◦ 1 unit. Credit value = 4DW HDR + Digest = 5DW

Posted Request data (PD)
◦ Max_Payload_Size / 16 bytes (one credit = 16 bytes)
◦ Ex. 1024 bytes / 16 = 64 units

Non-Posted Request header (NPH)
◦ 1 unit. Credit value = 4DW HDR + Digest = 5DW

Non-Posted Request data (NPD)
◦ 1 unit. Credit value = 4DW

Completion header (CPLH)
◦ 1 unit. Credit value = 4DW

Completion data (CPLD)
◦ Max_Payload_Size / 16 bytes (one credit = 16 bytes)
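The credit arithmetic above reduces to a rounding-up division. A minimal sketch with hypothetical helper names, assuming one data credit covers 16 bytes and every TLP costs exactly one header credit of its category:

```python
CREDIT_BYTES = 16    # one data credit covers 4DW = 16 bytes

def data_credits(payload_bytes):
    """Data credits consumed by a payload, rounded up to whole credits."""
    return (payload_bytes + CREDIT_BYTES - 1) // CREDIT_BYTES

def credits_for_tlp(payload_bytes=0):
    """Return (header_credits, data_credits) consumed by one TLP."""
    return 1, data_credits(payload_bytes)

# Minimum posted-data advertisement for Max_Payload_Size = 1024 bytes
min_posted_data = data_credits(1024)
```

A 64-byte memory write, for instance, consumes one PH credit and four PD credits.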
Flow Control-3
Flow Control is a function of the Transaction Layer,
carried out in cooperation with the Data Link Layer.
◦ The Data Link and Physical Layers process the DLLPs.

Flow Control uses DLLPs (Data Link Layer Packets) to
communicate with the other side. The DLLP
sent by the receiver carries its buffer space (credit) info.

Responsibility
◦ Devices Report Available Buffer Space
◦ Receivers Register Credits
◦ Transmitters Check Credits
Data Link Layer Packet(DLLP)
Byte0[5:4]
◦ 00b : Posted
◦ 01b : Non-posted
◦ 10b : Completion

VC ID
◦ Indicates which VC is being updated.

HdrFC field
◦ An 8-bit field that supports at most 127 units.

DataFC field
◦ A 12-bit field that supports at most 2047 units.
Flow Control-4
Transmitter Elements
◦ Transactions Pending Buffer
◦ Credits Consumed counter
◦ Credit Limit counter
◦ Flow Control Gating Logic

Receiver Elements
◦ Flow Control Buffer
◦ Credit Allocated
◦ Credits Received counter
Counters Roll Over
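Because the counters are fixed-width and roll over, the transmitter's credit check is done in modular arithmetic. The sketch below follows the gating rule as I understand it from the PCIe Base Specification; the function name and argument names are made up, and `field_bits` is 8 for header credits and 12 for data credits.

```python
def can_transmit(credit_limit, credits_consumed, needed, field_bits):
    """True if a TLP needing `needed` credits may be sent.

    All counters are compared modulo 2**field_bits, so the check stays
    valid after either counter has rolled over.
    """
    modulus = 1 << field_bits
    remaining = (credit_limit - (credits_consumed + needed)) % modulus
    return remaining <= modulus // 2
```

The half-range comparison is also why the advertised maximums are 127 and 2047 units: more than half the counter range outstanding would make "credits left" and "credits overdrawn" indistinguishable.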
Virtual Channel
VCs are hardware buffers that act as queues for
outgoing packets.

Each port must include the default VC0, but may
have as many as eight (VC0 to VC7).

A higher VC index has higher priority.

The VC configuration registers are called the Virtual
Channel Capability Block.
Virtual Channel Capability Block
What information does it include?
◦ VC count
◦ VC ID
◦ TC/VC Mapping
◦ VC Arbitration Capability
◦ Port Arbitration Capability
◦ Arbitration table
TC/VC Mapping
Configuration software sets the TC/VC Map during
initialization.

Configuration software assigns an ID to each VC.

Configuration software determines the number of VCs to
be used.

Rules regarding the TC/VC mapping:
◦ TC0 is always mapped to VC0; this mapping is hardwired.
Other TCs may be mapped to any VC.
◦ A TC may not be mapped to more than one VC.
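The two mapping rules are easy to check in software before a map is written to hardware. This is a sketch only: the map is modelled as a dictionary from VC ID to a set of TC numbers, which is an assumption of the example and not the register layout.

```python
def validate_tc_vc_map(vc_to_tcs):
    """Return a list of rule violations for a {vc_id: {tc, ...}} map."""
    errors = []
    if 0 not in vc_to_tcs.get(0, set()):
        errors.append("TC0 must map to VC0")
    seen = {}                                   # tc -> first VC that claimed it
    for vc, tcs in sorted(vc_to_tcs.items()):
        for tc in sorted(tcs):
            if not 0 <= tc <= 7:
                errors.append(f"TC{tc} is not a valid traffic class")
            elif tc in seen:
                errors.append(f"TC{tc} mapped to both VC{seen[tc]} and VC{vc}")
            else:
                seen[tc] = vc
    return errors
```

An empty list means the map is acceptable, for example TC0-3 on VC0 and TC4-7 on VC1.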
VC Arbitration
VC arbitration determines the order of packet
transmission among the VCs of a port.

Software can choose among the arbitration policies provided by
the hardware.

The VC capability registers provide three basic VC arbitration schemes:
◦ Strict Priority Arbitration
◦ Group Arbitration
◦ Hardware fixed Arbitration
Strict Priority VC Arbitration
The default priority scheme is based on the inherent
priority of VC IDs (VC0 = lowest priority and VC7 = highest
priority).

Strict priority arbitration enables minimal latency for
high-priority transactions.

The mechanism is automatic and requires no
configuration.

Strict priority has the potential to starve low-priority
channels for bandwidth.
Group VC Arbitration-1
Port VC Capability Register 1 selects the boundary
that separates the Low-Priority and High-Priority VC groups.

The High-Priority group uses Strict Priority, and for the Low-Priority
group software can choose the arbitration scheme.
Group VC Arbitration-2
Selection of the Low-Priority Arbitration Scheme

◦ Hardware Fixed : a hardware-based method that requires no
additional software setup.

◦ Weighted Round Robin : Software loads a table into the
register field. The VC arbiter repeatedly scans all table
entries in a sequential fashion and sends packets from the VC
specified in each table entry.
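The table scan can be sketched in a few lines. This is a simplified model, not the hardware algorithm: each table entry (phase) names a VC, one packet is sent per phase, and a phase whose VC has nothing queued is skipped. `wrr_schedule` and the queue representation are assumptions of the example.

```python
from collections import deque

def wrr_schedule(table, queues, rounds=1):
    """Scan the arbitration table `rounds` times; return packets in send order.

    table  : list of VC IDs, one per phase; a VC listed more often gets
             a larger share of the bandwidth (its weight).
    queues : {vc_id: deque of pending packets}
    """
    sent = []
    for _ in range(rounds):
        for vc in table:
            queue = queues.get(vc)
            if queue:                      # skip phases whose VC is idle
                sent.append((vc, queue.popleft()))
    return sent

table = [0, 1, 0, 2]                       # VC0 gets two of every four phases
queues = {0: deque(["a", "b", "c"]), 1: deque(["x"]), 2: deque()}
order = wrr_schedule(table, queues, rounds=2)
```

Listing VC0 twice in a four-phase table gives it half of the transmit opportunities, which is the "weight" in Weighted Round Robin.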
Group VC Arbitration-3
WRR supports different numbers of phases.
Port Arbitration
For Switch ports and Root Ports, packets from
multiple ingress ports can all target the same VC on the
same outgoing port, so arbitration is needed for
access to that VC.

Port arbitration usually needs software
configuration for each virtual channel supported.
Port Arbitration Policy
Software can set up the Port Arbitration Table. The
table is scanned phase by phase, and each phase specifies the
port number from which the next packet is
received.

WRR Arbitration Mechanisms
◦ Ports are served according to the PAT (Port Arbitration
Table).
◦ If the scanned port has no pending transaction, the arbiter
skips it and advances to the next phase immediately.
Time-Based, Weighted Round Robin (TBWRR)
This mechanism is required for isochronous support.

Rather than immediately advancing to the next phase, the
time-based arbiter waits until the current virtual timeslot
elapses before advancing.

This ensures that transactions are accepted from the
ingress port buffer at regular intervals.

The length of the timeslot is currently 100ns.


Port Arbitration

[Figure: port arbitration is performed separately for VC0 and VC1.]


Transaction Ordering-1
PCI Express ordering rules apply to transactions of the
same Traffic Class (TC).

Transactions in different TCs have no ordering requirement (they are unrelated).

Ordering relationships defined by the PCIe spec are
based on TLP type. TLPs are divided into three
categories:
◦ Posted
◦ Completion
◦ Non-Posted
Transaction Ordering-2
If TLP2 is sent with the proper ordering attribute,
it can be sent without waiting
for TLP1 to finish.
Relaxed Ordering
By default, transactions are required to stay in order as they pass through buffers in bridges.
RO allows switches to reorder transactions to improve performance.
Setting the RO attribute bit (Attr[1]) indicates that software has verified the transaction to be unrelated to other transactions, which
allows it to be reordered ahead of them.

Attr[2] : ID-Based Ordering
Attr[1] : Relaxed Ordering
ID Ordering
Transactions from different EPs have no ordering relationship
with each other.

Software can enable the use of IDO through the
Device Control 2 Register.
Relaxed and ID ordering are applied within the same VC.
Ordering Rules Table

A PCIe-PCI bridge must allow passing to prevent deadlock.

TLPs with the same Transaction ID are not allowed to pass each other.
