Professional Documents
Culture Documents
Compute Express Link™ and CXL™ Consortium are trademarks of the Compute Express Link Consortium.
Usage Models
Usage Models
Form Factors
Ex #1: EDSFF E1.S Ex #2: EDSFF E3.S / E3.L Ex #3: Add-in Card (AIC)
Expected Max. Power Range • 12 ~ 25W • 25W ~ 40W (1T), 40W ~ 70W (2T) • Similar range compared to E3.S/L
Reference: snia.org
CXL Topology
CXL Host Bridge 1 CXL Host Bridge 2 ● CXL 2.0 hierarchy appears like
CXL 2.0 RP appears
CXL1.1 DP
PCIe hierarchy
PCIe RP CXL 2.0 RP as PCIe CPU
RP ○ Legacy PCI SW and CXL SW
sees a RP or DSP with
PCIe
CXL 1.1
CXL 2.0
CXL
Endpoints below
CXL
CXL
hierarchy
hierarchies
○ CXL link/interface errors are
device
PCIe
○ Switch
e
EP
○ Device
CXL 2.0
Device
D0 F0
device
PCIe
CXL
DVSEC
Interleave
CXL Persistent Memory
Devices
PCI and NVDIMM had a coherent byte addressable
baby… with atomics.
NVDIMM NVME
• Byte • PCIe
addressable configurable
• Direct • Hot
mappable swappable
PMEM
• Persistent memory devices rely on System
CXL Generic
Software for provisioning and management Mem Driver
• CXL 2.0 introduces a standard register
interface OS I/F OS Specific i/f
• A generic memory device driver simplifies
PCIe/CXL Bus
software enabling Driver
• Architecture Elements
• Defined as number of discoverable Capabilities CXL 2.0 MEM
CXL 2.0 Mem I/F
• Capabilities includes Device Status and standard Register i/f
mailboxes, accessed via MMIO registers
• Standardized mailbox commands that cover CXL Attached PMEM
errors/health, alerts, partitioning, passphrases etc.
• Allow Vendor specific extensions
Linux Driver Details
Software
Responsibilities
SW Enumerable Components
CEDT ACPI0017 Host
Root Port
USP (Switch)
DSP (Switch)
Endpoint
Endpoint
Linux Drivers
Enumerable
Enumerable - PCI Class code
- CXL capabilities - ACPI discovery
- CXL hierarchy bus mbox
regs
cxl_core cxl_pci
pmem
cxl_port
memdev
Enumerable
cxl_mem cxl_acpi
- LSA
Root Port
USP (Switch)
DSP (Switch)
Endpoint
Endpoint
cxl_acpi
● Probed like a typical ACPI device
○ { "ACPI0017", (unsigned long) &native_acpi0017 },
● Platform specific CXL enumeration
○ Mostly specified in UEFI, and CXL
● ACPI0017 starts enumeration of CXL ports
○ CEDT
○ “Root level” ports (platform)
○ Hostbridges and root ports
cxl_acpi
CEDT ACPI0017 Host
Root Port
USP (Switch)
DSP (Switch)
Endpoint
Endpoint
cxl_port
● Ports are created for all components with an “upstream port”
○ Hostbridge
○ Switch
○ Endpoint
● Enumeration and control and control of decoder resources
○ Provides as a service for other drivers
cxl_port
CEDT ACPI0017 Host
Root Port
USP (Switch)
DSP (Switch)
Endpoint
Endpoint
cxl_mem
● connects a device enumerated with cxl_pci to CXL.mem.
○ Implements device functionality not handled by cxl_pci
● “exports” if device is CXL.mem routed and enabled
cxl_mem
CEDT ACPI0017 Host
Root Port
USP (Switch)
DSP (Switch)
Endpoint
Endpoint
Enumeration Timeline
Rescan root -> endpoint
cxl_acpi cxl_core
cxl_port cxl_port
Runs
asynchronously
cxl_pci
Persistent Volatile
3 Requires CXL.mem
capability
Regions
● Region
○ Interleave set of devices
○ Parameters (IG, HPA, etc)
○ Stored in the Label Storage Area
● Creation
○ Via sysfs ABI
○ Provisioned offline
■ Manufacture time
● Responsibilities
○ Validating region configuration
○ Programming HDM decoders
Region Validation
1) The ordering of the XHB
CFMWS.InterleaveTargetList[] is
fixed by System Firmware, so
devices must be connected to
the correct host bridge
● Hardware
○ No 2.0 FPGAs available
○ 1.1 is limited use and not readily available.
● Internal Simulation
○ Delays/
○ Can’t work with community
● Prior art
○ nfit_test
○ QEMU CCIX patches
Pre-silicon State of the Art
● Hardware
○ No 2.0 FPGAs available
○ 1.1 is limited use and not readily available.
● Internal Simulation
○ Delays/
○ Can’t work with community
● Prior art
○ nfit_test
○ QEMU CCIX patches This Photo by Unknown Author is licensed under CC BY-SA
CXL Arch Review
CXL Host Bridge 1 CXL Host Bridge 2
CXL 2.0 RP appears
PCIe RP CXL 2.0 RP as PCIe CPU
RP
CXL1.1 DP
PCIe
CXL 1.1
CXL 2.0
CXL
CXL
CXL
hierarchy
hierarchies
device
PCIe
Empty Slot CXL Upstream Switch Port,
Hot add
Appears as PCIe USP
capable
RCiEP RCiEP
CXL 2.0 D0 F0 Or RP
Switch CXL
CXL DSP DVSEC
Appears as PCIe DSP
PCIe DSP CXL 1.1 Device
PCI
e
EP
CXL 2.0
Device
D0 F0
device
PCIe
CXL
DVSEC
PCIe in QEMU
(0:31.1)
RCiEP
Q35 Host bridge
(0:31.0)
RCiEP
What we all know and love Root Port
(0:0.0) Root Port
(0:7.0) (0:1c.0)
Endpoint Endpoint
(20:0.0) (30:0.0)
PCIe ~2014
Endpoint upstream
Endpoint Endpoint
(10:0.0) down down (60:0.0) (90:0.0)
Endpoint Endpoint
(20:0.0) (30:0.0)
● CXL Type 3 device
○ hw/mem/cxl_type3.c
● CXL Root Port
○ hw/pci-bridge/cxl_root_port.c CXL eXpander Bridge
type3 type3
(60:0.0) (90:0.0)
● NVDIMM & PCI had a
baby…
● Inherits from both interfaces
● Mailbox handling
cxl-mailbox-
cxl_pci.ko
libcxl utils
ndctl/cxl Linux • Mailbox QEMU
• IOCTL • Mailbox
MMIO
MMIO
November 10th 2020
QEMU
patches
Linux submitted
Patches
submitted
Spec
Released
● QEMU v3 patches sent
○ v4 is ready for submission
○ Community contributions for DOE, CDAT, and SPDM
○ High bar for adding more
■ Nothing exists quite like a CXL memory device
● Volatile + Persistent capacities
● Interleaving at multiple levels
● Linux phase 1 driver merged in 5.12
○ Phase 2 actively being developed.
○ Community contributions for DOE and CDAT
● Spec issues found and the fixes prototyped in QEMU
Future
CXL RUST FTW!!!!!!!!!!!
https://jobs.intel.com/ShowJob/Id/3089754/Linux-Kernel-Development-Engineer
• Linux • QEMU
• DPA mapping (WIP) • Better tests
• Libnvdimm integration • Upstream/Downstream Ports
• Interleave Support
• Interleave • Host bridge
• Provisioning • switch
• Recognition • More firmware commands
• Hotplug • Hotplug support
• Hot add • Error testing
• Managed remove • Interrupt support
• Asynchronous mailbox • Memory class device overhaul
• -----------------------
• Userspace • Make Q35 CXL capable
• Testing • CXL type 1 and 2 devices
• CXL 1.1