Professional Documents
Culture Documents
Release 19.04
Acknowledgments:
1
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Revision History
2
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Contents
Contents ................................................................................................................... 3
Audience and Purpose ................................................................................................. 4
Test setup ................................................................................................................. 5
Target Configuration .............................................................................................. 5
Initiator 1 Configuration ......................................................................................... 6
Initiator 2 Configuration ......................................................................................... 6
BIOS settings ....................................................................................................... 6
Kernel & BIOS spectre-meltdown information ........................................................... 7
Introduction to SPDK NVMe-oF (Target & Initiator) ......................................................... 8
19.04 NVMe-OF New features .................................................................................... 10
SRQ .................................................................................................................. 10
Test Case 1: SPDK NVMe-oF Target I/O core scaling ..................................................... 11
4k Random Read results ...................................................................................... 14
4k Random Write results ...................................................................................... 16
4k Random Read-Write results .............................................................................. 18
Large Sequential I/O Performance ......................................................................... 20
Test Case 2: SPDK NVMe-oF Initiator I/O core scaling................................................... 22
4k Random Read results ...................................................................................... 25
4k Random Write results ...................................................................................... 27
4k Random Read-Write results .............................................................................. 29
Test Case 3: Linux Kernel vs. SPDK NVMe-oF Latency ................................................... 31
SPDK vs Kernel Target results............................................................................... 34
SPDK vs Kernel Initiator results ............................................................................ 36
Test Case 4: NVMe-oF Performance with increasing # of connections ............................. 38
4k Random Read results ...................................................................................... 40
4k Random Write results ...................................................................................... 41
4k Random Read-Write results .............................................................................. 42
Summary ................................................................................................................ 44
Appendix A .............................................................................................................. 45
3
SPDK NVMe-oF RDMA Performance Report
Release 19.04
The purpose of reporting these tests is not to imply a single “correct” approach, but rather to provide a
baseline of well-tested configurations and procedures that produce repeatable results. This report can
also be viewed as information regarding best known method/practice when performance testing SPDK
NVMe-oF (Target & Initiator).
4
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Test setup
Target Configuration
Item Description
Server Platform SuperMicro SYS-2029U-TN24R4T
CPU Intel® Xeon® Platinum 8180 Processor (38.5MB L3, 2.50 GHz)
Number of cores 28, number of threads 56
Memory Total 376 GBs
Operating System Fedora 28
BIOS 2.1a 08/23/2018
Linux kernel version 5.0.9-100.fc28
SPDK version SPDK 19.04 (f19fea15c)
Storage OS: 1x 120GB Intel SSDSC2BB120G4
Storage Target: 16x Intel® P4600TM P4600x 2.0TB (FW: QDV10150)
(8 on each CPU socket)
NIC 2x 100GbE Mellanox ConnectX-5 NICs. Both ports connected.
1 NIC per CPU socket.
5
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Initiator 1 Configuration
Item Description
Server Platform SuperMicro SYS-2028U TN24R4T+
CPU Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz (55MB Cache, 2.20 GHz)
Number of cores 22, number of threads 44 per socket (Both sockets populated)
Memory Total 64GBs
Operating System Fedora 28
BIOS 3.1 06/08/2018
Linux kernel version 5.0.9-100.fc28
SPDK version SPDK 19.04 (f19fea15c)
Storage OS: 1x 240GB INTEL SSDSC2BB240G6
NIC 1x 100GbE Mellanox ConnectX-4 NIC. Both ports connected to Target server.
(connected to CPU socket 0)
Initiator 2 Configuration
Item Description
Server Platform SuperMicro SYS-2028U TN24R4T+
CPU Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz (55MB Cache, 2.20 GHz)
Number of cores 22, number of threads 44 per socket (Both sockets populated)
Memory Total 64GBs
Operating System Fedora 28
BIOS 3.1 06/08/2018
Linux kernel version 5.0.9-100.fc28
SPDK version SPDK 19.04 (f19fea15c)
Storage OS: 1x 240GB INTEL SSDSC2BB240G6
NIC 1x 100GbE Mellanox ConnectX-4 NIC. Both ports connected to Target server.
(connected to CPU socket 0)
BIOS settings
Item Description
BIOS Hyper threading Enabled
(Applied to all 3 systems) CPU Power and Performance Policy <Performance>
CPU C-state No Limit
CPU P-state Enabled
Enhanced Intel® SpeedStep® Tech Enabled
Turbo Boost Enabled
6
SPDK NVMe-oF RDMA Performance Report
Release 19.04
7
SPDK NVMe-oF RDMA Performance Report
Release 19.04
The SPDK NVMe-oF target and initiator uses the Infiniband/RDMA verbs API to access an RDMA-capable
NIC. These should work on all flavors of RDMA transports, but are currently tested against RoCEv2,
iWARP, and Omni-Path NICs. Similar to the SPDK NVMe driver, SPDK provides a user-space, lockless,
polled-mode NVMe-oF initiator. The host system uses the initiator to establish a connection and submit
I/O requests to an NVMe subsystem within an NVMe-oF target. NVMe subsystems contain namespaces,
each of which maps to a single block device exposed via SPDK’s bdev layer. SPDK’s bdev layer is a block
device abstraction layer and general purpose block storage stack akin to what is found in many
operating systems. Using the bdev interface completely decouples the storage media from the front-end
protocol used to access storage. Users can build their own virtual bdevs that provide complex storage
services and integrate them with the SPDK NVMe-oF target with no additional code changes. There can
be many subsystems within an NVMe-oF target and each subsystem may hold many namespaces.
Subsystems and namespaces can be configured dynamically via a JSON-RPC interface.
Figure 1 shows a high level schematic of the systems used for testing in the rest of this report. The set up
consists of three individual systems (two used as initiators and one used as the target). The NVMe-oF
target is connected to both initiator systems point-to-point using QSFP28 cables without any switches.
The target system has sixteen Intel P4600 SSDs which were used as block devices for NVMe-oF
subsystems and two 100GbE Mellanox ConnectX-5 NICs connected to provide up to 200GbE of network
bandwidth. Each Initiator system has one Mellanox ConnectX-4 100GbE NIC connected directly to the
target without any switch.
One goal of this report was to make clear the advantages and disadvantages inherent to the design of
the SPDK NVMe-oF components. These components are written using techniques such as run-to
completion, polling, and asynchronous I/O. The report covers four real-world use cases.
For performance benchmarking the fio tool is used with two storage engines:
1) Linux Kernel libaio engine
2) SPDK bdev engine
Performance numbers reported are aggregate I/O per second, average latency, and CPU utilization as a
percentage for various scenarios. Aggregate I/O per second and average latency data is reported from
fio and CPU utilization was collected using sar (systat).
8
SPDK NVMe-oF RDMA Performance Report
Release 19.04
9
SPDK NVMe-oF RDMA Performance Report
Release 19.04
SRQ
Support for per-device Shared Receive Queues in the RDMA transport layer has been added in 19.04
release. It is enabled by default when creating a RDMA transport layer for any device that supports it.
For purpose of creating this and any future SPDK NVMe-oF RDMA report this feature will be enabled.
SRQ reduces the number of communication buffers and improves memory utilization: Instead of using
multiple per-sender queues a single receive queue is used, with its own set of buffers. Enabling SRQ
comes with a slight performance drop as more operations need to be performed on incoming data in
order to properly distribute. Performance is 1-10% lower than for a configuration without SRQ and is
dependent on workload. Example of such drop was presented in Test Case 1 results.
10
SPDK NVMe-oF RDMA Performance Report
Release 19.04
For detailed configuration please refer to table below. Actual configuration of SPDK NVMe-oF Target in
test was done using JSON-RPC and the table contains a sequence of commands used by
spdk/scripts/rpc.py script rather than a configuration file. SPDK NVMe-oF Initiator (bdev fio_plugin) still
uses plain configuration files.
Each workload was run three times at each CPU count and the reported results are the average of the 3
runs. For workloads which need preconditioning (4KB rand write and 4KB 70% Read 30% write we ran
preconditioning once before running all of the workload to ensure that NVMe devices reached higher
IOPS so that we can saturate the network .
Item Description
Test Case Test SPDK NVMe-oF Target I/O core scaling
SPDK NVMe-oF All of below commands are executed with spdk/scripts/rpc.py script.
Target
configuration construct_nvme_bdev -t PCIe -b Nvme0 -a 0000:60:00.0
construct_nvme_bdev -t PCIe -b Nvme1 -a 0000:61:00.0
construct_nvme_bdev -t PCIe -b Nvme2 -a 0000:62:00.0
construct_nvme_bdev -t PCIe -b Nvme3 -a 0000:63:00.0
construct_nvme_bdev -t PCIe -b Nvme4 -a 0000:64:00.0
construct_nvme_bdev -t PCIe -b Nvme5 -a 0000:65:00.0
construct_nvme_bdev -t PCIe -b Nvme6 -a 0000:66:00.0
construct_nvme_bdev -t PCIe -b Nvme7 -a 0000:67:00.0
construct_nvme_bdev -t PCIe -b Nvme8 -a 0000:b5:00.0
construct_nvme_bdev -t PCIe -b Nvme9 -a 0000:b6:00.0
construct_nvme_bdev -t PCIe -b Nvme10 -a 0000:b7:00.0
construct_nvme_bdev -t PCIe -b Nvme11 -a 0000:b8:00.0
construct_nvme_bdev -t PCIe -b Nvme12 -a 0000:b9:00.0
construct_nvme_bdev -t PCIe -b Nvme13 -a 0000:ba:00.0
construct_nvme_bdev -t PCIe -b Nvme14 -a 0000:bb:00.0
construct_nvme_bdev -t PCIe -b Nvme15 -a 0000:bc:00.0
11
SPDK NVMe-oF RDMA Performance Report
Release 19.04
nvmf_create_transport -t RDMA
(creates RDMA transport layer with default values:
trtype: "RDMA"
max_queue_depth: 128
max_qpairs_per_ctrlr: 64
in_capsule_data_size: 4096
max_io_size: 131072
io_unit_size: 8192
max_aq_depth: 128
num_shared_buffers: 4096
buf_cache_size: 32)
12
SPDK NVMe-oF RDMA Performance Report
Release 19.04
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode9 -t rdma -f ipv4 -s 4420 -a 10.0.0.1
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode10 -t rdma -f ipv4 -s 4420 -a 10.0.0.1
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode11 -t rdma -f ipv4 -s 4420 -a 10.0.0.1
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode12 -t rdma -f ipv4 -s 4420 -a 10.0.0.1
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode13 -t rdma -f ipv4 -s 4420 -a 10.0.1.1
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode14 -t rdma -f ipv4 -s 4420 -a 10.0.1.1
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode15 -t rdma -f ipv4 -s 4420 -a 10.0.1.1
nvmf_subsystem_add_listener nqn.2018-09.io.spdk:cnode16 -t rdma -f ipv4 -s 4420 -a 10.0.1.1
FIO.conf
[global]
ioengine=/tmp/spdk/examples/bdev/fio_plugin/fio_plugin
spdk_conf=/tmp/spdk/bdev.conf
thread=1
group_reporting=1
direct=1
norandommap=1
rw=randrw
rwmixread={100, 70, 0}
bs=4k
iodepth={1, 8, 16, 32}
time_based=1
ramp_time=60
runtime=300
[filename0]
filename=Nvme0n1
[filename1]
filename=Nvme1n1
[filename2]
filename=Nvme2n1
[filename3]
filename=Nvme3n1
[filename4]
filename=Nvme4n1
[filename5]
filename=Nvme5n1
[filename6]
filename=Nvme6n1
[filename7]
filename=Nvme7n1
13
SPDK NVMe-oF RDMA Performance Report
Release 19.04
5 cores Not run Not run Not run 16280.91 4167.9 122.6
6 cores Not run Not run Not run 16713.24 4278.6 119.4
14
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 2: SPDK NVMe-oF Target I/O core scaling: IOPS vs. Latency while running 4KB 100% Random read workload
Test Result: 4K 100% Random Read IOPS QD=64, SRQ Disabled vs SRQ Enabled
SRQ Enabled vs
SPDK v19.04, SRQ Disabled SPDK v19.04, SRQ Enabled Disabled Relative
Performance
# of BW Throughput Avg. Lat. BW Throughput Avg. Lat. BW IOPS Avg.
Cores (MBps) (IOPS k) (usec) (MBps) (IOPS k) (usec) (%) (%) Lat. (%)
4
21779.99 5575.7 183.4 19874.40 5087.8 201.3 91.25 91.25 109.72
cores
5
23259.03 5954.3 171.7 21560.06 5519.4 185.5 92.70 92.70 107.98
cores
6
24056.69 6158.5 166.0 21648.04 5541.9 184.9 89.99 89.99 111.38
cores
15
SPDK NVMe-oF RDMA Performance Report
Release 19.04
5 cores Not run Not run Not run 21785.05 5577.0 91.3
6 cores Not run Not run Not run 22466.59 5751.4 88.6
16
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 3: SPDK NVMe-oF Target I/O core scaling: IOPS vs. Latency while running 4KB 100% Random write workload
17
SPDK NVMe-oF RDMA Performance Report
Release 19.04
5 cores Not run Not run Not run 10224.87 2617.6 195.3
6 cores Not run Not run Not run 10242.21 2622.0 195.0
18
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 4: SPDK NVMe-oF Target I/O core scaling: IOPS vs. Latency while running 4KB Random 70% read 30% write workload
Conclusion:
1. For 100% Random Reads, throughput scales up and latency decreases almost linearly with the
scaling of SPDK NVMe-oF Target I/O cores until transitioning from 4 to 5 CPU cores, when
benefit of adding another core is halved. Increasing from 5 to 6 CPU cores does not show any
benefit as we have reached network saturation.
2. For 100% Random Writes, the results are almost the same as for 100% Random Reads, except
that saturation is reached quicker – at 4 CPU cores. Increasing to 5 or 6 CPU cores does not
improve results.
3. For 4K Random 70/30 Reads/Writes, performance doesn’t scale from 3 to more cores due to
some other bottleneck. It was observed that while running this test case locally without
involving any network it can hit > 4M aggregate IOPS. But, while running this over the network,
it could only achieve 3.8M IOPS max. This points to some other platform or network bottleneck
while running this workload.
4. The SRQ transport layer feature that was introduced in 19.04 caused a slight performance
degradation vs. 19.01.1. SRQ is enabled by default in 19.04 can cause up to 10% of performance
degradation. However, SRQ decreases memory utilization significantly so the small degradation
in the IOPS and/or latency is a worthy trade-off. The benefits of using SRQ are realized when the
NVMe-oF target has a high number of connections, which is not the case, nor aim of this
document.
19
SPDK NVMe-oF RDMA Performance Report
Release 19.04
20
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Conclusion:
1. A single CPU core saturated the network bandwidth. The SPDK NVMe-oF target running on 1
core does close to 24 GBps 100% Writes and 25 GBps of 100% Reads. The 70-30 Read/Write
workload bandwidth was 29-30GBps which means 2x 100GbE NICs network bandwidth was
saturated. Therefore, adding more CPU cores did not result in increased performance for these
workloads because the network was the bottleneck.
21
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Depending on number of initiators and initiator cores, we varied the number of NVMe-oF subsystems
exported by the target, instead of exposing all 16 NVMe-oF subsystems. This was done to avoid a
situation where single initiator would connect to all available 16 target subsystems, which resulted in
unnatural latency spike in the 1 CPU test case.
2 cores: 2 separate initiators, each running on a single core. Each initiator connected to 4 subsystems.
3 cores: 2 separate initiators, the first one running on 1 core and other one running on 2 cores. Initiator
1 connected to 4 subsystems and initiator 2 connected to 8 subsystems.
4 cores: 2 separate initiators, both running 2 cores each. Both initiators connected to 8 subsystems.
Item Description
Test Case Test SPDK NVMe-oF InitiatorI/O core scaling
SPDK NVMe-oF Same as in Test Case #1, using 6 CPU cores.
Target
configuration
SPDK NVMe-oF BDEV.conf
Initiator 1 - FIO [Nvme]
plugin TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.0.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode1" Nvme0
configuration TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.0.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode2" Nvme1
TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.0.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode3" Nvme2
TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.0.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode4" Nvme3
TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.1.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode5" Nvme4
TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.1.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode6" Nvme5
TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.1.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode7" Nvme6
TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.1.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode8" Nvme7
FIO.conf
[global]
ioengine=/tmp/spdk/examples/bdev/fio_plugin/fio_plugin
spdk_conf=/tmp/spdk/bdev.conf
22
SPDK NVMe-oF RDMA Performance Report
Release 19.04
thread=1
group_reporting=1
direct=1
norandommap=1
rw=randrw
rwmixread={100, 70, 0}
bs=4k
iodepth={32, 64, 128, 256}
time_based=1
ramp_time=60
runtime=300
{filename_section}
FIO.conf
Similar as Initiator 1. Different filename section settings (below).
23
SPDK NVMe-oF RDMA Performance Report
Release 19.04
24
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Test Result: 4K 100% Random Read, QD=256, SPDK Target 4 CPU cores
SPDK v19.01.1 SPDK v19.04
Avg. Avg.
# of Bandwidth Throughput Bandwidth Throughput
Latency Latency
Cores (MBps) (IOPS k) (MBps) (IOPS k)
(usec) (usec)
1 core 4968.21 1271.9 195.8 5683.05 1454.9 156.3
Test Result: 4K 100% Random Read, QD=256, SPDK Target 6 CPU cores
SPDK v19.04
25
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 5: SPDK NVMe-oF Initiator I/O core scaling: IOPS vs. Latency while running 4KB 100% Random read workload
26
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Test Result: 4K 100% Random Write, QD=256, SPDK Target 4 CPU Cores
SPDK v19.01.1 SPDK v19.04
Avg. Avg.
# of Bandwidth Throughput Bandwidth Throughput
Latency Latency
Cores (MBps) (IOPS k) (MBps) (IOPS k)
(usec) (usec)
1 core 4826.91 1235.7 197.4 5305.06 1358.1 115.7
Test Result: 4K 100% Random Write, QD=256, SPDK Target 6 CPU Cores
SPDK v19.04
27
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 6: SPDK NVMe-oF Initiator I/O core scaling: IOPS vs. Latency while running 4KB 100% Random Write workload
28
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Test Result: 4K 70% Random Read 30% Random Write QD=256, SPDK Target 6 CPU Cores
SPDK v19.04
29
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 7: SPDK NVMe-oF Initiator I/O core scaling: IOPS vs. Latency while running 4KB Random 70% read 30% write workload
Conclusion:
1. The 4K 100% Random Reads and Random Writes workloads IOPS scaled linearly with each CPU
core added to the initiator. At 4 CPU cores the workload is close to saturating the network
bandwidth.
2. The 4K 70% Random Reads / 30% Random Writes IOPS scale linearly from 1 to 2 CPUs. For 3 and
4 CPUs the performance increase is notlinear, and the results are similar to these from Test Case
1 which suggest some kind of network or platform bottleneck.
30
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Item Description
Test Case Linux Kernel vs. SPDK NVMe-oF Latency
Test configuration
SPDK NVMe-oF All of below commands are executed with spdk/scripts/rpc.py script.
Target
configuration nvmf_create_transport -t RDMA
(creates RDMA transport layer with default values:
trtype: "RDMA"
max_queue_depth: 128
max_qpairs_per_ctrlr: 64
in_capsule_data_size: 4096
max_io_size: 131072
io_unit_size: 8192
max_aq_depth: 128
num_shared_buffers: 4096
buf_cache_size: 32)
31
SPDK NVMe-oF RDMA Performance Report
Release 19.04
}
],
"hosts": [],
"subsystems": [
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nullb0",
"uuid": "621e25d2-8334-4c1a-8532-b6454390b8f9"
},
"enable": 1,
"nsid": 1
}
],
"nqn": "nqn.2018-09.io.spdk:cnode1"
}
]
}
FIO configuration
SPDK NVMe-oF BDEV.conf
[Nvme]
Initiator FIO plugin TransportId "trtype:RDMA adrfam:IPv4 traddr:20.0.0.1 trsvcid:4420 subnqn:nqn.2018-09.io.spdk:cnode1" Nvme0
configuration
FIO.conf
[global]
ioengine=/tmp/spdk/examples/bdev/fio_plugin/fio_plugin
spdk_conf=/tmp/spdk/bdev.conf
thread=1
group_reporting=1
direct=1
norandommap=1
rw=randrw
rwmixread={100, 70, 0}
bs=4k
iodepth=1
time_based=1
ramp_time=60
runtime=300
[filename0]
filename=Nvme0n1
Kernel initiator Device config
configuration Done using nvme-cli tool.
modprobe nvme-fabrics
nvme connect –n nqn.2018-09.io.spdk:cnode1 –t rdma –a 20.0.0.1 –s 4420
FIO.conf
[global]
ioengine=libaio
32
SPDK NVMe-oF RDMA Performance Report
Release 19.04
thread=1
group_reporting=1
direct=1
norandommap=1
rw=randrw
rwmixread={100, 70, 0}
bs=4k
iodepth=1
time_based=1
ramp_time=60
runtime=300
[filename0]
filename=/dev/nvme0n1
33
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 8: Average I/O Latency comparisons between SPDK and Kernel NVMe-oF Target for various workloads using the Kernel
Initiator
34
SPDK NVMe-oF RDMA Performance Report
Release 19.04
4K 100% Random 70%
19.33 50355 24.8 19.02 51736 27.9
Reads 30% Writes IOPS
Conclusion:
1. The SPDK NVMe-oF Target reduces the NVMe-oF average round trip I/O latency (reads/writes)
by up to 7 usec vs. the Linux Kernel NVMe-oF target. This is entirely software overhead.
2. Both Kernel and SPDK Target performance in this test case remains stable since 19.01.1 report.
35
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 9: Average I/O Latency Comparisons between SPDK and Kernel NVMe-oF Initiator for various workloads against SPDK
Target
36
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Conclusion:
1. SPDK NVMe-oF Initiator reduces the NVMe-oF SW overhead by up to 4x vs. the Linux Kernel
NVMe-oF Initiator.
37
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Item Description
Test Case NVMe-oF Target performance under varying # of connections
SPDK NVMe-oF Target Same as in Test Case #1, using 4 CPU cores.
configuration
Kernel NVMe-oF Target Target configuration file loaded using nvmet-cli tool.
configuration For detail configuration file contents please see Appendix A.
Kernel NVMe-oF Device config
Initiator #1 Performed using nvme-cli tool.
modprobe nvme-fabrics
nvme connect –n nqn.2018-09.io.spdk:cnode1 –t rdma –a 20.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode2 –t rdma –a 20.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode3 –t rdma –a 20.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode4 –t rdma –a 20.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode5 –t rdma –a 20.0.1.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode6 –t rdma –a 20.0.1.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode7 –t rdma –a 20.0.1.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode8 –t rdma –a 20.0.1.1 –s 4420
Kernel NVMe-oF Device config
Initiator #2 Performed using nvme-cli tool.
modprobe nvme-fabrics
nvme connect –n nqn.2018-09.io.spdk:cnode9 –t rdma –a 10.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode10 –t rdma –a 10.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode11 –t rdma –a 10.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode12 –t rdma –a 10.0.0.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode13 –t rdma –a 10.0.1.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode14 –t rdma –a 10.0.1.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode15 –t rdma –a 10.0.1.1 –s 4420
nvme connect –n nqn.2018-09.io.spdk:cnode16 –t rdma –a 10.0.1.1 –s 4420
38
SPDK NVMe-oF RDMA Performance Report
Release 19.04
norandommap=1
rw=randrw
rwmixread={100, 70, 0}
bs=4k
iodepth={8, 16, 32, 64, 128}
time_based=1
ramp_time=60
runtime=300
numjobs={1, 4, 16, 32}
[filename1]
filename=/dev/nvme0n1
[filename2]
filename=/dev/nvme1n1
[filename3]
filename=/dev/nvme2n1
[filename4]
filename=/dev/nvme3n1
[filename5]
filename=/dev/nvme4n1
[filename6]
filename=/dev/nvme5n1
[filename7]
filename=/dev/nvme6n1
[filename8]
filename=/dev/nvme7n1
Number of CPU cores used while running SPDK NVMe-oF target were 4, whereas for the case of Linux
Kernel NVMe-oF target there was no CPU core limitation applied.
Numbers in the graph represent relative performance which are in terms of IOPS/core which was
calculated based on total aggregate IOPS divided by total CPU cores used while running that specific
workload. For the case of Kernel NVMe-oF target, total CPU cores was calculated from % CPU utilization
which was measured using sar utility in Linux.
39
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 10: Relative Performance Comparison of Linux Kernel vs. SPDK NVMe-oF Target for 4K 100% Random Reads using the
Kernel Initiator
40
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 11: Relative Performance Comparison of Linux Kernel vs. SPDK NVMe-oF Target for 4K 100% Random Writes
Note: Drives were not pre-conditioned while running 100% Random write I/O Test
41
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Figure 12: Relative Performance Comparison ofLinux Kernel vs. SPDK NVMe-oF Target for 4K Random 70% Reads 30% Writes
Linux Kernel NVMe-oF Target: 4K 70% Random Read 30% Random Write, QD=32
Kernel 4.20 Kernel 5.0.9
Avg. Avg.
Connections Bandwidth IOPS # CPU Bandwidth IOPS # CPU
Latency Latency
per subsystem (MBps) (k) Cores (MBps) (k) Cores
(usec) (usec)
1 9333.25 2389.3 46.2 8.4 9843.75 2520.0 34.4 8.7
4 19229.42 4922.7 121.0 21.1 19255.24 4929.3 77.9 21.6
SPDK NVMe-oF Target: 4K 70% Random Read 30% Random Write, QD=32
SPDK v19.01.1 SPDK v19.04
Avg. Avg.
Connections Bandwidth IOPS # CPU Bandwidt IOPS # CPU
Latency Latency
per subsystem (MBps) (k) Cores h (MBps) (k) Cores
(usec) (usec)
1 9909.67 2536.9 46.9 4 9689.06 2480.4 43.5 4
42
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Conclusion:
1. The performance of the 4K Random 70/30 Read/Write workload peaked when the number of
connections was 4 and iodepth of 4 per subsystem. This seem to be the optimum configuration
that yield the maximum performance for the 4K 100% Random Reads workload as well.
2. For the 4K Random Read workload, the Linux Kernel NVMe-oF target performance in IOPS, BW
and Latency improved, but at a cost of slightly higher CPU utilization. SPDK performance remains
similar to 19.01.1 for low number of connections, however, there was a slight performance
improvement observed at 16 and 32 connections. These changes are reflected in the chart:
relative performance for 1 and 4 connections is similar to last report, while it has risen for 16
and 32 connections. SPDK NVMe-oF target performance is up to 7 times better than kernel.
3. For the 4K 100% random writes workloads both SPDK and Kernel NVMe-oF target showed
increased performance for a single connection. As the number of connection increased, the
Linux Kernel NVMe-oF target performance did not change, but generally the CPU utilization
decreased slightly between Linux Kernel 4.20 and 5.0.9. SPDK NVMe-oF target performance has
improved since 19.01.1, and the relative performance is up to 9 times better than the Linux
Kernel NVMe-oF target.
4. For 70/30 Random Read/Write workload, the Linux Kernel overall performance increased
noticeably. This could probably be a result of higher random read results (IOPS, BW and Latency
values) and lower CPU utilization for random writes. SPDK performance is comparable to
19.01.1, and the relative performance is up to 5.8 times better than the Linux Kernel NVMe-oF
target.
43
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Summary
This report showcased performance results with SPDK NVMe-oF target and initiator under various test
cases, including:
It compared performance results while running Linux Kernel NVMe-oF (Target/Initiator) against the
accelerated polled-mode driven SPDK NVMe-oF (Target/Initiator) implementation. Like in the last
report, throughput scales up and latency decreases almost linearly with the scaling of SPDK NVMe-oF
target cores. In case of SDPK NVMe-OF Initiator core scaling it was observed that throughput scales
almost linearly, but the latency remains rather stable which might suggest that the Target side might be
able to process slightly more IO.
In case of 4k Random 70% Read 30% Write workload there was an unidentified bottleneck, which
prevents from confirming proper throughput and latency scaling for this workload.
It was also observed that in case of Kernel vs SPDK comparison, the SPDK NVMe-oF Target average
latency is up to 7 usec lower than Kernel when testing against null bdev based backend. The advantage
of SPDK is even greater when comparing NVMe-oF Initiators – SPDK average latency is up to 4 times
lower than Kernel Initiator.
SPDK NVMe-oF Target performed up to 6.5 times better w.r.t IOPS/core than Linux Kernel NVMe-oF
target while running 4K 100% random read workload with increasing number of connections (16) per
NVMe-oF subsystem.
This report provides information regarding methodologies and practices while benchmarking NVMe-oF
using SPDK, as well as the Linux Kernel. It should be noted that the performance data showcased in this
report is based on specific hardware and software configurations and that performance results may vary
depending on different hardware and software configurations.
44
SPDK NVMe-oF RDMA Performance Report
Release 19.04
Appendix A
Example Kernel NVMe-oF Target configuration for Test Case 4.
{
"ports": [
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.0.1",
"trsvcid": "4420",
"trtype": "rdma"
},
"portid": 1,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode1"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.0.1",
"trsvcid": "4421",
"trtype": "rdma"
},
"portid": 2,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode2"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.0.1",
"trsvcid": "4422",
"trtype": "rdma"
},
"portid": 3,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode3"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.0.1",
"trsvcid": "4423",
"trtype": "rdma"
},
"portid": 4,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode4"
45
SPDK NVMe-oF RDMA Performance Report
Release 19.04
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.1.1",
"trsvcid": "4424",
"trtype": "rdma"
},
"portid": 5,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode5"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.1.1",
"trsvcid": "4425",
"trtype": "rdma"
},
"portid": 6,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode6"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.1.1",
"trsvcid": "4426",
"trtype": "rdma"
},
"portid": 7,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode7"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "20.0.1.1",
"trsvcid": "4427",
"trtype": "rdma"
},
"portid": 8,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode8"
]
},
{
"addr": {
"adrfam": "ipv4",
46
SPDK NVMe-oF RDMA Performance Report
Release 19.04
"traddr": "10.0.0.1",
"trsvcid": "4428",
"trtype": "rdma"
},
"portid": 9,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode9"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "10.0.0.1",
"trsvcid": "4429",
"trtype": "rdma"
},
"portid": 10,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode10"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "10.0.0.1",
"trsvcid": "4430",
"trtype": "rdma"
},
"portid": 11,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode11"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "10.0.0.1",
"trsvcid": "4431",
"trtype": "rdma"
},
"portid": 12,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode12"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "10.0.1.1",
"trsvcid": "4432",
"trtype": "rdma"
},
"portid": 13,
47
SPDK NVMe-oF RDMA Performance Report
Release 19.04
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode13"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "10.0.1.1",
"trsvcid": "4433",
"trtype": "rdma"
},
"portid": 14,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode14"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "10.0.1.1",
"trsvcid": "4434",
"trtype": "rdma"
},
"portid": 15,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode15"
]
},
{
"addr": {
"adrfam": "ipv4",
"traddr": "10.0.1.1",
"trsvcid": "4435",
"trtype": "rdma"
},
"portid": 16,
"referrals": [],
"subsystems": [
"nqn.2018-09.io.spdk:cnode16"
]
}
],
"hosts": [],
"subsystems": [
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme0n1",
48
SPDK NVMe-oF RDMA Performance Report
Release 19.04
"uuid": "b53be81d-6f5c-4768-b3bd-203614d8cf20"
},
"enable": 1,
"nsid": 1
}
],
"nqn": "nqn.2018-09.io.spdk:cnode1"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme1n1",
"uuid": "12fcf584-9c45-4b6b-abc9-63a763455cf7"
},
"enable": 1,
"nsid": 2
}
],
"nqn": "nqn.2018-09.io.spdk:cnode2"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme2n1",
"uuid": "ceae8569-69e9-4831-8661-90725bdf768d"
},
"enable": 1,
"nsid": 3
}
],
"nqn": "nqn.2018-09.io.spdk:cnode3"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme3n1",
"uuid": "39f36db4-2cd5-4f69-b37d-1192111d52a6"
},
"enable": 1,
49
SPDK NVMe-oF RDMA Performance Report
Release 19.04
"nsid": 4
}
],
"nqn": "nqn.2018-09.io.spdk:cnode4"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme4n1",
"uuid": "984aed55-90ed-4517-ae36-d3afb92dd41f"
},
"enable": 1,
"nsid": 5
}
],
"nqn": "nqn.2018-09.io.spdk:cnode5"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme5n1",
"uuid": "d6d16e74-378d-40ad-83e7-b8d8af3d06a6"
},
"enable": 1,
"nsid": 6
}
],
"nqn": "nqn.2018-09.io.spdk:cnode6"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme6n1",
"uuid": "a65dc00e-d35c-4647-9db6-c2a8d90db5e8"
},
"enable": 1,
"nsid": 7
}
],
50
SPDK NVMe-oF RDMA Performance Report
Release 19.04
"nqn": "nqn.2018-09.io.spdk:cnode7"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme7n1",
"uuid": "1b242cb7-8e47-4079-a233-83e2cd47c86c"
},
"enable": 1,
"nsid": 8
}
],
"nqn": "nqn.2018-09.io.spdk:cnode8"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme8n1",
"uuid": "f12bb0c9-a2c6-4eef-a94f-5c4887bbf77f"
},
"enable": 1,
"nsid": 9
}
],
"nqn": "nqn.2018-09.io.spdk:cnode9"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme9n1",
"uuid": "40fae536-227b-47d2-bd74-8ab76ec7603b"
},
"enable": 1,
"nsid": 10
}
],
"nqn": "nqn.2018-09.io.spdk:cnode10"
},
{
51
SPDK NVMe-oF RDMA Performance Report
Release 19.04
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme10n1",
"uuid": "b9756b86-263a-41cf-a68c-5cfb23c7a6eb"
},
"enable": 1,
"nsid": 11
}
],
"nqn": "nqn.2018-09.io.spdk:cnode11"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme11n1",
"uuid": "9d7e74cc-97f3-40fb-8e90-f2d02b5fff4c"
},
"enable": 1,
"nsid": 12
}
],
"nqn": "nqn.2018-09.io.spdk:cnode12"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme12n1",
"uuid": "d3f4017b-4f7d-454d-94a9-ea75ffc7436d"
},
"enable": 1,
"nsid": 13
}
],
"nqn": "nqn.2018-09.io.spdk:cnode13"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
52
SPDK NVMe-oF RDMA Performance Report
Release 19.04
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme13n1",
"uuid": "6b9a65a3-6557-4713-8bad-57d9c5cb17a9"
},
"enable": 1,
"nsid": 14
}
],
"nqn": "nqn.2018-09.io.spdk:cnode14"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme14n1",
"uuid": "ed69ba4d-8727-43bd-894a-7b08ade4f1b1"
},
"enable": 1,
"nsid": 15
}
],
"nqn": "nqn.2018-09.io.spdk:cnode15"
},
{
"allowed_hosts": [],
"attr": {
"allow_any_host": "1",
"version": "1.3"
},
"namespaces": [
{
"device": {
"path": "/dev/nvme15n1",
"uuid": "5b8e9af4-0ab4-47fb-968f-b13e4b607f4e"
},
"enable": 1,
"nsid": 16
}
],
"nqn": "nqn.2018-09.io.spdk:cnode16"
}
]
}
53
SPDK NVMe-oF RDMA Performance Report
Release 19.04
DISCLAIMERS
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,
EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY
THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS,
INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY,
RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO
FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR
OTHER INTELLECTUAL PROPERTY RIGHT.
You may not use or facilitate the use of this document in connection with any infringement or other legal analysis
concerning Intel products described herein.
Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any of those factors may cause the
results to vary. You should consult other information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when combined with other products.
Intel® AES-NI requires a computer system with an AES-NI enabled processor, as well as non-Intel software to execute
the instructions in the correct sequence. AES-NI is available on select Intel® processors. For availability, consult your
reseller or system manufacturer. For more information, see http://software.intel.com/en-us/articles/intel-
advanced-encryption-standard-instructions-aes-ni/
The benchmark results may need to be revised as additional testing is conducted. The results depend on the
specific platform configurations and workloads utilized in the testing, and may not be applicable to any particular
user's components, computer system or workloads. The results are not necessarily representative of other
benchmarks and other benchmark results may show greater or lesser impact from mitigations.
Intel and the Intel logo are trademarks of Intel Corporation in the US and other countries
54