Hi all,
I have some questions about sharing an NVMe SSD between multiple
processes; any hints would be greatly appreciated. As I understand it,
multiple processes can share an SSD if they are started with the same
shared memory group ID (the -i option). However, I noticed an error
reported by the built-in perf tool when running it alongside the NVMe-oF
target on the same node.
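To check that I am reading this right, here is a minimal sketch of how I
believe a process joins that shared memory group programmatically. The app
name is hypothetical, and shm_id is just the value that -i sets on the
command line below:

#include "spdk/env.h"

int
main(void)
{
	struct spdk_env_opts opts;

	spdk_env_opts_init(&opts);
	opts.name = "my_secondary_app";  /* hypothetical name */
	opts.shm_id = 1;                 /* same value passed as -i 1 below */

	if (spdk_env_init(&opts) < 0) {
		return -1;
	}
	/* ... probe and attach to the NVMe controller as usual ... */
	return 0;
}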
The output of the NVMe-oF target is (the configuration file is attached at the bottom):
./nvmf_tgt -c nvmf.conf.in -i 1
Starting SPDK v19.01-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: nvmf -c 0x1 --base-virtaddr=0x200000000000
--file-prefix=spdk1 --proc-type=auto ]
EAL: Detected 40 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /run/user/1003/dpdk/spdk1/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
app.c: 612:spdk_app_start: *NOTICE*: Total cores available: 1
reactor.c: 298:_spdk_reactor_run: *NOTICE*: Reactor started on core 0
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
The output of perf:
./perf -q 64 -o 1048576 -w read -t 10 -r 'trtype:PCIe traddr:0000:04:00.0'
-i 1 -c 2
Starting SPDK v19.01-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: perf -c 2 --base-virtaddr=0x200000000000
--file-prefix=spdk1 --proc-type=auto ]
EAL: Detected 40 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: SECONDARY
EAL: Multi-process socket
/run/user/1003/dpdk/spdk1/mp_socket_138421_f1189dcc15745
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attached to NVMe Controller at 0000:04:00.0 [8086:0953]
after io_queue_size: 256
controller IO queue size 256 less than required
Consider using lower queue depth or small IO size because IO requests may
be queued at the NVMe driver.
WARNING: Some requested NVMe devices were skipped
Associating INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN ) with lcore 1
Initialization complete. Launching workers.
Starting thread on core 1
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
========================================================
                                                                Latency(us)
Device Information                                    :      IOPS      MB/s   Average       min       max
INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN ) from core 1 :   2547.60   2547.60  21983.72  15132.71  43174.97
========================================================
Total                                                 :   2547.60   2547.60  21983.72  15132.71  43174.97
The eight "starting I/O failed" messages (the last 8 of the 64 I/Os in the
first round of submission) come from spdk_nvme_ns_cmd_read_with_md
returning -ENOMEM, which suggests there is not enough room in the I/O
submission queue. However, the I/O queue size reported by
spdk_nvme_ctrlr_get_default_io_qpair_opts() is 256. Does anyone have an
idea how to resolve this error? A sketch of how I understand the relevant
calls follows below.
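For reference, here is a minimal sketch of how I understand the qpair
options and the -ENOMEM path interact. This is not the actual perf code;
the retry-by-draining-completions loop and the io_queue_requests value are
just my assumptions:

#include <errno.h>
#include "spdk/env.h"
#include "spdk/nvme.h"

#define QUEUE_DEPTH 64                 /* matches perf -q 64 */
#define IO_SIZE     (1024 * 1024)      /* matches perf -o 1048576 */

static void
read_complete(void *arg, const struct spdk_nvme_cpl *cpl)
{
	int *outstanding = arg;

	(*outstanding)--;
}

static int
run_reads(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_ns *ns)
{
	struct spdk_nvme_io_qpair_opts opts;
	struct spdk_nvme_qpair *qpair;
	uint32_t lba_count = IO_SIZE / spdk_nvme_ns_get_sector_size(ns);
	int outstanding = 0;

	spdk_nvme_ctrlr_get_default_io_qpair_opts(ctrlr, &opts, sizeof(opts));
	/* In my run opts.io_queue_size comes back as 256. A 1 MiB read is
	 * split into several child requests, so I am guessing
	 * io_queue_requests is the limit actually being hit; the value
	 * below is an assumption. */
	opts.io_queue_requests = 4096;

	qpair = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, &opts, sizeof(opts));
	if (qpair == NULL) {
		return -1;
	}

	for (int i = 0; i < QUEUE_DEPTH; i++) {
		void *buf = spdk_dma_zmalloc(IO_SIZE, 0x1000, NULL);
		int rc = spdk_nvme_ns_cmd_read_with_md(ns, qpair, buf, NULL,
						       (uint64_t)i * lba_count,
						       lba_count, read_complete,
						       &outstanding, 0, 0, 0);
		if (rc == -ENOMEM) {
			/* This is the point where perf prints "starting I/O
			 * failed". Assumption: draining completions and
			 * retrying is how a full queue should be handled. */
			spdk_dma_free(buf);
			spdk_nvme_qpair_process_completions(qpair, 0);
			i--;
			continue;
		} else if (rc != 0) {
			spdk_dma_free(buf);
			break;
		}
		outstanding++;
	}

	while (outstanding > 0) {
		spdk_nvme_qpair_process_completions(qpair, 0);
	}
	spdk_nvme_ctrlr_free_io_qpair(qpair);
	return 0;
}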
SPDK commit: 3947bc2492c7e66b7f9cb55b30857afc5801ee8d
NVMe-oF configuration file:
[Nvmf]
AcceptorPollRate 10000
[Transport]
Type RDMA
[Nvme]
TransportID "trtype:PCIe traddr:0000:04:00.0" Nvme0
RetryCount 4
TimeoutUsec 0
ActionOnTimeout None
AdminPollRate 100000
HotplugEnable No
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Listen RDMA 10.10.16.9:4420
AllowAnyHost Yes
Host nqn.2016-06.io.spdk:init
SN SPDK00000000000001
MaxNamespaces 20
Namespace Nvme0n1 1
Thank you,
Yue