Thanks for the explanation. I am not sure the problem is caused by the
amount of memory specified in the primary process: I have now tried
specifying the memory size only in the primary process (nvmf_tgt, with
-s 2048), as suggested, but I still get the same problem. All outputs are
attached at the bottom.
I will file a bug on GitHub to track this.
Thank you,
Yue
The output for the NVMe-oF target:
./nvmf_tgt -c nvmf.conf.in -i 1 -s 2048
Starting SPDK v19.01-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: nvmf -c 0x1 -m 2048 --base-virtaddr=0x200000000000
--file-prefix=spdk1 --proc-type=auto ]
EAL: Detected 40 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /run/user/1003/dpdk/spdk1/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
app.c: 612:spdk_app_start: *NOTICE*: Total cores available: 1
reactor.c: 298:_spdk_reactor_run: *NOTICE*: Reactor started on core 0
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
The output for perf:
./perf -q 64 -o 1048576 -w read -t 5 -r 'trtype:PCIe traddr:0000:04:00.0' -i 1 -c 4
Starting SPDK v19.01-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: perf -c 4 --base-virtaddr=0x200000000000
--file-prefix=spdk1 --proc-type=auto ]
EAL: Detected 40 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: SECONDARY
EAL: Multi-process socket
/run/user/1003/dpdk/spdk1/mp_socket_23851_100d594ccc17fe
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attached to NVMe Controller at 0000:04:00.0 [8086:0953]
after io_queue_size: 256
controller IO queue size 256 less than required
Consider using lower queue depth or small IO size because IO requests may be queued at the NVMe driver.
WARNING: Some requested NVMe devices were skipped
Associating INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN ) with lcore 2
Initialization complete. Launching workers.
Starting thread on core 2
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
starting I/O failed
========================================================
                                                                Latency(us)
Device Information                                   :      IOPS      MB/s    Average        min        max
INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN) from core 2 :   2631.40   2631.40   21298.32   13186.93   25910.60
========================================================
Total                                                :   2631.40   2631.40   21298.32   13186.93   25910.60
On Wed, Jan 9, 2019 at 6:50 AM Harris, James R <james.r.harris@intel.com> wrote:
I think the problem is that only the primary process can specify the
amount of memory to use. Secondary processes pull from the same pool of
memory specified by the primary process. So if you specify more memory for
nvmf_tgt, it should work.
We should probably print at least a warning message if a user specifies -s
in a secondary process. We would have to check it after rte_eal_init has
finished. Would you mind filing a bug in GitHub to track this?
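Something along these lines is what I have in mind (untested sketch; the
helper name and the mem_size_mb parameter are hypothetical, not existing
SPDK code):

#include <stdio.h>
#include <rte_eal.h>

/* Hypothetical helper: call after rte_eal_init() has finished.
 * mem_size_mb is whatever the user passed via -s; only the primary's
 * -s actually sizes the shared hugepage pool, so a -s passed to a
 * secondary process is silently ignored today. */
static void
warn_if_mem_size_ignored(int mem_size_mb)
{
        if (mem_size_mb > 0 && rte_eal_process_type() == RTE_PROC_SECONDARY) {
                fprintf(stderr, "WARNING: -s is ignored in a secondary "
                        "process; memory is allocated by the primary\n");
        }
}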
Thanks,
-Jim
On 1/8/19, 5:36 PM, "SPDK on behalf of Yue Zhu" <spdk-bounces@lists.01.org on behalf of yuezhu1103@gmail.com> wrote:
I noticed one thing: the problem disappears if the transfer size is
smaller (e.g., 64 KB or 256 KB). I am not sure whether this is a sign
that the secondary process is short of hugepages.
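(Back-of-the-envelope, assuming the controller's max transfer size is
128 KB, which I have not verified: each 1 MB read would be split into
8 child requests, so a queue depth of 64 needs 64 x 8 = 512 request
objects against an I/O queue of 256 entries, while 256 KB transfers
need only 64 x 2 = 128 and 64 KB transfers 64 x 1 = 64. That would be
consistent with the "IO requests may be queued at the NVMe driver"
warning in the perf output.)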
Yue
On Tue, Jan 8, 2019 at 10:56 AM Yue Zhu <yuezhu1103@gmail.com> wrote:
> Hi Tom,
>
> Many thanks for the hint. Since the NVMe-oF target is the primary process,
> I assigned it 256 MB of hugepages at initialization (with -s 256). For
> perf, I assigned 8192 MB of hugepages (with -s 1048576).
>
> However, I am still facing the same problem. Please feel free to correct
> me if this is not the right way to use the hugepages.
>
> The new command for running perf:
> ./perf -q 64 -o 1048576 -w read -t 2 -r 'trtype:PCIe traddr:0000:04:00.0' -s 1048576 -i 1
>
> The command for running NVMe-oF:
> ./nvmf_tgt -c nvmf.conf.in -i 1 -s 256
>
> Thanks again,
> Yue
>
>
> On Tue, Jan 8, 2019 at 10:14 AM Nabarro, Tom <tom.nabarro@intel.com> wrote:
>
>> Could it be that the primary is using all your hugepages? I think I've
>> experienced that before.
>>
>> -----Original Message-----
>> From: SPDK [mailto:spdk-bounces@lists.01.org] On Behalf Of Yue Zhu
>> Sent: Tuesday, January 8, 2019 5:24 AM
>> To: Storage Performance Development Kit <spdk@lists.01.org>
>> Subject: [SPDK] Sharing an NVMe SSD with Multiple Processes
>>
>> Hi all,
>>
>> I have some questions regarding sharing an NVMe SSD with multiple
>> processes. Any hints would be greatly appreciated.
>>
>> Multiple processes can share an SSD if they have the same shared memory
>> group ID. However, I noticed an error reported by the built-in perf tool
>> when running perf alongside the NVMe-oF target on the same node.
>>
>> The output of the NVMe-oF target (the configuration file is attached at
>> the bottom):
>> ./nvmf_tgt -c nvmf.conf.in -i 1
>> Starting SPDK v19.01-pre / DPDK 18.08.0 initialization...
>> [ DPDK EAL parameters: nvmf -c 0x1 --base-virtaddr=0x200000000000
>> --file-prefix=spdk1 --proc-type=auto ]
>> EAL: Detected 40 lcore(s)
>> EAL: Detected 2 NUMA nodes
>> EAL: Auto-detected process type: PRIMARY
>> EAL: Multi-process socket /run/user/1003/dpdk/spdk1/mp_socket
>> EAL: No free hugepages reported in hugepages-1048576kB
>> EAL: Probing VFIO support...
>> app.c: 612:spdk_app_start: *NOTICE*: Total cores available: 1
>> reactor.c: 298:_spdk_reactor_run: *NOTICE*: Reactor started on core 0
>> EAL: PCI device 0000:04:00.0 on NUMA socket 0
>> EAL: probe driver: 8086:953 spdk_nvme
>>
>> The output of perf:
>> ./perf -q 64 -o 1048576 -w read -t 10 -r 'trtype:PCIe traddr:0000:04:00.0' -i 1 -c 2
>> Starting SPDK v19.01-pre / DPDK 18.08.0 initialization...
>> [ DPDK EAL parameters: perf -c 2 --base-virtaddr=0x200000000000
>> --file-prefix=spdk1 --proc-type=auto ]
>> EAL: Detected 40 lcore(s)
>> EAL: Detected 2 NUMA nodes
>> EAL: Auto-detected process type: SECONDARY
>> EAL: Multi-process socket
>> /run/user/1003/dpdk/spdk1/mp_socket_138421_f1189dcc15745
>> EAL: Probing VFIO support...
>> Initializing NVMe Controllers
>> EAL: PCI device 0000:04:00.0 on NUMA socket 0
>> EAL: probe driver: 8086:953 spdk_nvme
>> Attached to NVMe Controller at 0000:04:00.0 [8086:0953]
>> after io_queue_size: 256
>> controller IO queue size 256 less than required
>> Consider using lower queue depth or small IO size because IO requests may be queued at the NVMe driver.
>> WARNING: Some requested NVMe devices were skipped
>> Associating INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN ) with lcore 1
>> Initialization complete. Launching workers.
>> Starting thread on core 1
>> starting I/O failed
>> starting I/O failed
>> starting I/O failed
>> starting I/O failed
>> starting I/O failed
>> starting I/O failed
>> starting I/O failed
>> starting I/O failed
>> ========================================================
>>                                                                 Latency(us)
>> Device Information                                   :      IOPS      MB/s    Average        min        max
>> INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN) from core 1 :   2547.60   2547.60   21983.72   15132.71   43174.97
>> ========================================================
>> Total                                                :   2547.60   2547.60   21983.72   15132.71   43174.97
>>
>> The 8 "starting I/O failed" messages (the last 8 of the 64 I/Os in the
>> first round of submission) come from spdk_nvme_ns_cmd_read_with_md()
>> returning -ENOMEM. It seems there is not enough memory for the submission
>> queue. However, the queue length is 256 (obtained via
>> spdk_nvme_ctrlr_get_default_io_qpair_opts()). I wonder if anyone has any
>> ideas about how to resolve this error.
>>
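>> For context, here is a minimal sketch of the submission path where the
>> error surfaces (simplified from what perf does; the buffer, LBA, and
>> callback names are illustrative, not the actual perf code):
>>
>> #include <errno.h>
>> #include <stdio.h>
>> #include "spdk/nvme.h"
>>
>> /* Allocate an I/O qpair, optionally asking for more request objects
>>  * than the default (large I/Os are split into child requests), then
>>  * submit queue_depth sequential reads. */
>> static void
>> submit_reads(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_ns *ns,
>>              void **bufs, int queue_depth, uint32_t lba_count,
>>              spdk_nvme_cmd_cb io_complete)
>> {
>>     struct spdk_nvme_io_qpair_opts opts;
>>
>>     spdk_nvme_ctrlr_get_default_io_qpair_opts(ctrlr, &opts, sizeof(opts));
>>     opts.io_queue_requests = 2048; /* headroom for split 1 MB I/Os */
>>
>>     struct spdk_nvme_qpair *qpair =
>>         spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, &opts, sizeof(opts));
>>     if (qpair == NULL) {
>>         fprintf(stderr, "qpair allocation failed\n");
>>         return;
>>     }
>>
>>     for (int i = 0; i < queue_depth; i++) {
>>         int rc = spdk_nvme_ns_cmd_read(ns, qpair, bufs[i],
>>                                        (uint64_t)i * lba_count, lba_count,
>>                                        io_complete, NULL, 0);
>>         if (rc == -ENOMEM) {
>>             fprintf(stderr, "starting I/O failed\n"); /* what I see */
>>         }
>>     }
>> }
>>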
>> SPDK commit: 3947bc2492c7e66b7f9cb55b30857afc5801ee8d
>>
>> NVMe-oF configuration file:
>> [Nvmf]
>> AcceptorPollRate 10000
>>
>> [Transport]
>> Type RDMA
>>
>> [Nvme]
>> TransportID "trtype:PCIe traddr:0000:04:00.0" Nvme0
>>
>> RetryCount 4
>> TimeoutUsec 0
>> ActionOnTimeout None
>> AdminPollRate 100000
>>
>> HotplugEnable No
>>
>> [Subsystem1]
>> NQN nqn.2016-06.io.spdk:cnode1
>> Listen RDMA 10.10.16.9:4420
>> AllowAnyHost Yes
>> Host nqn.2016-06.io.spdk:init
>> SN SPDK00000000000001
>> MaxNamespaces 20
>> Namespace Nvme0n1 1
>>
>>
>> Thank you,
>> Yue
_______________________________________________
SPDK mailing list
SPDK@lists.01.org
https://lists.01.org/mailman/listinfo/spdk