RDMA QP recovery: follow-up on latest changes
by Philipp Skadorov
Hello Benjamin,
There has been a series of changes to the QP error recovery that I'd like to follow up on:
** c3756ae3 nvmf: Eliminate spdk_nvmf_rdma_update_ibv_qp
ibv_query_qp results in a write syscall into the SNIC driver, which means a context switch.
The original code called it only when the state was known to change; the cached value was used everywhere else.
** 65a512c6 nvmf/rdma: Combine spdk_nvmf_rdma_qp_drained and spdk_nvmf_rdma_recover
** 3bec6601 nvmf/rdma: Simplify spdk_nvmf_rdma_qp_drained
** a9b9f095 nvmf/rdma: Don't trigger error recovery on IBV_EVENT_SQ_DRAINED
De-allocating the resources of requests that the SNIC is still processing at the time of IBV_EVENT_QP_FATAL is too early.
The IBV specification requires that SPDK "wait for the Affiliated Asynchronous Last WQE Reached Event" before manipulating the QP state.
I would also assume it is unsafe to return the associated requests and their data buffers to the pools before the "Last WQE Reached" event is received.
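For illustration, here is a minimal sketch of the event ordering I have in mind, written against the plain ibverbs async-event API; mark_qpair_in_error() and release_outstanding_requests() are placeholder helpers for this sketch, not SPDK functions:

#include <infiniband/verbs.h>

static void
mark_qpair_in_error(void *qpair_ctx)
{
    /* Placeholder: only flag the qpair; real logic lives in lib/nvmf/rdma.c. */
    (void)qpair_ctx;
}

static void
release_outstanding_requests(void *qpair_ctx)
{
    /* Placeholder: return requests and their buffers to the pools. */
    (void)qpair_ctx;
}

static void
handle_async_event(struct ibv_context *ctx)
{
    struct ibv_async_event event;

    /* Blocks unless the context's async fd has been made non-blocking. */
    if (ibv_get_async_event(ctx, &event)) {
        return;
    }

    switch (event.event_type) {
    case IBV_EVENT_QP_FATAL:
        /* The QP has moved to the error state, but outstanding WQEs are
         * still owned by the SNIC, so only mark the qpair as failed here. */
        mark_qpair_in_error(event.element.qp->qp_context);
        break;
    case IBV_EVENT_QP_LAST_WQE_REACHED:
        /* No further completions will be generated for this QP, so the
         * requests and their buffers can now safely go back to the pools. */
        release_outstanding_requests(event.element.qp->qp_context);
        break;
    case IBV_EVENT_SQ_DRAINED:
        /* Informational; per the patch above, no recovery is triggered here. */
        break;
    default:
        break;
    }

    ibv_ack_async_event(&event);
}

In other words, IBV_EVENT_QP_FATAL would only flag the qpair, and the requests would go back to the pools once the Last WQE Reached event arrives.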
Thanks,
Philipp
2 years, 6 months
NVMeF host access control
by Shahar Salzman
Hi,
We are contemplating how access control should be implemented for NVMeF hosts.
We are not exposing real NVMe devices, but rather a virtualized address space from our user space appliance via our bdev driver. We currently expose a single namespace, but we are not sure this is the correct approach down the road.
By access control we mean that each host can only access the subset of namespaces it has permission to access.
As I see it, there are two approaches:
- Exposing a single subsystem and implementing access control ourselves (e.g. filtering each command by source IP).
- Exposing multiple subsystems, one per access control list (e.g. one or more IPs), and using the already-implemented whitelist (sketched below).
Which of these approaches makes more sense?
Can exposing multiple subsystems cause issues at scale (hundreds or thousands of subsystems, or more)?
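To make the second approach concrete, below is a rough sketch against SPDK's public nvmf API. It is only an illustration: exact signatures differ between SPDK versions, and as far as I understand the existing whitelist keys on host NQNs rather than IP addresses.

#include "spdk/nvmf.h"

/*
 * Create one subsystem per access-control group and whitelist a single host.
 * Namespaces (our bdevs) would then be added only to the subsystems that the
 * host is allowed to see.
 */
static struct spdk_nvmf_subsystem *
create_subsystem_for_host(struct spdk_nvmf_tgt *tgt, const char *subnqn,
                          const char *hostnqn)
{
    struct spdk_nvmf_subsystem *subsystem;

    subsystem = spdk_nvmf_subsystem_create(tgt, subnqn, SPDK_NVMF_SUBTYPE_NVME, 1);
    if (subsystem == NULL) {
        return NULL;
    }

    /* Reject hosts that are not explicitly whitelisted. */
    spdk_nvmf_subsystem_set_allow_any_host(subsystem, false);
    spdk_nvmf_subsystem_add_host(subsystem, hostnqn);

    return subsystem;
}

If I understand correctly, the discovery service would then report to each host only the subsystems that allow it, so a host never even sees namespaces outside its group.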
Shahar
2 years, 6 months
SPDK User Space Driver and NVMe-oF Performance Query
by Yue Zhu
Hi all,
I am collecting NVMe SSD performance numbers with the SPDK user space driver and with NVMe-oF, using SPDK's built-in *perf* tool. However, the results I gathered are hard to explain. I attach the results below.
io depth          1         4         16        64        256
*local* (MB/s)    518.90    1303.27   2924.06   3061.99   3061.99
*remote* (MB/s)   2701.31   5792.93   5813.61   5812.50   5698.49
Notes:
1. *local* means accessing the SSD locally via the SPDK user space driver; *remote* means accessing it remotely via SPDK NVMe-oF.
2. Both tests access the same SSD.
My confusion comes from the performance difference between the *local* and *remote* tests. I thought the performance of *local* and *remote* should be similar, or that *remote* might even be lower than *local*. I wonder if anyone has any ideas about why this happens.
I also attached the commands I used and the tests' output at the bottom.
Any hints would be appreciated. Thank you all in advance.
Best,
Yue
*The output of the local test:*
sudo ./perf -q 1 -o 65535 -w read -t 5 -c 2 -r 'trtype:PCIe traddr:0000:04:00.0'
Starting SPDK v18.10-pre / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: perf --no-shconf -c 2 --legacy-mem --file-prefix=spdk_pid37520 ]
EAL: Detected 40 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/spdk_pid37520/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
Attaching to NVMe Controller at 0000:04:00.0
Attached to NVMe Controller at 0000:04:00.0 [8086:0953]
Associating INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN ) with lcore 1
Initialization complete. Launching workers.
Starting thread on core 1
========================================================
                                                             Latency(us)
Device Information                                    :     IOPS      MB/s    Average      min       max
INTEL SSDPEDMD400G4 (PHFT620400TJ400BGN ) from core 1:   8302.60    518.90     120.43    23.52   3022.46
========================================================
Total                                                 :   8302.60    518.90     120.43    23.52   3022.46
*The output of the remote test:*
sudo ./perf -q 1 -o 65536 -w read -t 5 -c 2 -r 'trtype:RDMA adrfam:IPv4 traddr:inv09ib trsvcid:4420'
Starting SPDK v18.10-pre / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: perf --no-shconf -c 2 --no-pci --legacy-mem --file-prefix=spdk_pid3910 ]
EAL: Detected 40 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/spdk_pid3910/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
Attaching to NVMe over Fabrics controller at 10.10.16.9:4420: nqn.2016-06.io.spdk:cnode1
Attached to NVMe over Fabrics controller at 10.10.16.9:4420: nqn.2016-06.io.spdk:cnode1
Associating SPDK bdev Controller (SPDK00000000000001 ) with lcore 1
Initialization complete. Launching workers.
Starting thread on core 1
========================================================
                                                             Latency(us)
Device Information                                    :     IOPS      MB/s    Average      min       max
SPDK bdev Controller (SPDK00000000000001 ) from core 1:  43221.00   2701.31      23.11    22.11    154.14
========================================================
Total                                                 :  43221.00   2701.31      23.11    22.11    154.14
System info: CentOS 7.2; kernel version: 3.10.0-327.18.2.el7.x86_64
2 years, 6 months
New Patch for NVMf Transport Opts
by John Barnard
The previous patch (416570) I submitted for moving the nvmf target opts to the transport has an issue with backward compatibility. To resolve this issue, I have submitted a new patch (423329) that only contains the updates to store the target opts in the transport structure instead of the target structure. It still uses the same methods to create (i.e. add listener) and add a transport to the pollers. There are no changes to the nvmf_tgt app or RPC code. If this is acceptable, I will follow up with a second patch that adds the functions to independently create a transport and add it to a target (and its pollers), which is needed for the FC transport creation (a rough sketch of the intended usage is below). Both of these patches will maintain backward compatibility with the existing nvmf_tgt and RPC functions.
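For clarity, here is a rough sketch of the usage model the second patch would enable; the function and field names below are placeholders based on my reading of the patches, not the final API:

#include "spdk/nvmf.h"

static void
add_transport_done(void *cb_arg, int status)
{
    /* The transport now belongs to the target and is served by its pollers. */
    (void)cb_arg;
    (void)status;
}

static int
attach_rdma_transport(struct spdk_nvmf_tgt *tgt)
{
    /* Opts live on the transport rather than on the target. */
    struct spdk_nvmf_transport_opts opts = {
        .max_queue_depth = 128,
        .max_io_size = 131072,
        .io_unit_size = 131072,
        .in_capsule_data_size = 4096,
    };
    struct spdk_nvmf_transport *transport;

    /* Step 1: create the transport independently of any target. */
    transport = spdk_nvmf_transport_create("RDMA", &opts);
    if (transport == NULL) {
        return -1;
    }

    /* Step 2: hand the transport to the target; listeners are added as before. */
    spdk_nvmf_tgt_add_transport(tgt, transport, add_transport_done, NULL);
    return 0;
}

The existing nvmf_tgt flow could keep calling these two steps internally, which is how the backward compatibility mentioned above would be preserved.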
Thanks,
John
2 years, 6 months
Staying Connected with the SPDK Community
by Walker, Benjamin
Hi everyone,
I wanted to take a moment to give a brief overview of where all of the activity happens within the wider SPDK community, so that everyone can subscribe to whatever interests them. The first place, of course, is this mailing list, but everyone receiving this is already subscribed to it.
Bugs are currently tracked via GitHub issues and there is a ton of activity and
discussion happening there daily. You can get email notifications any time an
update is made by going to https://github.com/spdk/spdk and clicking "Watch" in
the top right. Much of the discussion is general bug fixing activity, but
occasionally explanations for fixes dive into some of the deeper concepts of
SPDK and can be fairly insightful.
Patches are submitted and reviewed through GerritHub (
https://review.gerrithub.io). You can subscribe to email notifications there
too, but beware that it is a very active project and your inbox may get flooded.
Most of the reviewers set up complex dashboards to look at reviews, but a very
simple one is the following:
https://review.gerrithub.io/q/project:spdk%252Fspdk+status:open
Most reviewers use queries to poll GerritHub for new and updated patches
(because it's more efficient than email interrupts!).
Backlogs and task tracking are handled by Trello (https://trello.com/spdk).
That's a good place to see what everyone is currently working on or what needs
to be done, but there isn't much design discussion occurring there.
The final place discussion occurs is on IRC, specifically #spdk on FreeNode.
Due to some spam-bot attacks on FreeNode, the #spdk channel only lets in registered users. You can register your IRC nickname with a command like:
/msg NickServ register <password> <email-address>
All of this information is available at http://www.spdk.io/community/ for
reference.
Thanks,
Ben
2 years, 6 months
NVMeoF Security
by Gruher, Joseph R
The NVMeoF spec includes two methods for authentication (section 6 in the 1.0a spec), but neither is actually implemented in the Linux kernel NVMeoF stack today. Some users have expressed concerns about security.
Does SPDK support NVMeoF authentication, and if so, which kind(s)? If not, do we plan to support it in the future?
Thanks,
Joe
2 years, 6 months
Inline/in-capsule data support for the SPDK NVMF host
by Potnuri Bharat Teja
Hi All,
Currently, SPDK supports inline/in-capsule data on the target side only.
Can this be added to the host as well? I would like to add it on the host side.
I am just starting with SPDK, so it would be nice if this topic were added to the Trello to-do board or a GitHub issue so we can discuss it further.
Thanks,
Bharat.
2 years, 6 months