Jenkins CI Triggering Change Now In Effect
by Luse, Paul E
All-
So I mentioned earlier that we'd be making some changes this week - Karol just flipped the switch a few minutes ago so if all hell breaks loose it's his fault OK?
Just kidding of course - lots of great work went into this, but as I mentioned before it's difficult to fully test these changes before implementing them, for a variety of reasons. So please keep an eye on your patches and let us know if you see anything "odd", like a patch not getting run by Jenkins or getting run multiple times w/o any apparent reason, etc. Note that the queue on the Jenkins pool is deep right now because our new script dug up some older patches that slipped through w/ the previous method, so there may be a delay for the next few hours or so (hard to say).
Anyway, thanks again for your patience and please report anything that looks out of the ordinary!
Thx
Paul
2 years, 2 months
SPDK Euro Meeting 11/12: vhost-user target with vvu support
by Nikos Dragazis
Hi all,
Yesterday in the conference meeting I had the chance to talk about the
work I am doing here at Arrikto Inc. Let me give an overview for those
who missed it.
I am working on the SPDK vhost target. I am trying to extend the
vhost-user transport mechanism to enable deploying the SPDK vhost target
into a dedicated storage appliance VM instead of host user space. My end
goal is to have a setup where a storage appliance VM offers emulated
storage devices to compute VMs.
To make this more clear, the topology looks like this:
https://www.dropbox.com/s/gdskob7lgtlwlio/spdk_vhost_vvu_support.svg?dl=0
The code is here:
https://github.com/ndragazis/spdk
I think this is important for two reasons:
- In a cloud environment security really matters, so running the SPDK
vhost target inside a VM instead of in host user space is definitely
better in terms of security.
- It enables cloud users to create their own storage devices for their
compute VMs. This was not possible with the previous topology, because
running the SPDK vhost target in host user space could only be done by
the cloud provider. With this topology, users can create their own
custom storage devices because they can run the SPDK vhost app
themselves.
Getting into more detail about how it works:
Moving the vhost target from host user space to a VM creates three
issues with the vhost-user transport mechanism that need to be solved.
1. We need a way for the vhost-user messages to reach the SPDK vhost
target.
2. We need a mechanism for the SPDK vhost target to access the compute
VM's file-backed memory.
3. We need a way for the SPDK vhost target to interrupt the compute VM.
These are all solved by a special virtio device called
“virtio-vhost-user”. This device was created by Stefan Hajnoczi and is
described here:
https://wiki.qemu.org/Features/VirtioVhostUser
This device solves the above problems as follows:
1. it reads the messages from the unix socket and passes them into a
virtqueue. A user space driver in SPDK receives the messages from the
virtqueue. The received messages are then passed to the SPDK
vhost-user message handler.
2. it maps the vhost memory regions sent by the master in the
VHOST_USER_SET_MEM_TABLE message. The vvu device exposes those regions
to the guest as a PCI memory region.
3. it intercepts the VHOST_USER_SET_VRING_CALL messages and saves the
callfds for the virtqueues. For each virtqueue, it exposes a doorbell
to the guest. When this doorbell is kicked by the SPDK vvu driver, the
device kicks the corresponding callfd.
Changes in the API:
Currently, the rte_vhost library provides both transports, the one I
have added and the pre-existing one. The transport is selected with a
new command line option, "-T", when running the vhost app. This option
takes one of two values, "vvu" or "unix", which correspond to the two
transports. When the vvu transport is used, the "-S" option has to be
the PCI address of the virtio-vhost-user device. When the unix
transport is used, the "-S" option is the directory path where the
unix socket will be created.
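For example (a sketch only - the socket directory below is just an
illustration, and the vvu invocation mirrors step 3 of the guide
further down):
# unix transport: -S is the directory where the vhost-user socket is created
$ sudo app/vhost/vhost -T "unix" -S /var/tmp -m 0x3
# vvu transport: -S is the PCI address of the virtio-vhost-user device
$ sudo app/vhost/vhost -T "vvu" -S "0000:00:07.0" -m 0x3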
Step-by-step guide to test it yourself:
SPDK version: https://github.com/ndragazis/spdk
QEMU version: https://github.com/stefanha/qemu/tree/virtio-vhost-user
1. Compile QEMU and SPDK:
$ git clone -b virtio-vhost-user https://github.com/stefanha/qemu
$ (cd qemu && ./configure --target-list=x86_64-softmmu && make)
$ git clone https://github.com/ndragazis/spdk.git
$ cd spdk
$ git submodule update --init
$ ./configure
$ make
2. Launch the Storage Appliance VM:
$ ./qemu/x86_64-softmmu/qemu-system-x86_64 \
-machine q35,accel=kvm -cpu host -smp 2 -m 4G \
-drive if=none,file=image.qcow2,format=qcow2,id=bootdisk \
-device virtio-blk-pci,drive=bootdisk,id=virtio-disk1,bootindex=0,addr=04.0 \
-device virtio-scsi-pci,id=scsi0,addr=05.0 \
-drive file=scsi_disk.qcow2,if=none,format=qcow2,id=scsi_disk \
-device scsi-hd,drive=scsi_disk,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \
-drive file=nvme_disk.qcow2,if=none,format=qcow2,id=nvme_disk \
-device nvme,drive=nvme_disk,serial=1,addr=06.0 \
-chardev socket,id=chardev0,path=vhost-user.sock,server,nowait \
-device virtio-vhost-user-pci,chardev=chardev0,addr=07.0
The SPDK code needs to be accessible to the guest in the Storage
Appliance VM. A simple solution would be to mount the corresponding host
directory with sshfs, but it's up to you.
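For instance, a minimal sshfs sketch (assuming the guest can reach the
host over SSH as "user@host" and SPDK was built in /home/user/spdk on
the host - both are placeholders):
$ mkdir -p ~/spdk
$ sshfs user@host:/home/user/spdk ~/spdk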
3. Create the SPDK vhost SCSI target inside the Storage Appliance VM:
$ sudo modprobe vfio enable_unsafe_noiommu_mode=1
$ sudo modprobe vfio-pci
$ cd spdk
$ sudo scripts/setup.sh
$ sudo app/vhost/vhost -S "0000:00:07.0" -T "vvu" -m 0x3 &
$ sudo scripts/rpc.py construct_vhost_scsi_controller --cpumask 0x1 vhost.0
$ sudo scripts/rpc.py construct_virtio_pci_scsi_bdev 0000:00:05.0 VirtioScsi0
$ sudo scripts/rpc.py construct_nvme_bdev -b NVMe1 -t PCIe -a 0000:00:06.0
$ sudo scripts/rpc.py construct_malloc_bdev 64 512 -b Malloc0
$ sudo scripts/rpc.py add_vhost_scsi_lun vhost.0 0 VirtioScsi0t0
$ sudo scripts/rpc.py add_vhost_scsi_lun vhost.0 1 NVMe1n1
$ sudo scripts/rpc.py add_vhost_scsi_lun vhost.0 2 Malloc0
4. Launch the Compute VM:
$ ./qemu/x86_64-softmmu/qemu-system-x86_64 \
-M accel=kvm -cpu host -m 1G \
-object memory-backend-file,id=mem0,mem-path=/dev/shm/ivshmem,size=1G,share=on \
-numa node,memdev=mem0 \
-drive if=virtio,file=image.qcow2,format=qcow2 \
-chardev socket,id=chardev0,path=vhost-user.sock \
-device vhost-user-scsi-pci,chardev=chardev0
5. Ensure that the virtio-scsi HBA and the associated SCSI targets are
visible in the Compute VM:
$ lsscsi
I will submit the code for review in GerritHub soon. Hopefully, we can
get this upstream with your help!
Thanks,
Nikos
2 years, 2 months
RDMA QP leak on spdk nvmf target
by Potnuri Bharat Teja
Hi All,
With recent spdk code, RDMA QPs are not destroyed on the nvmf target after each
spdk perf run until the target is cleared/stopped.
The SPDK nvmf target uses drain WR completion logic to destroy the RDMA QP, but
even after the send/recv drain completions are received, the RDMA QP is not
destroyed because the rqpair->refcnt is 0 in spdk_nvmf_rdma_qpair_destroy().
I believe the rqpair refcnt needs to be incremented before calling
spdk_nvmf_rdma_qpair_destroy().
Here is my experimental patch that fixes the issue. Please let me know if this
qualifies for a patch.
--- a/lib/nvmf/rdma.c
+++ b/lib/nvmf/rdma.c
@@ -2586,9 +2595,10 @@ spdk_nvmf_rdma_poller_poll(struct spdk_nvmf_rdma_transport *rtransport,
case RDMA_WR_TYPE_DRAIN_RECV:
rqpair = SPDK_CONTAINEROF(rdma_wr, struct spdk_nvmf_rdma_qpair, drain_recv_wr);
assert(rqpair->disconnect_flags & RDMA_QP_DISCONNECTING);
SPDK_DEBUGLOG(SPDK_LOG_RDMA, "Drained QP RECV %u (%p)\n", rqpair->qpair.qid, rqpair);
rqpair->disconnect_flags |= RDMA_QP_RECV_DRAINED;
if (rqpair->disconnect_flags & RDMA_QP_SEND_DRAINED) {
+ spdk_nvmf_rdma_qpair_inc_refcnt(rqpair);
spdk_nvmf_rdma_qpair_destroy(rqpair);
}
/* Continue so that this does not trigger the disconnect path below. */
@@ -2596,9 +2606,10 @@ spdk_nvmf_rdma_poller_poll(struct spdk_nvmf_rdma_transport *rtransport,
case RDMA_WR_TYPE_DRAIN_SEND:
rqpair = SPDK_CONTAINEROF(rdma_wr, struct spdk_nvmf_rdma_qpair, drain_send_wr);
assert(rqpair->disconnect_flags & RDMA_QP_DISCONNECTING);
SPDK_DEBUGLOG(SPDK_LOG_RDMA, "Drained QP SEND %u (%p)\n", rqpair->qpair.qid, rqpair);
rqpair->disconnect_flags |= RDMA_QP_SEND_DRAINED;
if (rqpair->disconnect_flags & RDMA_QP_RECV_DRAINED) {
+ spdk_nvmf_rdma_qpair_inc_refcnt(rqpair);
spdk_nvmf_rdma_qpair_destroy(rqpair);
}
/* Continue so that this does not trigger the disconnect path below. */
----
Thanks,
Bharat.
2 years, 2 months
Jenkins CI Status & Next Steps
by Luse, Paul E
All,
We're getting pretty close to considering our Jenkins instance ready to be our primary CI. We won't actually power down the Chandler system, though, until after the holidays. This week is a big one as we reconfigure how Jenkins identifies which patches need to be run. Long story short: there are a few different ways you can configure Jenkins to pick up a job from Gerrit, and we've discovered the hard way over the past few months that what we've been doing isn't exactly rock-solid. The mechanism the Jenkins folks offer to remedy the situation isn't supported by our Gerrit hosting service, as it needs a special plug-in on that end. So....
We think we will have a script that meets our needs ready within the next few days. We will be switching over to it sometime this week - in the event that we have to switch back and forth between triggering mechanisms, there may be patches that don't get immediately picked up by Jenkins. Please bear with us; this will be a transition-phase thing, if it even happens at all.
Will let y'all know when we believe Jenkins is operating the way we want it. In the meantime though feel free to ask questions on IRC or the dist list. Once we call it "production" we'll want issues like missed patches or whatever filed as GitHub issues.
Thanks for your patience and understanding!
Paul
2 years, 2 months
[CI] Opening Gerrit for external CI
by Sasha Kotchubievsky
Hi,
We (Mellanox) see big value in the SPDK project and have started to
invest development effort in bringing advanced HW offloads, mostly in
the NVMe-oF area but not limited to it. To support this development
effort, we're bringing up an internal CI. We would like to contribute
to SPDK testing and to detect and report issues before patches are
merged, for example by voting "-1"/"+1" on patches. Following the
discussions at the SPDK Dev Meetup in October 2018, we would like to
start integrating our CI with the upstream Gerrit. Can you grant our
CI access for fetching patches and reporting status?
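For reference, this is roughly the kind of access we have in mind (the
change/patchset numbers, account name and label below are placeholders;
the exact commands depend on how GerritHub is configured for the
project):
# fetch a specific patchset for testing
$ git fetch https://review.gerrithub.io/spdk/spdk refs/changes/NN/CHANGE/PATCHSET
$ git checkout FETCH_HEAD
# report the result back through the Gerrit SSH API
$ ssh -p 29418 ci-account@review.gerrithub.io gerrit review --label Verified=-1 --message build-failed CHANGE,PATCHSET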
Best regards
Sasha
2 years, 2 months
SPDK and non-SSD hard disk
by mostaan fereidooni
Hi
I would like to improve my C project's I/O speed, and one of my options is
the SPDK libraries. Unfortunately, there is an ambiguity that has not been
cleared up even after reading the SPDK documentation. According to the
documentation located at https://spdk.io/doc/index.html and its explanation
at https://spdk.io/doc/about.html, one of SPDK's cornerstones is the NVMe
driver, which has been designed to capitalize on the low latency and internal
parallelism of solid-state storage devices. In addition, the aforementioned
page has an "SSD Internals" section whose description is intended to help
software developers understand what is occurring inside the SSD. So it seems
that SPDK is designed for working with SSD storage devices, but I have doubts
about my conclusion. Any help reaching a definite answer on whether I can use
the SPDK libraries with my non-SSD hard disk would be appreciated.
2 years, 2 months
FreeBSD CI environment
by Sztyber, Konrad
Hi all,
I've recently pushed a patch (https://review.gerrithub.io/#/c/spdk/spdk/+/436319/) that adds parameters to the compilation/linking of unit tests so that unused functions are removed from the final executables. It's rather useful, because it allows omitting mocks for functions that aren't called by the tested code. However, the problem is that it breaks the CI on the FreeBSD machine, because the linker hits some kind of assertion and crashes. The default linker used there is pretty old (BFD 2.17.50 [FreeBSD] 2007-07-03), and I've found that installing gcc and using it to build SPDK fixes the problem. Would it be possible to update the FreeBSD machines and force Jenkins to use gcc on FreeBSD?
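For context, the general mechanism is dead-code elimination at link time, roughly along these lines (a simplified sketch with a made-up file name; the patch itself has the exact flags used):
$ gcc -ffunction-sections -fdata-sections -c ut_example.c -o ut_example.o   # place each function in its own section
$ gcc ut_example.o -Wl,--gc-sections -o ut_example                          # drop sections never referenced, so their callees need no mocks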
Thanks,
Konrad
2 years, 2 months
"Cannot allocate memory" error when creating VolumeStore
by Neil Shi
Dear Experts,
I'm setting up an NVMf target with SPDK, and my platform has 4 Intel NVMe SSDs attached, each 1TB in size. My first step is to create RAIDs and then volumestores, but I hit a "Cannot allocate memory" error when creating a VolumeStore; error logs are attached.
I did 5 tests:
Test1: Each NVMe drive was made into its own RAID. In this case, I can create 3 volumestores successfully; the fourth one fails.
Test2: NVMe1 and NVMe2 as raid1, NVMe3 as raid2, and NVMe4 as raid3. In this case, I can create 2 volumestores successfully; the 3rd fails.
Test3: NVMe1 and NVMe2 as raid1, and NVMe3 and NVMe4 as raid2. In this case, I can create 1 volumestore successfully; the 2nd fails.
Test4: NVMe1, NVMe2 and NVMe4 as raid1, and NVMe3 as raid2. In this case, I can create 1 volumestore successfully; the 2nd fails.
Test5: All 4 drives as one raid. In this case, no volumestore can be created.
So I'm confused: what is the restriction on the number of volumestores?
Thanks.
Neil
2 years, 2 months
DPDK's new memory management module
by Crane Chu
Hi,
DPDK introduced its new/dynamic memory management in release 18.05, and
SPDK now enables it by default. However, my application is not ready
for it, while the legacy mode still works fine.
So, can we have one more option in spdk_env_opts to enable "--legacy-mem"
for rte_eal_init()?
Thanks,
2 years, 2 months