Bdev write order
by Leo Li
Hi,
I'm new to SPDK and have a few questions. I'm trying to use bdev on top of
an NVMe device and my program may issue multiple write requests. Is the
order of write requests guaranteed by SPDK? For example, assume the
function calls
spdk_bdev_write_blocks(channel1, block1 ....)
spdk_bdev_write_blocks(channel1, block2 ....)
spdk_bdev_write_blocks(channel1, block3 ....)
from the same thread.
Are those write requests guaranteed to complete in the order block1, block2, block3?
Leo
1 year, 11 months
Re: [SPDK] A question about Command ID
by Oscar.Huang@microchip.com
Thanks, Benjamin. I'll try to add a new API and try it in our environment first. If it works well, I'll share it with the community.
Jim's idea regarding CID is good for my use case. I'll try it as well.
Thanks
-Oscar
On Wed, 2019-04-10 at 01:24 +0000, Oscar.Huang at microchip.com wrote:
> Hi, Jim
>
> Thank you for your prompt response. I understand the design philosophy about
> hiding low level complexity from users. I'm trying to use SPDK to test SSD
> firmware, which might not be an intended use case of SPDK. Sometimes I need to
> know the exact bit pattern of the commands enqueued to SQ for verification. I
> may also try to create malformed commands on purpose to challenge my
> firmware.
> The two "raw" APIs, spdk_nvme_ctrlr_cmd_admin_raw and
> spdk_nvme_ctrlr_cmd_io_raw, should have a 1:1 mapping between calls to them
> and the actual commands sent to the device, right? It would be great if they
> could be extended this way:
>
>
> 1. After SPDK assigns a unique CID and populates the PRP/SGL fields
> for the command, copy the complete command back to the original
> command buffer the caller provided. The caller then knows the assigned CID
> and what the PRP/SGL fields look like.
>
> 2. It would be even better if SPDK let me manipulate the PRP/SGL fields
> directly. The caller does this at its own risk. As you pointed out, the
> caller must observe the MDTS/NOIOB/PRP restrictions on its own.
>
> I know this is not the normal use case of SPDK. But if you can share any
> comments or suggestions, it's much appreciated.
While using SPDK as a mechanism to test NVMe drives wasn't our initial intention
with SPDK, I think using a user space driver for testing is a fantastic solution
that I hope most device manufacturers adopt. If you were interested in adding a
new API that allows you to submit a raw command including the PRP/SGL, that
would certainly be considered for acceptance.
The CID problem is a bit harder. I spoke with Jim a bit and his most recent idea
was to add a function that attempted to look up a request by CID, so that when
we give you the CID in the timeout/abort callback, you can try to get to the
request. If the original request was split at all, this would fail, which I
think is fine for your use case. What do you think?
1 year, 11 months
Re: [SPDK] A question about Command ID
by Oscar.Huang@microchip.com
Hi, Jim
Thank you for your prompt response. I understand the design philosophy about hiding low level complexity from users. I'm trying to use SPDK to test SSD firmware, which might not be an intended use case of SPDK. Sometimes I need to know the exact bit pattern of the commands enqueued to SQ for verification. I may also try to create malformed commands on purpose to challenge my firmware.
The two "raw" APIs, spdk_nvme_ctrlr_cmd_admin_raw and spdk_nvme_ctrlr_cmd_io_raw, should have a 1:1 mapping between calls to them and the actual commands sent to the device, right? It would be great if they could be extended this way:
1. After SPDK assigns a unique CID and populates the PRP/SGL fields for the command, copy the complete command back to the original command buffer the caller provided. The caller then knows the assigned CID and what the PRP/SGL fields look like.
2. It would be even better if SPDK let me manipulate the PRP/SGL fields directly. The caller does this at its own risk. As you pointed out, the caller must observe the MDTS/NOIOB/PRP restrictions on its own.
I know this is not the normal use case of SPDK. But if you can share any comments or suggestions, it's much appreciated.
Thanks
-Oscar
1 year, 11 months
fw update on multiple controllers
by Nabarro, Tom
Hello
Is it possible to update firmware on multiple controllers on the same host in parallel through SPDK?
Best regards,
Tom Nabarro BEng (hons) MIET
Intel Corporation
HPDD Software Engineer
E: tom.nabarro(a)intel.com
M: +44 (0)7786 260986
Skype: tom.nabarro
1 year, 11 months
Blobstore threading
by Michael Haeuptle
Hello,
I've been experimenting with the blobstore and I'm not 100% sure I
understand the threading model.
Does the same blobstore support IO submissions from multiple threads?
For example, I've created a simple app that registers a poller on 4
reactors that each creates a blob, writes to it in 4KB chunks and once the
cluster size has been reached the blob is closed and deleted. After the
delete, the sequence repeats. This is happening in each of the pollers that
runs on a reactor.
I'm using the same blobstore for the above sequence and a Malloc backend
device.
The app sometimes cores immediately or after a couple of minutes. There
seems to be a race somewhere (not ruling out code I added).
So I was wondering if the above is a supported scenario or if the
blobstore only works on a single core.
Ultimately, I want to create and read blobs via a socket interface and
support a large number of parallel requests from multiple clients. For blob
reads, I was thinking of simply letting the blobstore read callback send the
chunk back over the socket to the client.
If the blobstore can only run on one core then I probably would have to
submit events to the blobstore thread from the socket threads or something
along those lines which could impact scalability.
Thanks.
-- Michael
1 year, 11 months
Re: [SPDK] A question about Command ID
by Harris, James R
On 4/9/19, 9:57 AM, "SPDK on behalf of Oscar.Huang(a)microchip.com" <spdk-bounces(a)lists.01.org on behalf of Oscar.Huang(a)microchip.com> wrote:
Hi,
I have a question about how to get the command ID of an NVMe command I submitted via the APIs spdk_nvme_ctrlr_cmd_xxx, spdk_nvme_ns_cmd_xxx, spdk_nvme_ctrlr_cmd_admin_raw or spdk_nvme_ctrlr_cmd_io_raw. The APIs don't let me specify the Command Identifier; instead, it is set internally by SPDK, and there seems to be no way to learn the Command Identifier that SPDK assigned. Some APIs need a CID as input, for example spdk_nvme_ctrlr_cmd_abort. If I don't know the CID of a previously submitted command, how can I abort it? Another use case is the timeout callback, spdk_nvme_timeout_cb, which is passed a CID when called. I'd like to know exactly which command timed out from that CID. How can I associate the CID with a previously submitted command if I don't know the CIDs of the commands I submitted?
Hi Oscar,
As you've noticed, there is currently no way to specify or retrieve the command ID (CID). This largely stems from the following:
* Command IDs must be unique. Allowing users to specify command IDs opens up a large surface of potential bugs.
* There is not always a 1-to-1 mapping between calls to the spdk_nvme APIs and actual commands sent to a namespace.
We made a conscious choice to hide a lot of complexity around MDTS, NOIOB and PRP from users of the SPDK NVMe driver, resulting in a number of cases where a user request may be split into multiple commands sent to the SSD. There is also the ability to queue requests inside of the NVMe driver when the submission queue is already full.
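To make the missing 1-to-1 mapping concrete, here is a minimal, self-contained sketch that counts how many commands a single request would turn into under a maximum transfer size and an optional I/O boundary. It is a simplified model, not SPDK code: the function name `count_child_commands` and its parameters are hypothetical, and the real driver also has to account for PRP constraints and queueing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, not an SPDK API. Counts the commands needed for one
 * request of `lba_count` blocks starting at `start_lba` when:
 *   - no command may exceed `max_blocks` blocks (an MDTS-like limit), and
 *   - no command may cross a multiple of `boundary_blocks` (a NOIOB-like
 *     boundary); 0 means there is no boundary.
 */
static uint32_t
count_child_commands(uint64_t start_lba, uint64_t lba_count,
                     uint64_t max_blocks, uint64_t boundary_blocks)
{
    uint32_t commands = 0;

    assert(max_blocks > 0);

    while (lba_count > 0) {
        /* Start with the largest chunk the transfer limit allows. */
        uint64_t chunk = lba_count < max_blocks ? lba_count : max_blocks;

        if (boundary_blocks != 0) {
            /* Shrink the chunk so it ends at the next boundary at most. */
            uint64_t to_boundary = boundary_blocks - (start_lba % boundary_blocks);
            if (chunk > to_boundary) {
                chunk = to_boundary;
            }
        }

        start_lba += chunk;
        lba_count -= chunk;
        commands++;
    }

    return commands;
}
```

With a 32-block limit and no boundary, a 100-block request becomes four commands in this model, so a single caller-visible request would be associated with four distinct CIDs.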
For spdk_nvme_ctrlr_cmd_abort(), you need the CID passed from the timeout callback. There is no other way to get a CID currently.
The SPDK NVMe driver could be extended to provide some additional insight into relationships between callback contexts and their CIDs. It would be predicated on the caller ensuring that it does not violate any MDTS/NOIOB/PRP restrictions and does not overflow the SQ. Is this something you'd be interested in tackling?
Thanks,
-Jim
Thanks
-Oscar
1 year, 11 months
A question about Command ID
by Oscar.Huang@microchip.com
Hi,
I have a question about how to get the command ID of an NVMe command I submitted via the APIs spdk_nvme_ctrlr_cmd_xxx, spdk_nvme_ns_cmd_xxx, spdk_nvme_ctrlr_cmd_admin_raw or spdk_nvme_ctrlr_cmd_io_raw. The APIs don't let me specify the Command Identifier; instead, it is set internally by SPDK, and there seems to be no way to learn the Command Identifier that SPDK assigned. Some APIs need a CID as input, for example spdk_nvme_ctrlr_cmd_abort. If I don't know the CID of a previously submitted command, how can I abort it? Another use case is the timeout callback, spdk_nvme_timeout_cb, which is passed a CID when called. I'd like to know exactly which command timed out from that CID. How can I associate the CID with a previously submitted command if I don't know the CIDs of the commands I submitted?
Thanks
-Oscar
1 year, 11 months
spdk_bdev_close() threading
by Stojaczyk, Dariusz
I was surprised to see that bdev descriptors can be closed only from the same thread that opened them. Vhost doesn't respect this rule. As expected, I was able to trigger assert(desc->thread == spdk_get_thread()) while closing a vhost scsi descriptor using the latest SPDK master. This could probably be fixed by always scheduling spdk_bdev_close() on the proper thread. Maybe vhost could even immediately assume the descriptor is closed and set its `desc` pointer to NULL without waiting for spdk_bdev_close() to actually be called. But why does the descriptor need to be closed from a specific thread in the first place? Would it be possible for spdk_bdev_close() to internally schedule itself on desc->thread?
A descriptor cannot be closed until all associated channels have been destroyed - that's what the bdev programming guide says. When there are multiple I/O channels, there has to be some multi-thread management involved. Also, those channels can't be closed until all their pending I/O has finished. So closing a descriptor will likely have the following flow:
external request (e.g. bdev hotremove or some RPC) -> start draining I/O on all threads -> destroy each I/O channel after its pending I/O has finished -> on the last thread to destroy a channel, schedule closing the desc on the proper thread -> close the desc
This additional scheduling of spdk_bdev_close() looks completely unnecessary. It also forces the upper layer to maintain a pointer to the desc thread somewhere, because desc->thread is private to the bdev layer. So, to repeat the question:
would it be possible for spdk_bdev_close() to internally schedule itself on desc->thread, so that spdk_bdev_close() can be called from any thread?
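The self-scheduling behaviour asked about above can be modelled with a small, self-contained sketch. This is a toy simulation and not SPDK code: `toy_thread`, `toy_desc`, `toy_send_msg`, `toy_poll`, and `toy_desc_close` are hypothetical stand-ins, with `toy_send_msg` playing the role that a call such as spdk_thread_send_msg() would play in a real implementation. The close function either runs immediately (when called on the owning thread) or forwards itself to the owner's message queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MSGS 8 /* assumed queue capacity for this toy model */

typedef void (*msg_fn)(void *ctx);

struct msg {
    msg_fn fn;
    void *ctx;
};

/* Toy stand-in for a thread: an id plus a pending message queue. */
struct toy_thread {
    int id;
    struct msg queue[MAX_MSGS];
    size_t count;
};

/* Toy stand-in for a descriptor that remembers which thread opened it. */
struct toy_desc {
    struct toy_thread *owner;
    bool closed;
};

/* The thread that is "currently running" in this simulation. */
static struct toy_thread *g_current;

/* Queue fn(ctx) to run later on thread t. */
static void
toy_send_msg(struct toy_thread *t, msg_fn fn, void *ctx)
{
    assert(t->count < MAX_MSGS);
    t->queue[t->count].fn = fn;
    t->queue[t->count].ctx = ctx;
    t->count++;
}

/* Run every message queued on thread t, as if t were polled. */
static void
toy_poll(struct toy_thread *t)
{
    struct toy_thread *prev = g_current;

    g_current = t;
    for (size_t i = 0; i < t->count; i++) {
        t->queue[i].fn(t->queue[i].ctx);
    }
    t->count = 0;
    g_current = prev;
}

/* The actual close; must only ever run on the owning thread. */
static void
toy_desc_close_on_owner(void *ctx)
{
    struct toy_desc *desc = ctx;

    assert(g_current == desc->owner);
    desc->closed = true;
}

/* Callable from any thread: closes directly or forwards to the owner. */
static void
toy_desc_close(struct toy_desc *desc)
{
    if (g_current == desc->owner) {
        toy_desc_close_on_owner(desc);
    } else {
        toy_send_msg(desc->owner, toy_desc_close_on_owner, desc);
    }
}
```

One consequence visible in the model: when the call is forwarded, the descriptor is not yet closed when `toy_desc_close()` returns, so a caller would have to treat the close as asynchronous.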
D.
1 year, 11 months