Hi, there!
I have a question about how read-write ordering is ensured in SPDK's Blobstore. As I
understand it, Blobstore offers a filesystem-like interface, which I assumed would
guarantee that all writes are seen by subsequent reads, even if those reads are
submitted before the write completes.
However, when I went through the code, I found that Blobstore seems to send requests
directly to the bdev layer, which asks the NVMe driver to issue them to the SSD. AFAIK, the
NVMe controller doesn't guarantee command ordering (NVMe spec says, "if a Read is
submitted for LBA x and there is a Write also submitted for LBA x, there is no guarantee
of the order of completion for those commands, the Read may finish first or the Write may
finish first").
How does Blobstore ensure that all writes are seen by subsequent reads? If Blobstore
doesn't provide such a guarantee, what is the usual way to ensure the ordering?
Thanks!
Regards,
Jinhao Fan
University of Science and Technology of China
Hi Jinhao,
POSIX filesystems do not provide guarantees on reads submitted before an overlapping write
has completed. If you want to guarantee the new data is returned, you must submit the read
operation after the write has completed.
The SPDK blobstore behaves similarly. If you submit a write for block X, and then submit a
read for block X before that write has completed, your read may return the old data or the
new data. If your application has ordering requirements, it must wait until the write is
completed before submitting the read operation.
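As a rough illustration of that pattern, here is a minimal sketch in C. It does not use SPDK at all: `mock_dev`, `mock_submit`, `mock_poll` and the two demo functions are hypothetical names for a tiny in-memory stand-in that deliberately completes queued requests out of submission order. Only the shape of the completion callback, `(void *cb_arg, int err)`, is modeled on the callbacks taken by `spdk_blob_io_write` and `spdk_blob_io_read`. The point it tries to show is that the read is submitted from inside the write's completion callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an asynchronous block device; this is NOT SPDK code. */
typedef void (*io_done_fn)(void *cb_arg, int err);

enum io_kind { IO_READ, IO_WRITE };

struct io_req {
    enum io_kind kind;
    uint8_t *buf;        /* read destination or write source */
    io_done_fn cb_fn;    /* completion callback, may be NULL */
    void *cb_arg;
};

#define MAX_PENDING 8

struct mock_dev {
    uint8_t block;                       /* the single "block X" */
    struct io_req pending[MAX_PENDING];  /* submitted but not yet completed */
    size_t npending;
};

/* Queue a request. Nothing touches the block until mock_poll() runs. */
static int mock_submit(struct mock_dev *dev, enum io_kind kind, uint8_t *buf,
                       io_done_fn cb_fn, void *cb_arg)
{
    if (dev->npending == MAX_PENDING) {
        return -1;
    }
    dev->pending[dev->npending++] = (struct io_req){ kind, buf, cb_fn, cb_arg };
    return 0;
}

/* Complete pending requests newest-first, mimicking a device that may reorder. */
static void mock_poll(struct mock_dev *dev)
{
    while (dev->npending > 0) {
        struct io_req req = dev->pending[--dev->npending];
        if (req.kind == IO_WRITE) {
            dev->block = *req.buf;
        } else {
            *req.buf = dev->block;
        }
        if (req.cb_fn != NULL) {
            req.cb_fn(req.cb_arg, 0);
        }
    }
}

struct demo_ctx {
    struct mock_dev *dev;
    uint8_t write_buf;
    uint8_t read_buf;
};

/* Write completion: only now is it safe to submit the dependent read. */
static void write_done_cb(void *cb_arg, int err)
{
    struct demo_ctx *ctx = cb_arg;
    int rc;

    assert(err == 0);
    rc = mock_submit(ctx->dev, IO_READ, &ctx->read_buf, NULL, NULL);
    assert(rc == 0);
}

/* Read submitted while the write is still in flight: may see the old data. */
static uint8_t read_without_waiting(uint8_t old_val, uint8_t new_val)
{
    struct mock_dev dev = { .block = old_val };
    struct demo_ctx ctx = { .dev = &dev, .write_buf = new_val };

    mock_submit(&dev, IO_WRITE, &ctx.write_buf, NULL, NULL);
    mock_submit(&dev, IO_READ, &ctx.read_buf, NULL, NULL);
    mock_poll(&dev);
    return ctx.read_buf;
}

/* Read submitted from the write's completion callback: sees the new data. */
static uint8_t read_after_write_completes(uint8_t old_val, uint8_t new_val)
{
    struct mock_dev dev = { .block = old_val };
    struct demo_ctx ctx = { .dev = &dev, .write_buf = new_val };

    mock_submit(&dev, IO_WRITE, &ctx.write_buf, write_done_cb, &ctx);
    mock_poll(&dev);
    return ctx.read_buf;
}
```

With this mock, `read_without_waiting()` happens to return the old value because the mock always completes the newest request first; a real device could return either value. `read_after_write_completes()` returns the new value regardless of completion order, because the read does not exist until the write has completed.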
Regards,
-Jim
In reply to the message from fandahao17(a)mail.ustc.edu.cn sent on 2/9/21, 3:10 AM.
_______________________________________________
SPDK mailing list -- spdk(a)lists.01.org
To unsubscribe send an email to spdk-leave(a)lists.01.org