On Sun, Oct 28, 2018 at 7:19 AM Vincent <cockroach1136(a)gmail.com> wrote:
Hello all,
Recently we have been trying out the disk hot-remove feature of SPDK.
We keep a counter of the I/Os sent out on each I/O channel.
Roughly, the hot-remove procedure in my code is:
(1) When we receive the disk hot-remove callback from SPDK, we stop sending
I/O.
(2) Because we count the I/Os sent out on each I/O channel, we wait for all
of them to call back (complete).
(3) Close the I/O channel.
(4) Close the bdev desc.
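The four steps above can be sketched in plain C as a small model. This is only an illustration under stated assumptions: `chan_ctx`, `on_hot_remove`, `submit_io`, `on_io_complete` and `try_teardown` are hypothetical names, not SPDK APIs, and the two "close" steps are represented by flags where the real code would call spdk_put_io_channel() and spdk_bdev_close().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-channel bookkeeping; none of these names are SPDK APIs. */
struct chan_ctx {
    unsigned outstanding;  /* I/Os submitted but not yet completed */
    bool removed;          /* set once the hot-remove callback has fired */
    bool channel_open;     /* stands in for the I/O channel reference */
    bool desc_open;        /* stands in for the bdev descriptor */
};

/* Step (1): the hot-remove callback only flags the channel. */
static void on_hot_remove(struct chan_ctx *c)
{
    c->removed = true;
}

/* New I/O is refused once removal has been signalled. */
static bool submit_io(struct chan_ctx *c)
{
    if (c->removed) {
        return false;
    }
    c->outstanding++;
    return true;
}

/* Step (2): every completion callback decrements the counter. */
static void on_io_complete(struct chan_ctx *c)
{
    assert(c->outstanding > 0);
    c->outstanding--;
}

/* Steps (3) and (4): close the channel, then the descriptor, once drained. */
static bool try_teardown(struct chan_ctx *c)
{
    if (!c->removed || c->outstanding != 0) {
        return false;
    }
    c->channel_open = false;  /* real code: spdk_put_io_channel() */
    c->desc_open = false;     /* real code: spdk_bdev_close() */
    return true;
}
```

In this model try_teardown() is a separate function rather than part of on_io_complete(), so the caller decides from which context the close actually happens.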
But sometimes we crash (the crash rate is about 1 in 10); the call stack is
attached.
The crash is in the function nvme_free_request:
void
nvme_free_request(struct nvme_request *req)
{
    assert(req != NULL);
    assert(req->num_children == 0);
    assert(req->qpair != NULL);
    STAILQ_INSERT_HEAD(&req->qpair->free_req, req, stailq); /* <------------- this line */
}
Can anyone give me a hint?
It looks like nvme_pcie_qpair is not reference counted, and thus the nvme
completion path below does not account for the possibility that the user
callback fired by nvme_complete_request() can close the I/O channel (which,
for the nvme bdev, will destroy the underlying qpair) before the associated
nvme request is freed. If this happens, nvme_free_request() will be entered
after the underlying qpair has been destroyed, potentially crashing the app.
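To make the ordering concrete, here is a toy model in plain C of a completion path that runs the user callback first and frees the request afterwards. It is a sketch under stated assumptions, not SPDK code: `toy_qpair`, `toy_req` and every function below are hypothetical stand-ins, and "destroying" the qpair is modelled by clearing a flag so that the bad case can be observed safely instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; these are NOT the SPDK structures. */
struct toy_qpair {
    bool alive;          /* false once the qpair has been "destroyed" */
    bool close_pending;  /* a callback asked for teardown later */
    int free_reqs;       /* stands in for the free_req list */
};

struct toy_req {
    struct toy_qpair *qpair;
    void (*cb)(struct toy_req *req);  /* user completion callback */
};

/* Same shape as nvme_free_request(): it dereferences req->qpair. */
static bool toy_free_request(struct toy_req *req)
{
    if (!req->qpair->alive) {
        return false;  /* in real code this would be a use-after-free */
    }
    req->qpair->free_reqs++;
    return true;
}

/* Completion path: user callback first, then the request is freed. */
static bool toy_complete(struct toy_req *req)
{
    req->cb(req);
    return toy_free_request(req);
}

/* Unsafe: tears the qpair down while its request is still in flight. */
static void cb_close_now(struct toy_req *req)
{
    req->qpair->alive = false;
}

/* Safer: only records that a close is wanted. */
static void cb_close_later(struct toy_req *req)
{
    req->qpair->close_pending = true;
}

/* Meant to run only after the completion path has returned. */
static void toy_run_deferred(struct toy_qpair *q)
{
    if (q->close_pending) {
        q->close_pending = false;
        q->alive = false;
    }
}
```

With cb_close_now() the free step sees a dead qpair; with cb_close_later() followed by toy_run_deferred() the request is returned first and the qpair goes away afterwards. In an SPDK application the deferral might be done by sending a message to the owning thread, for example with spdk_thread_send_msg(); whether that alone avoids the crash reported here is an assumption, not something confirmed in this thread.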
Regards,
Andrey
static void
nvme_pcie_qpair_complete_tracker(struct spdk_nvme_qpair *qpair, struct nvme_tracker *tr,
                                 struct spdk_nvme_cpl *cpl, bool print_on_error)
{
[snip]
    if (retry) {
        req->retries++;
        nvme_pcie_qpair_submit_tracker(qpair, tr);
    } else {
        if (was_active) {
            /* Only check admin requests from different processes. */
            if (nvme_qpair_is_admin_queue(qpair) && req->pid != getpid()) {
                req_from_current_proc = false;
                nvme_pcie_qpair_insert_pending_admin_request(qpair, req, cpl);
            } else {
                nvme_complete_request(req, cpl);
            }
        }
        if (req_from_current_proc == true) {
            nvme_free_request(req);
        }
Any suggestion is appreciated
Thank you in advance
--------------------------------------------------------------------------------------------------------------------------
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./smistor_iscsi_tgt -c /usr/smistor/config/smistor_iscsi_perf.conf'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000414b87 in nvme_free_request (req=req@entry=0x7fe4edbf3100) at nvme.c:227
227 nvme.c: No such file or directory.
Missing separate debuginfos, use: debuginfo-install
bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.170-4.el7.x86_64
elfutils-libs-0.170-4.el7.x86_64 glibc-2.17-222.el7.x86_64
libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64
libcap-2.22-9.el7.x86_64 libgcc-4.8.5-28.el7_5.1.x86_64
libuuid-2.23.2-52.el7.x86_64 lz4-1.7.5-2.el7.x86_64
numactl-libs-2.0.9-7.el7.x86_64 openssl-libs-1.0.2k-12.el7.x86_64
systemd-libs-219-57.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64
zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x0000000000414b87 in nvme_free_request (req=req@entry=0x7fe4edbf3100) at nvme.c:227
#1 0x0000000000412056 in nvme_pcie_qpair_complete_tracker (qpair=qpair@entry=0x7fe4c5376ef8, tr=0x7fe4ca8ad000, cpl=cpl@entry=0x7fe4c8e0a840, print_on_error=print_on_error@entry=true) at nvme_pcie.c:1170
#2 0x0000000000413be0 in nvme_pcie_qpair_process_completions (qpair=qpair@entry=0x7fe4c5376ef8, max_completions=64, max_completions@entry=0) at nvme_pcie.c:2013
#3 0x0000000000415d7b in nvme_transport_qpair_process_completions (qpair=qpair@entry=0x7fe4c5376ef8, max_completions=max_completions@entry=0) at nvme_transport.c:201
#4 0x000000000041449d in spdk_nvme_qpair_process_completions (qpair=0x7fe4c5376ef8, max_completions=max_completions@entry=0) at nvme_qpair.c:368
#5 0x000000000040a289 in bdev_nvme_poll (arg=0x7fe08c0012a0) at bdev_nvme.c:208
#6 0x0000000000499baa in _spdk_reactor_run (arg=0x6081dc0) at reactor.c:452
#7 0x00000000004a4284 in eal_thread_loop ()
#8 0x00007fe8fb276e25 in start_thread () from /lib64/libpthread.so.0
#9 0x00007fe8fafa0bad in clone () from /lib64/libc.so.6
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk