I'm assuming that if you ran this same test using the kernel NVMe-oF target, you
would also get a kernel panic?
Filing a GitHub issue is an interesting idea. This isn't something SPDK can fix, so
we would probably immediately close the issue, but then it would be saved for posterity.
Maybe we could create a new label type to indicate these types of issues.
Alternatively, we could create some other document in the SPDK repository to track
and record these types of issues.
I'd be curious to hear others' opinions on this.
-Jim
On 1/9/19, 3:51 PM, "SPDK on behalf of Shahar Salzman"
<spdk-bounces(a)lists.01.org on behalf of shahar.salzman(a)kaminario.com> wrote:
Should I open a github issue so that people are aware of this?
Since it succeeds on CentOS 7.6, the issue itself is solved, so there's no real need
to open a bug against the Red Hat/Linux kernel.
________________________________
From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Howell, Seth
<seth.howell(a)intel.com>
Sent: Monday, January 7, 2019 5:11 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Kernel panic on redhat 7.5 host
Hi Shahar,
Thank you for bringing this up. We currently don't have any CentOS 7.5 machines in
our build pool. We do have CentOS 7.4 machines and haven't observed this behavior
there, although we admittedly aren't doing a lot of per-patch NVMe-oF testing on
those machines. I will spin up CentOS 7.4 and 7.5 machines this week to reproduce
this issue.
Have you created a GitHub issue yet?
Thanks,
Seth
-----Original Message-----
From: SPDK [mailto:spdk-bounces@lists.01.org] On Behalf Of Shahar Salzman
Sent: Thursday, January 3, 2019 6:26 AM
To: Shahar Salzman <shahar.salzman(a)kaminario.com>; Storage Performance
Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Kernel panic on redhat 7.5 host
BTW, this does not happen on CentOS 7.6.
________________________________
From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Shahar Salzman
<shahar.salzman(a)kaminario.com>
Sent: Wednesday, January 2, 2019 9:40 PM
To: Storage Performance Development Kit
Subject: [SPDK] Kernel panic on redhat 7.5 host
Hi,
We have been playing around with SPDK access control, and it seems that when a RHEL
7.5 host attempts to connect to a subsystem and is denied access, it panics with the
backtrace below.
This doesn't happen on my newer-kernel system, but we were hoping to qualify RHEL
7.5.
I am now going down the rabbit hole of debugging this kernel panic. Has anyone else
encountered a panic like this?
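For context, the access-denial scenario can be set up roughly as follows. This is a
sketch, not our exact configuration: the subsystem NQN, address, and port are
placeholders, it assumes a running SPDK nvmf_tgt with scripts/rpc.py available, and
the rpc.py method names may differ between SPDK releases.

```shell
# On the target: create an RDMA transport and a subsystem that does NOT
# allow arbitrary hosts (no --allow-any-host, and no nvmf_subsystem_add_host
# call, so every initiator host NQN is denied at connect time).
# NQN, address, and port below are placeholders.
scripts/rpc.py nvmf_create_transport -t RDMA
scripts/rpc.py nvmf_subsystem_create nqn.2019-01.io.spdk:cnode1
scripts/rpc.py nvmf_subsystem_add_listener nqn.2019-01.io.spdk:cnode1 \
    -t rdma -a 192.168.1.10 -s 4420

# On the RHEL 7.5 host: the connect attempt is rejected by the target,
# and the initiator then panics in rdma_disconnect() while tearing down
# the queue during reconnect handling (see the trace below).
nvme connect -t rdma -n nqn.2019-01.io.spdk:cnode1 -a 192.168.1.10 -s 4420
```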
[714117.730194] BUG: unable to handle kernel NULL pointer dereference at 00000000000002b0
[714117.730556] IP: [<ffffffffc086a69d>] rdma_disconnect+0xd/0xa0 [rdma_cm]
[714117.730827] PGD 0
[714117.731067] Oops: 0000 [#1] SMP
[714117.731337] Modules linked in: dm_queue_length nvme_rdma nvme_fabrics nvme_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ucm rpcrdma rdma_ucm ib_uverbs ib_iser ib_umad rdma_cm iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_core vmw_vsock_vmci_transport sb_edac coretemp vmw_balloon iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd mlx5_core joydev pcspkr mlxfw devlink ptp pps_core sg nfit libnvdimm shpchp vmw_vmci i2c_piix4 dm_multipath dm_mod ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
[714117.733246] ttm ahci drm sd_mod crc_t10dif ata_piix libahci crct10dif_generic libata crct10dif_pclmul crct10dif_common serio_raw crc32c_intel vmw_pvscsi vmxnet3 i2c_core floppy
[714117.734185] CPU: 1 PID: 14001 Comm: kworker/u16:3 Kdump: loaded Not tainted 3.10.0-862.el7.x86_64 #1
[714117.734661] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
[714117.735165] Workqueue: nvme-wq nvme_rdma_reconnect_ctrl_work [nvme_rdma]
[714117.735692] task: ffff9d383b1d4f10 ti: ffff9d39b5d30000 task.ti: ffff9d39b5d30000
[714117.736273] RIP: 0010:[<ffffffffc086a69d>] [<ffffffffc086a69d>] rdma_disconnect+0xd/0xa0 [rdma_cm]
[714117.736872] RSP: 0018:ffff9d39b5d33dd0 EFLAGS: 00010246
[714117.737450] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000200
[714117.738071] RDX: 0000000000000010 RSI: 86c0000000000000 RDI: 0000000000000000
[714117.738677] RBP: ffff9d39b5d33dd8 R08: ffff9d39a924b0e0 R09: 8b18f8a6e5c4b0d8
[714117.739287] R10: 8b18f8a6e5c4b0d8 R11: 000289c6869521c0 R12: ffff9d39a924b0d8
[714117.739911] R13: ffff9d39a924b000 R14: ffff9d37b694df00 R15: 0000000000000200
[714117.740548] FS:  0000000000000000(0000) GS:ffff9d39bfc40000(0000) knlGS:0000000000000000
[714117.741196] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[714117.741850] CR2: 00000000000002b0 CR3: 00000001f7026000 CR4: 00000000001607e0
[714117.742576] Call Trace:
[714117.743253]  [<ffffffffc087bbe2>] nvme_rdma_stop_and_free_queue+0x22/0x40 [nvme_rdma]
[714117.743956]  [<ffffffffc087be84>] nvme_rdma_reconnect_ctrl_work+0x54/0x1b0 [nvme_rdma]
[714117.744676]  [<ffffffff900b2dff>] process_one_work+0x17f/0x440
[714117.745395]  [<ffffffff900b3ac6>] worker_thread+0x126/0x3c0
[714117.746120]  [<ffffffff900b39a0>] ? manage_workers.isra.24+0x2a0/0x2a0
[714117.746856]  [<ffffffff900bae31>] kthread+0xd1/0xe0
[714117.747600]  [<ffffffff900bad60>] ? insert_kthread_work+0x40/0x40
[714117.748352]  [<ffffffff9071f637>] ret_from_fork_nospec_begin+0x21/0x21
[714117.749112]  [<ffffffff900bad60>] ? insert_kthread_work+0x40/0x40
Shahar
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk