From: Harris, James R [mailto:james.r.harris@intel.com]
Sent: Friday, October 06, 2017 2:32 PM
To: Storage Performance Development Kit <spdk@lists.01.org>
Cc: Victor Banh <victorb@mellanox.com>
Subject: Re: [SPDK] Buffer I/O error on bigger block size running fio

 

(cc Victor)

 

From: James Harris <james.r.harris@intel.com>
Date: Thursday, October 5, 2017 at 1:59 PM
To: Storage Performance Development Kit <spdk@lists.01.org>
Subject: Re: [SPDK] Buffer I/O error on bigger block size running fio

 

Hi Victor,

 

Could you provide a few more details?  This will help the list to provide some ideas.

 

1)     On the client, are you using the SPDK NVMe-oF initiator or the kernel initiator?

 

Kernel initiator. These commands are run on the client server:

 

modprobe mlx5_ib

modprobe nvme-rdma

nvme discover -t rdma -a 192.168.10.11 -s 4420

nvme connect -t rdma -n nqn.2016-06.io.spdk:nvme-subsystem-1  -a 192.168.10.11 -s 4420
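
A quick sanity check on the client (a sketch, assuming the namespace enumerates as /dev/nvme1n1 as in the fio command below) is to look at the block-layer I/O size limits the kernel initiator negotiated. max_hw_sectors_kb is reported in KiB, so a value below 512 would mean the 512k fio writes are split before they ever reach the target:

nvme list

cat /sys/block/nvme1n1/queue/max_hw_sectors_kb

cat /sys/block/nvme1n1/queue/max_sectors_kb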

 

 

2)     Can you provide the fio configuration file or command line?  Just so we can have more specifics on “bigger block size”.

 

fio --bs=512k --numjobs=4 --iodepth=16 --loops=1 --ioengine=libaio --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1 --norandommap --time_based --runtime=60 --filename=/dev/nvme1n1  --name=read-phase --rw=randwrite
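
To narrow down where "bigger block size" starts failing, the same line can be rerun with only --bs changed (for example 4k, 128k, 256k, 512k); everything else below is copied from the command above:

fio --bs=128k --numjobs=4 --iodepth=16 --loops=1 --ioengine=libaio --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1 --norandommap --time_based --runtime=60 --filename=/dev/nvme1n1 --name=read-phase --rw=randwrite

If 4k and 128k pass but 256k or 512k trigger the I/O errors, that points at a per-I/O size limit rather than queue depth or connection setup.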

 

3)     Any details on the HW setup – specifically details on the RDMA NIC (or if you’re using SW RoCE).

 

nvmf.conf on the target server:

 

[Global]

  Comment "Global section"

  ReactorMask 0xff00

 

[Rpc]

  Enable No

  Listen 127.0.0.1

 

[Nvmf]

  MaxQueuesPerSession 8

  MaxQueueDepth 128

 

[Subsystem1]

  NQN nqn.2016-06.io.spdk:nvme-subsystem-1

  Core 9

  Mode Direct

  Listen RDMA 192.168.10.11:4420

  NVMe 0000:82:00.0

  SN S2PMNAAH400039
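
One thing worth checking in this config: the legacy SPDK nvmf.conf has, in the releases I have seen, a MaxIOSize option in the [Nvmf] section that defaults to 128 KiB, which is smaller than the 512k writes fio is issuing. The exact option name should be verified against the etc/spdk/nvmf.conf.in shipped with 17.07.1; as a sketch, the section might be extended like this:

[Nvmf]

  MaxQueuesPerSession 8

  MaxQueueDepth 128

  # Assumed option name; confirm it exists in this SPDK release before relying on it

  MaxIOSize 524288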

 

 

It is an RDMA NIC (ConnectX-5). The CPU is an Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz:

NUMA node0 CPU(s):     0-7

NUMA node1 CPU(s):     8-15
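
Since ReactorMask 0xff00 and Core 9 pin the target to cores 8-15 (NUMA node1), it may also be worth confirming that the NVMe SSD at 0000:82:00.0 and the ConnectX-5 sit on the same node. A sketch using standard sysfs and lspci; the ConnectX-5 address has to be read off the lspci output, it is not known from this thread:

cat /sys/bus/pci/devices/0000:82:00.0/numa_node

lspci -D | grep -i mellanox

cat /sys/bus/pci/devices/<ConnectX-5 PCI address from lspci -D>/numa_node

A value of 1 from both numa_node reads would confirm the reactors, NIC, and SSD share node1.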

 

 

 

 

Thanks,

 

-Jim

 

 

From: SPDK <spdk-bounces@lists.01.org> on behalf of Victor Banh <victorb@mellanox.com>
Reply-To: Storage Performance Development Kit <spdk@lists.01.org>
Date: Thursday, October 5, 2017 at 11:26 AM
To: "spdk@lists.01.org" <spdk@lists.01.org>
Subject: [SPDK] Buffer I/O error on bigger block size running fio

 

Hi

I have an SPDK NVMe-oF target and keep getting errors with bigger block sizes when running fio randwrite tests.

I am using Ubuntu 16.04 with kernel version 4.12.0-041200-generic on target and client.

DPDK is 17.08 and SPDK is 17.07.1.

Thanks

Victor

 

 

[46905.233553] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79750

[48285.159186] blk_update_request: I/O error, dev nvme1n1, sector 2507351968

[48285.159207] blk_update_request: I/O error, dev nvme1n1, sector 1301294496

[48285.159226] blk_update_request: I/O error, dev nvme1n1, sector 1947371168

[48285.159239] blk_update_request: I/O error, dev nvme1n1, sector 1891797568

[48285.159252] blk_update_request: I/O error, dev nvme1n1, sector 10833824

[48285.159265] blk_update_request: I/O error, dev nvme1n1, sector 614937152

[48285.159277] blk_update_request: I/O error, dev nvme1n1, sector 1872305088

[48285.159290] blk_update_request: I/O error, dev nvme1n1, sector 1504491040

[48285.159299] blk_update_request: I/O error, dev nvme1n1, sector 1182136128

[48285.159308] blk_update_request: I/O error, dev nvme1n1, sector 1662985792

[48285.191185] nvme nvme1: Reconnecting in 10 seconds...

[48285.191254] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191291] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191305] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191314] ldm_validate_partition_table(): Disk read failed.

[48285.191320] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191327] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191335] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191342] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191347] Dev nvme1n1: unable to read RDB block 0

[48285.191353] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191360] Buffer I/O error on dev nvme1n1, logical block 0, async page read

[48285.191375] Buffer I/O error on dev nvme1n1, logical block 3, async page read

[48285.191389]  nvme1n1: unable to read partition table

[48285.223197] nvme1n1: detected capacity change from 1600321314816 to 0

[48289.623192] nvme1n1: detected capacity change from 0 to -65647705833078784

[48289.623411] ldm_validate_partition_table(): Disk read failed.

[48289.623447] Dev nvme1n1: unable to read RDB block 0

[48289.623486]  nvme1n1: unable to read partition table

[48289.643305] ldm_validate_partition_table(): Disk read failed.

[48289.643328] Dev nvme1n1: unable to read RDB block 0

[48289.643373]  nvme1n1: unable to read partition table