On Fri, Aug 04, 2017 at 11:01:08AM -0700, Dan Williams wrote:
[ adding Dave, who is working on a blk-mq + dma offload version of
the pmem driver ]
On Fri, Aug 4, 2017 at 1:17 AM, Minchan Kim <minchan(a)kernel.org> wrote:
> On Fri, Aug 04, 2017 at 12:54:41PM +0900, Minchan Kim wrote:
>> Thanks for the testing. Is your test result within the noise level?
>> I cannot understand why PMEM doesn't gain much while BTT wins (8%).
>> I guess that without rw_page, the BTT test had more chances to wait on bio
>> allocation, and both mine and the rw_page testing reduced that significantly.
>> However, with pmem and no rw_page, there weren't many cases of waiting on bio
>> allocation because the device is so fast, so the number comes purely from the
>> number of instructions executed. At a quick glance of bio init/submit, it's not
>> so indeed, I understand where the 12% enhancement comes from, but I'm not sure
>> it's really a big difference in real practice at the cost of maintenance
> I tested pmbench 10 times on my local machine (4 cores) with zram-swap.
> On my machine, on-stack bio is even faster than rw_page. Unbelievable.
> I guess it's really hard to get a stable result under severe memory pressure.
> It would be a result within the noise level (see the stddev below).
> So, I think it's hard to conclude that rw_page is far faster than on-stack bio.
> rw_page
> avg 5.54us
> stddev 8.89%
> max 6.02us
> min 4.20us
> onstack bio
> avg 5.27us
> stddev 13.03%
> max 5.96us
> min 3.55us
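
To make the "within noise level" reading concrete, here is a rough sketch that compares the gap between the two averages against each run's standard deviation. It takes the first set of figures as the rw_page run (an assumption from context), and the rule "gap smaller than the smaller stddev" is only a crude heuristic chosen for illustration, not a proper significance test. The names RunSummary and within_noise are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    avg_us: float       # mean latency in microseconds
    stddev_pct: float   # standard deviation as a percentage of the mean

    @property
    def stddev_us(self) -> float:
        # Convert the relative stddev into microseconds.
        return self.avg_us * self.stddev_pct / 100.0


def within_noise(a: RunSummary, b: RunSummary) -> bool:
    # Crude heuristic (an assumption, not a statistical test): the gap
    # between the means is "noise" if it is smaller than the smaller of
    # the two standard deviations.
    gap = abs(a.avg_us - b.avg_us)
    return gap < min(a.stddev_us, b.stddev_us)


# Figures quoted above; the rw_page label on the first set is assumed.
rw_page = RunSummary(avg_us=5.54, stddev_pct=8.89)
onstack_bio = RunSummary(avg_us=5.27, stddev_pct=13.03)

print(within_noise(rw_page, onstack_bio))
```

With the quoted numbers the gap is 0.27us, while the smaller standard deviation is about 0.49us, so under this heuristic the two results are not distinguishable.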
The maintenance burden of having alternative submission paths is
significant, especially as we consider the pmem driver using more
services of the core block layer. Ideally, I'd want to complete the
rw_page removal work before we look at the blk-mq + dma offload
The change to introduce BDI_CAP_SYNC is interesting because we might
have use for switching between dma offload and cpu copy based on
whether the I/O is synchronous or otherwise hinted to be a low latency
request. Right now the dma offload patches are using "bio_segments() >
1" as the gate for selecting offload vs cpu copy which seem
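
As a rough illustration of the gating idea, the sketch below models the decision in plain Python rather than kernel code. The first function mirrors the "bio_segments() > 1" gate described above; the second is a purely hypothetical variant in which a synchronous or low-latency hint forces the cpu copy path. The Request type, its fields, and both function names are assumptions made for this example and do not come from the dma offload patches.

```python
from dataclasses import dataclass


@dataclass
class Request:
    segments: int     # stands in for the value bio_segments() would return
    sync_hint: bool   # hypothetical flag: I/O is synchronous or low latency


def use_dma_offload_current(req: Request) -> bool:
    # Gate as described in the thread: offload only multi-segment I/O.
    return req.segments > 1


def use_dma_offload_with_hint(req: Request) -> bool:
    # Hypothetical variant: a sync / low-latency hint always selects the
    # cpu copy path; otherwise fall back to the segment-count gate.
    if req.sync_hint:
        return False
    return req.segments > 1


# A multi-segment synchronous request: the two gates disagree.
req = Request(segments=4, sync_hint=True)
print(use_dma_offload_current(req), use_dma_offload_with_hint(req))
```

The point of the second gate is only to show where a per-request hint could plug in; whether the segment-count check should remain as the fallback is an open design question.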
Okay, so based on the feedback above and from Jens, it sounds like we want
to go forward with removing the rw_page() interface, and instead optimize the
regular I/O path via on-stack bios and dma offload, correct?
If so, I'll prepare patches that fully remove the rw_page() code, and let
Minchan and Dave work on their optimizations.