On 12/9/20 10:59 AM, Joao Martins wrote:
On 12/8/20 7:29 PM, Jason Gunthorpe wrote:
> On Tue, Dec 08, 2020 at 05:29:00PM +0000, Joao Martins wrote:
>
>> static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int
dirty)
>> {
>> + bool make_dirty = umem->writable && dirty;
>> + struct page **page_list = NULL;
>> struct sg_page_iter sg_iter;
>> + unsigned long nr = 0;
>> struct page *page;
>>
>> + page_list = (struct page **) __get_free_page(GFP_KERNEL);
>
> Gah, no, don't do it like this!
>
> Instead something like:
>
> for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i)
> unpin_use_pages_range_dirty_lock(sg_page(sg), sg->length/PAGE_SIZE,
> umem->writable && dirty);
>
> And have the mm implementation split the contiguous range of pages into
> pairs of (compound head, ntails) with a bit of maths.
>
Got it :)
I was trying to avoid another exported symbol.
Albeit upon your suggestion below, it doesn't justify the efficiency/clearness lost.
This more efficient suggestion of yours leads to a further speed up from:
1073 rounds in 5.004 sec: 4663.622 usec / round (hugetlbfs)
to
1370 rounds in 5.003 sec: 3651.562 usec / round (hugetlbfs)
Right after I come back from holidays I will follow up with this series in separate.
Joao