[RFC PATCH v2 0/7] xfs: reflink & dedupe for fsdax (read/write path).
by Shiyang Ruan
This patchset aims to make reflink and dedupe work correctly in XFS under
fsdax mode. It currently covers only the read/write path; the mmap path
still has problems, such as the page->mapping and page->index issue.
It is based on Goldwyn's "v4 Btrfs dax support" patchset and the latest
iomap. I borrowed some related patches and made a few fixes so that it
basically works.
For dax framework:
1. adapt to the latest change in iomap (two iomaps).
For XFS:
1. distinguish dax write/zero from normal write/zero.
2. remap extents after COW.
3. add file contents comparison function based on dax framework.
4. use xfs_break_layouts() instead of break_layout to support dax.
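To illustrate why a copy-on-write write needs to "copy data before write": when a write covers only part of a newly allocated block, the bytes it does not cover have to be carried over from the shared source block. The following is a minimal user-space sketch of that idea, not the code from this series. The name copy_edges(), the BLOCK_SIZE value, and the use of plain memory buffers in place of source and destination extents are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4096u /* assumed block size for this sketch */

/*
 * Hypothetical helper (not the kernel's dax_copy_edges()).
 *
 * `src` and `dst` both point at the start of the block-aligned range
 * that contains the write [pos, pos + len). Copy the head and tail
 * bytes that the write will NOT overwrite, so the destination ends up
 * with complete blocks once the write itself lands.
 */
static void copy_edges(const unsigned char *src, unsigned char *dst,
		       size_t pos, size_t len)
{
	size_t head = pos % BLOCK_SIZE; /* bytes in front of the write */
	size_t end = head + len;        /* write end, relative to range start */
	size_t range = (end + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;

	if (len == 0)
		return;

	/* Unaligned start: preserve the bytes before the write. */
	if (head)
		memcpy(dst, src, head);

	/* Unaligned end: preserve the bytes after the write. */
	if (end < range)
		memcpy(dst + end, src + end, range - end);
}
```

For a 5000-byte write at offset 100, this copies bytes 0-99 and 5100-8191 of the two-block range and leaves the middle for the write itself.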
Goldwyn Rodrigues (3):
dax: replace mmap entry in case of CoW
fs: dedup file range to use a compare function
dax: memcpy before zeroing range
Shiyang Ruan (4):
dax: Introduce dax_copy_edges() for COW.
dax: copy data before write.
xfs: handle copy-on-write in fsdax write() path.
xfs: support dedupe for fsdax.
fs/btrfs/ioctl.c | 3 +-
fs/dax.c | 211 +++++++++++++++++++++++++++++++++++++----
fs/iomap/buffered-io.c | 8 +-
fs/ocfs2/file.c | 2 +-
fs/read_write.c | 11 ++-
fs/xfs/xfs_bmap_util.c | 6 +-
fs/xfs/xfs_file.c | 10 +-
fs/xfs/xfs_iomap.c | 3 +-
fs/xfs/xfs_iops.c | 11 ++-
fs/xfs/xfs_reflink.c | 79 ++++++++-------
include/linux/dax.h | 16 ++--
include/linux/fs.h | 9 +-
12 files changed, 291 insertions(+), 78 deletions(-)
--
2.23.0
2 years, 9 months
[PATCH v3] mm: Cleanup __put_devmap_managed_page() vs ->page_free()
by Dan Williams
After the removal of the device-public infrastructure there are only 2
->page_free() callbacks in the kernel. One of those is a device-private
callback in the nouveau driver, the other is a generic wakeup needed in
the DAX case. In the hopes that all ->page_free() callbacks can be
migrated to common core kernel functionality, move the device-private
specific actions in __put_devmap_managed_page() under the
is_device_private_page() conditional, including the ->page_free()
callback. For the other page types just open-code the generic wakeup.
Yes, the wakeup is only needed in the MEMORY_DEVICE_FSDAX case, but it
does no harm in the MEMORY_DEVICE_DEVDAX and MEMORY_DEVICE_PCI_P2PDMA
cases.
Cc: Jan Kara <jack(a)suse.cz>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Jérôme Glisse <jglisse(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Changes since v2:
- Drop 'else' after return. (Christoph)
drivers/nvdimm/pmem.c | 6 ----
mm/memremap.c | 80 +++++++++++++++++++++++++++----------------------
2 files changed, 44 insertions(+), 42 deletions(-)
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f9f76f6ba07b..21db1ce8c0ae 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -338,13 +338,7 @@ static void pmem_release_disk(void *__pmem)
put_disk(pmem->disk);
}
-static void pmem_pagemap_page_free(struct page *page)
-{
- wake_up_var(&page->_refcount);
-}
-
static const struct dev_pagemap_ops fsdax_pagemap_ops = {
- .page_free = pmem_pagemap_page_free,
.kill = pmem_pagemap_kill,
.cleanup = pmem_pagemap_cleanup,
};
diff --git a/mm/memremap.c b/mm/memremap.c
index 022e78e68ea0..e1678e575d9f 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -27,7 +27,8 @@ static void devmap_managed_enable_put(void)
static int devmap_managed_enable_get(struct dev_pagemap *pgmap)
{
- if (!pgmap->ops || !pgmap->ops->page_free) {
+ if (pgmap->type == MEMORY_DEVICE_PRIVATE &&
+ (!pgmap->ops || !pgmap->ops->page_free)) {
WARN(1, "Missing page_free method\n");
return -EINVAL;
}
@@ -444,44 +445,51 @@ void __put_devmap_managed_page(struct page *page)
{
int count = page_ref_dec_return(page);
- /*
- * If refcount is 1 then page is freed and refcount is stable as nobody
- * holds a reference on the page.
- */
- if (count == 1) {
- /* Clear Active bit in case of parallel mark_page_accessed */
- __ClearPageActive(page);
- __ClearPageWaiters(page);
+ /* still busy */
+ if (count > 1)
+ return;
- mem_cgroup_uncharge(page);
+ /* only triggered by the dev_pagemap shutdown path */
+ if (count == 0) {
+ __put_page(page);
+ return;
+ }
- /*
- * When a device_private page is freed, the page->mapping field
- * may still contain a (stale) mapping value. For example, the
- * lower bits of page->mapping may still identify the page as
- * an anonymous page. Ultimately, this entire field is just
- * stale and wrong, and it will cause errors if not cleared.
- * One example is:
- *
- * migrate_vma_pages()
- * migrate_vma_insert_page()
- * page_add_new_anon_rmap()
- * __page_set_anon_rmap()
- * ...checks page->mapping, via PageAnon(page) call,
- * and incorrectly concludes that the page is an
- * anonymous page. Therefore, it incorrectly,
- * silently fails to set up the new anon rmap.
- *
- * For other types of ZONE_DEVICE pages, migration is either
- * handled differently or not done at all, so there is no need
- * to clear page->mapping.
- */
- if (is_device_private_page(page))
- page->mapping = NULL;
+ /* notify page idle for dax */
+ if (!is_device_private_page(page)) {
+ wake_up_var(&page->_refcount);
+ return;
+ }
- page->pgmap->ops->page_free(page);
- } else if (!count)
- __put_page(page);
+ /* Clear Active bit in case of parallel mark_page_accessed */
+ __ClearPageActive(page);
+ __ClearPageWaiters(page);
+
+ mem_cgroup_uncharge(page);
+
+ /*
+ * When a device_private page is freed, the page->mapping field
+ * may still contain a (stale) mapping value. For example, the
+ * lower bits of page->mapping may still identify the page as an
+ * anonymous page. Ultimately, this entire field is just stale
+ * and wrong, and it will cause errors if not cleared. One
+ * example is:
+ *
+ * migrate_vma_pages()
+ * migrate_vma_insert_page()
+ * page_add_new_anon_rmap()
+ * __page_set_anon_rmap()
+ * ...checks page->mapping, via PageAnon(page) call,
+ * and incorrectly concludes that the page is an
+ * anonymous page. Therefore, it incorrectly,
+ * silently fails to set up the new anon rmap.
+ *
+ * For other types of ZONE_DEVICE pages, migration is either
+ * handled differently or not done at all, so there is no need
+ * to clear page->mapping.
+ */
+ page->mapping = NULL;
+ page->pgmap->ops->page_free(page);
}
EXPORT_SYMBOL(__put_devmap_managed_page);
#endif /* CONFIG_DEV_PAGEMAP_OPS */
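
The reorganized control flow in the v3 hunk above can be summarized as a four-way decision on the post-decrement reference count and the page type. The following is a user-space model of that decision only, written as a sketch; the enum, its values, and classify_put() are hypothetical names that do not exist in the kernel, and the comments map each outcome back to the action taken in the patch.

```c
#include <stdbool.h>

/* Hypothetical labels for the four outcomes in the v3 patch. */
enum put_action {
	PUT_STILL_BUSY,     /* count > 1: other references remain */
	PUT_FINAL_FREE,     /* count == 0: __put_page(), shutdown path */
	PUT_WAKE_WAITERS,   /* count == 1, not device-private: wake_up_var() */
	PUT_DEVICE_PRIVATE, /* count == 1, device-private: clear state and
			     * call ->page_free() */
};

/*
 * `count` models the value returned by page_ref_dec_return(), and
 * `device_private` models is_device_private_page().
 */
static enum put_action classify_put(int count, bool device_private)
{
	if (count > 1)
		return PUT_STILL_BUSY;
	if (count == 0)
		return PUT_FINAL_FREE;
	if (!device_private)
		return PUT_WAKE_WAITERS;
	return PUT_DEVICE_PRIVATE;
}
```

Written this way, the early returns make it visible that ->page_free() is reached only for device-private pages, which is why the fsdax pagemap ops no longer need to provide it.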
2 years, 9 months
[PATCH v2] mm: Cleanup __put_devmap_managed_page() vs ->page_free()
by Dan Williams
After the removal of the device-public infrastructure there are only 2
->page_free() callbacks in the kernel. One of those is a device-private
callback in the nouveau driver, the other is a generic wakeup needed in
the DAX case. In the hopes that all ->page_free() callbacks can be
migrated to common core kernel functionality, move the device-private
specific actions in __put_devmap_managed_page() under the
is_device_private_page() conditional, including the ->page_free()
callback. For the other page types just open-code the generic wakeup.
Yes, the wakeup is only needed in the MEMORY_DEVICE_FSDAX case, but it
does no harm in the MEMORY_DEVICE_DEVDAX and MEMORY_DEVICE_PCI_P2PDMA
cases.
Cc: Jan Kara <jack(a)suse.cz>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Jérôme Glisse <jglisse(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Changes in v2:
- Stop requiring pgmap->ops for fsdax (Christoph)
- Clean up the indenting and organization in
__put_devmap_managed_page(). (Christoph)
drivers/nvdimm/pmem.c | 6 ----
mm/memremap.c | 77 ++++++++++++++++++++++++++-----------------------
2 files changed, 41 insertions(+), 42 deletions(-)
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f9f76f6ba07b..21db1ce8c0ae 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -338,13 +338,7 @@ static void pmem_release_disk(void *__pmem)
put_disk(pmem->disk);
}
-static void pmem_pagemap_page_free(struct page *page)
-{
- wake_up_var(&page->_refcount);
-}
-
static const struct dev_pagemap_ops fsdax_pagemap_ops = {
- .page_free = pmem_pagemap_page_free,
.kill = pmem_pagemap_kill,
.cleanup = pmem_pagemap_cleanup,
};
diff --git a/mm/memremap.c b/mm/memremap.c
index 022e78e68ea0..b52dc566efd2 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -27,7 +27,8 @@ static void devmap_managed_enable_put(void)
static int devmap_managed_enable_get(struct dev_pagemap *pgmap)
{
- if (!pgmap->ops || !pgmap->ops->page_free) {
+ if (pgmap->type == MEMORY_DEVICE_PRIVATE &&
+ (!pgmap->ops || !pgmap->ops->page_free)) {
WARN(1, "Missing page_free method\n");
return -EINVAL;
}
@@ -444,44 +445,48 @@ void __put_devmap_managed_page(struct page *page)
{
int count = page_ref_dec_return(page);
- /*
- * If refcount is 1 then page is freed and refcount is stable as nobody
- * holds a reference on the page.
- */
- if (count == 1) {
- /* Clear Active bit in case of parallel mark_page_accessed */
- __ClearPageActive(page);
- __ClearPageWaiters(page);
+ if (count > 1) {
+ /* still busy */
+ return;
+ } else if (count == 0) {
+ /* only triggered by the dev_pagemap shutdown path */
+ __put_page(page);
+ return;
+ } else if (!is_device_private_page(page)) {
+ /* notify page idle for dax */
+ wake_up_var(&page->_refcount);
+ return;
+ }
- mem_cgroup_uncharge(page);
+ /* Clear Active bit in case of parallel mark_page_accessed */
+ __ClearPageActive(page);
+ __ClearPageWaiters(page);
- /*
- * When a device_private page is freed, the page->mapping field
- * may still contain a (stale) mapping value. For example, the
- * lower bits of page->mapping may still identify the page as
- * an anonymous page. Ultimately, this entire field is just
- * stale and wrong, and it will cause errors if not cleared.
- * One example is:
- *
- * migrate_vma_pages()
- * migrate_vma_insert_page()
- * page_add_new_anon_rmap()
- * __page_set_anon_rmap()
- * ...checks page->mapping, via PageAnon(page) call,
- * and incorrectly concludes that the page is an
- * anonymous page. Therefore, it incorrectly,
- * silently fails to set up the new anon rmap.
- *
- * For other types of ZONE_DEVICE pages, migration is either
- * handled differently or not done at all, so there is no need
- * to clear page->mapping.
- */
- if (is_device_private_page(page))
- page->mapping = NULL;
+ mem_cgroup_uncharge(page);
- page->pgmap->ops->page_free(page);
- } else if (!count)
- __put_page(page);
+ /*
+ * When a device_private page is freed, the page->mapping field
+ * may still contain a (stale) mapping value. For example, the
+ * lower bits of page->mapping may still identify the page as an
+ * anonymous page. Ultimately, this entire field is just stale
+ * and wrong, and it will cause errors if not cleared. One
+ * example is:
+ *
+ * migrate_vma_pages()
+ * migrate_vma_insert_page()
+ * page_add_new_anon_rmap()
+ * __page_set_anon_rmap()
+ * ...checks page->mapping, via PageAnon(page) call,
+ * and incorrectly concludes that the page is an
+ * anonymous page. Therefore, it incorrectly,
+ * silently fails to set up the new anon rmap.
+ *
+ * For other types of ZONE_DEVICE pages, migration is either
+ * handled differently or not done at all, so there is no need
+ * to clear page->mapping.
+ */
+ page->mapping = NULL;
+ page->pgmap->ops->page_free(page);
}
EXPORT_SYMBOL(__put_devmap_managed_page);
#endif /* CONFIG_DEV_PAGEMAP_OPS */
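
The v2 note "Stop requiring pgmap->ops for fsdax" comes down to where the !pgmap->ops test sits in the devmap_managed_enable_get() condition. The following sketch compares the condition from the first posting with the one from v2 using a small user-space model. The struct and enum names here are hypothetical stand-ins, not kernel types, and only two page-map types are modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
enum model_type { MODEL_DEVICE_PRIVATE, MODEL_DEVICE_FSDAX };

struct model_ops {
	bool has_page_free; /* models ops->page_free != NULL */
};

struct model_pgmap {
	enum model_type type;
	const struct model_ops *ops; /* may be NULL */
};

/* First posting: any pgmap without ops is rejected. */
static bool v1_rejects(const struct model_pgmap *p)
{
	return !p->ops ||
	       (p->type == MODEL_DEVICE_PRIVATE && !p->ops->has_page_free);
}

/* v2 and later: only device-private pgmaps need ops->page_free. */
static bool v2_rejects(const struct model_pgmap *p)
{
	return p->type == MODEL_DEVICE_PRIVATE &&
	       (!p->ops || !p->ops->has_page_free);
}
```

An fsdax pgmap with no ops at all is rejected by the first form and accepted by the second; a device-private pgmap without ->page_free is rejected by both.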
2 years, 9 months
[PATCH] mm: Cleanup __put_devmap_managed_page() vs ->page_free()
by Dan Williams
After the removal of the device-public infrastructure there are only 2
->page_free() callbacks in the kernel. One of those is a device-private
callback in the nouveau driver, the other is a generic wakeup needed in
the DAX case. In the hopes that all ->page_free() callbacks can be
migrated to common core kernel functionality, move the device-private
specific actions in __put_devmap_managed_page() under the
is_device_private_page() conditional, including the ->page_free()
callback. For the other page types just open-code the generic wakeup.
Yes, the wakeup is only needed in the MEMORY_DEVICE_FSDAX case, but it
does no harm in the MEMORY_DEVICE_DEVDAX and MEMORY_DEVICE_PCI_P2PDMA
cases.
Cc: Jan Kara <jack(a)suse.cz>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Jérôme Glisse <jglisse(a)redhat.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Hi John,
This applies on top of today's linux-next and passes my nvdimm unit
tests. That testing noticed that devmap_managed_enable_get() needed a
small fixup as well.
drivers/nvdimm/pmem.c | 6 ------
mm/memremap.c | 22 ++++++++++++----------
2 files changed, 12 insertions(+), 16 deletions(-)
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f9f76f6ba07b..21db1ce8c0ae 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -338,13 +338,7 @@ static void pmem_release_disk(void *__pmem)
put_disk(pmem->disk);
}
-static void pmem_pagemap_page_free(struct page *page)
-{
- wake_up_var(&page->_refcount);
-}
-
static const struct dev_pagemap_ops fsdax_pagemap_ops = {
- .page_free = pmem_pagemap_page_free,
.kill = pmem_pagemap_kill,
.cleanup = pmem_pagemap_cleanup,
};
diff --git a/mm/memremap.c b/mm/memremap.c
index 022e78e68ea0..6e6f3d6fdb73 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -27,7 +27,8 @@ static void devmap_managed_enable_put(void)
static int devmap_managed_enable_get(struct dev_pagemap *pgmap)
{
- if (!pgmap->ops || !pgmap->ops->page_free) {
+ if (!pgmap->ops || (pgmap->type == MEMORY_DEVICE_PRIVATE
+ && !pgmap->ops->page_free)) {
WARN(1, "Missing page_free method\n");
return -EINVAL;
}
@@ -449,12 +450,6 @@ void __put_devmap_managed_page(struct page *page)
* holds a reference on the page.
*/
if (count == 1) {
- /* Clear Active bit in case of parallel mark_page_accessed */
- __ClearPageActive(page);
- __ClearPageWaiters(page);
-
- mem_cgroup_uncharge(page);
-
/*
* When a device_private page is freed, the page->mapping field
* may still contain a (stale) mapping value. For example, the
@@ -476,10 +471,17 @@ void __put_devmap_managed_page(struct page *page)
* handled differently or not done at all, so there is no need
* to clear page->mapping.
*/
- if (is_device_private_page(page))
- page->mapping = NULL;
+ if (is_device_private_page(page)) {
+ /* Clear Active bit in case of parallel mark_page_accessed */
+ __ClearPageActive(page);
+ __ClearPageWaiters(page);
- page->pgmap->ops->page_free(page);
+ mem_cgroup_uncharge(page);
+
+ page->mapping = NULL;
+ page->pgmap->ops->page_free(page);
+ } else
+ wake_up_var(&page->_refcount);
} else if (!count)
__put_page(page);
}
2 years, 9 months
Re: DAX filesystem support on ARMv8
by Dan Williams
On Mon, Nov 11, 2019 at 6:12 PM Bharat Kumar Gogada <bharatku(a)xilinx.com> wrote:
>
> Hi All,
>
> As per Documentation/filesystems/dax.txt
>
> The DAX code does not work correctly on architectures which have virtually
> mapped caches such as ARM, MIPS and SPARC.
>
> Can anyone please shed light on dax filesystem issue w.r.t ARM architecture ?
The concern is VIVT caches since the kernel will want to flush pmem
addresses with different virtual addresses than what userspace is
using. As far as I know, ARMv8 has VIPT caches, so should not have an
issue. Willy initially wrote those restrictions, but I am assuming
that the concern was managing the caches in the presence of virtual
aliases.
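
For readers unfamiliar with the term, a "virtual alias" is simply the same physical page reachable through more than one virtual address. The following user-space sketch creates such an alias by mapping one file page twice; it is only an illustration of aliasing, not of the kernel's pmem flushing, and the function name aliases_share_data() is made up for this example. On hardware with physically tagged caches the store through one mapping is visible through the other without any explicit cache maintenance, which is the property a VIVT cache does not guarantee.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the same file page at two virtual addresses and check that a
 * store through the first mapping is visible through the second.
 * Returns 1 if so, 0 if not, and -1 if the setup failed.
 */
static int aliases_share_data(void)
{
	long page = sysconf(_SC_PAGESIZE);
	FILE *f;
	char *a, *b;
	int fd, shared;

	if (page <= 0)
		return -1;
	f = tmpfile();
	if (!f)
		return -1;
	fd = fileno(f);
	if (ftruncate(fd, page) != 0) {
		fclose(f);
		return -1;
	}

	/* Two independent shared mappings of the same page. */
	a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (a == MAP_FAILED || b == MAP_FAILED) {
		if (a != MAP_FAILED)
			munmap(a, (size_t)page);
		if (b != MAP_FAILED)
			munmap(b, (size_t)page);
		fclose(f);
		return -1;
	}

	strcpy(a, "pmem"); /* store through the first alias */
	shared = (a != b) && strcmp(b, "pmem") == 0; /* read via the second */

	munmap(a, (size_t)page);
	munmap(b, (size_t)page);
	fclose(f);
	return shared;
}
```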
2 years, 9 months
[PATCH] tools/testing/nvdimm: Fix mock support for ioremap
by Dan Williams
After commit d092a8707326 "arch: rely on asm-generic/io.h for default
ioremap_* definitions" the ioremap_nocache() symbol has been replaced
with ioremap(). Update the mocked symbol list for nvdimm testing.
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Noticed this while trying the nvdimm tests on latest linux-next.
tools/testing/nvdimm/Kbuild | 1 +
tools/testing/nvdimm/test/iomap.c | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index c4a9196d794c..6aca8d5be159 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -5,6 +5,7 @@ ldflags-y += --wrap=devm_ioremap_nocache
ldflags-y += --wrap=devm_memremap
ldflags-y += --wrap=devm_memunmap
ldflags-y += --wrap=ioremap_nocache
+ldflags-y += --wrap=ioremap
ldflags-y += --wrap=iounmap
ldflags-y += --wrap=memunmap
ldflags-y += --wrap=__devm_request_region
diff --git a/tools/testing/nvdimm/test/iomap.c b/tools/testing/nvdimm/test/iomap.c
index 3f55f2f99112..6271ac757a4b 100644
--- a/tools/testing/nvdimm/test/iomap.c
+++ b/tools/testing/nvdimm/test/iomap.c
@@ -193,6 +193,12 @@ void __iomem *__wrap_ioremap_nocache(resource_size_t offset, unsigned long size)
}
EXPORT_SYMBOL(__wrap_ioremap_nocache);
+void __iomem *__wrap_ioremap(resource_size_t offset, unsigned long size)
+{
+ return __nfit_test_ioremap(offset, size, ioremap);
+}
+EXPORT_SYMBOL(__wrap_ioremap);
+
void __iomem *__wrap_ioremap_wc(resource_size_t offset, unsigned long size)
{
return __nfit_test_ioremap(offset, size, ioremap_wc);
2 years, 9 months