[ndctl PATCH 0/2] ndctl: fix various valgrind issues
by Vishal Verma
The parent_uuid patches introduced a double free in the unit test.
While running valgrind, I also found a few other memory-leak-type issues.
These patches fix most of them; a few valgrind complaints still
remain, related to the kmod context not being freed:
==11898== at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11898== by 0x5250830: kmod_new (in /usr/lib64/libkmod.so.2.2.4)
==11898== by 0x4E3A2B8: ndctl_new (libndctl.c:434)
==11898== by 0x40B776: test_libndctl (test-libndctl.c:1560)
...
And so on. I'm not sure why these are happening.
The patches apply on top of the parent_uuid patches.
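For reference, the double free is the classic pattern sketched below. This is
only an illustrative reduction of the bug class, not the actual
test-parent-uuid.c code:

#include <stdlib.h>
#include <string.h>

int main(void)
{
	/* hypothetical reduction of the double free fixed by patch 1 */
	char *uuid = strdup("ffffffff-ffff-ffff-ffff-ffffffffffff");

	free(uuid);	/* legitimate free on the success path */
	/* ... a shared cleanup path later frees the same pointer ... */
	free(uuid);	/* double free: valgrind reports an "invalid free" */
	return 0;
}

The fix is simply to free the buffer in exactly one place (or NULL the pointer
after the first free).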
Vishal Verma (2):
ndctl: fix a double free in test-parent-uuid.c
ndctl: fix various memory leaks reported by valgrind
lib/libndctl.c | 23 +++++++++++++++++++++--
lib/test-libndctl.c | 13 ++++++++-----
lib/test-parent-uuid.c | 42 +++++++++++++++++-------------------------
3 files changed, 46 insertions(+), 32 deletions(-)
--
2.4.3
[GIT PULL] libnvdimm fix for 4.2
by Williams, Dan J
Hi Linus, please pull from...
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes
...to receive a single fix for the nd_blk driver.
The effect of getting the width of this register read wrong is that all
I/O fails when the read returns non-zero. Given the availability of
ACPI 6 NFIT enabled platforms, this could reasonably wait to come in
during the 4.3 merge window with a tag for 4.2-stable. Otherwise, this
makes the 4.2 kernel fully functional with devices that conform to the
mmio-block-apertures defined in the ACPI 6 NFIT (NVDIMM Firmware
Interface Table).
Full changelog and diffstat below.
---
The following changes since commit cbfe8fa6cd672011c755c3cd85c9ffd4e2d10a6f:
Linux 4.2-rc4 (2015-07-26 12:26:21 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes
for you to fetch changes up to de4a196c02a2a2631b516d90da6e8d052ccb07e8:
nfit, nd_blk: BLK status register is only 32 bits (2015-08-25 19:42:01 -0400)
----------------------------------------------------------------
Ross Zwisler (1):
nfit, nd_blk: BLK status register is only 32 bits
drivers/acpi/nfit.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
---
commit de4a196c02a2a2631b516d90da6e8d052ccb07e8
Author: Ross Zwisler <ross.zwisler@linux.intel.com>
Date: Thu Aug 20 16:27:38 2015 -0600
nfit, nd_blk: BLK status register is only 32 bits
Only read 32 bits for the BLK status register in read_blk_stat().
The format and size of this register is defined in the
"NVDIMM Driver Writer's guide":
http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index 628a42c41ab1..bb29e56276bd 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1024,7 +1024,7 @@ static void wmb_blk(struct nfit_blk *nfit_blk)
wmb_pmem();
}
-static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
+static u32 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
{
struct nfit_blk_mmio *mmio = &nfit_blk->mmio[DCR];
u64 offset = nfit_blk->stat_offset + mmio->size * bw;
@@ -1032,7 +1032,7 @@ static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
if (mmio->num_lines)
offset = to_interleave_offset(offset, mmio);
- return readq(mmio->base + offset);
+ return readl(mmio->base + offset);
}
static void write_blk_ctl(struct nfit_blk *nfit_blk, unsigned int bw,
[PATCH] nfit, nd_blk: BLK status register is only 32 bits
by Ross Zwisler
Only read 32 bits for the BLK status register in read_blk_stat().
The format and size of this register is defined in the
"NVDIMM Driver Writer's guide":
http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
---
drivers/acpi/nfit.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index 7c2638f..8689ee1 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1009,7 +1009,7 @@ static void wmb_blk(struct nfit_blk *nfit_blk)
wmb_pmem();
}
-static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
+static u32 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
{
struct nfit_blk_mmio *mmio = &nfit_blk->mmio[DCR];
u64 offset = nfit_blk->stat_offset + mmio->size * bw;
@@ -1017,7 +1017,7 @@ static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
if (mmio->num_lines)
offset = to_interleave_offset(offset, mmio);
- return readq(mmio->base + offset);
+ return readl(mmio->base + offset);
}
static void write_blk_ctl(struct nfit_blk *nfit_blk, unsigned int bw,
--
2.1.0
[PATCH v2] nd_blk: add support for "read flush" DSM flag
by Ross Zwisler
Add support for the "read flush" _DSM flag, as outlined in the DSM spec:
http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read. This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM. We know
that the cache lines are clean and will be discarded without any
writeback because either a) the previous aperture operation was a read,
and we never modified the contents of the aperture, or b) the previous
aperture operation was a write and we must have written back the dirtied
contents of the aperture to the DIMM before the I/O was completed.
By supporting the "read flush" flag we can also change the ND BLK
aperture mapping from write-combining to write-back via memremap().
In order to add support for the "read flush" flag I needed to add a
generic routine to invalidate cache lines, mmio_flush_range(). This is
protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
only supported on x86.
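Condensed, the aperture read path in this patch becomes the following (a
simplified excerpt of the diff below, with the surrounding copy loop and error
handling omitted):

	if (nfit_blk->dimm_flags & ND_BLK_READ_FLUSH)
		/* discard stale, clean cache lines covering the aperture */
		mmio_flush_range((void __force *) mmio->addr.aperture + offset, c);
	memcpy_from_pmem(iobuf + copied, mmio->addr.aperture + offset, c);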
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/cacheflush.h | 2 ++
arch/x86/include/asm/io.h | 2 --
arch/x86/include/asm/pmem.h | 2 ++
drivers/acpi/Kconfig | 1 +
drivers/acpi/nfit.c | 55 ++++++++++++++++++++++-----------------
drivers/acpi/nfit.h | 16 ++++++++----
lib/Kconfig | 3 +++
tools/testing/nvdimm/Kbuild | 2 ++
tools/testing/nvdimm/test/iomap.c | 30 +++++++++++++++++++--
tools/testing/nvdimm/test/nfit.c | 10 ++++---
11 files changed, 88 insertions(+), 36 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 76c6115..03ab612 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -28,6 +28,7 @@ config X86
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_PMEM_API
+ select ARCH_HAS_MMIO_FLUSH
select ARCH_HAS_SG_CHAIN
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 471418a..e63aa38 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -89,6 +89,8 @@ int set_pages_rw(struct page *page, int numpages);
void clflush_cache_range(void *addr, unsigned int size);
+#define mmio_flush_range(addr, size) clflush_cache_range(addr, size)
+
#ifdef CONFIG_DEBUG_RODATA
void mark_rodata_ro(void);
extern const int rodata_test_data;
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index d241fbd..83ec9b1 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -248,8 +248,6 @@ static inline void flush_write_buffers(void)
#endif
}
-#define ARCH_MEMREMAP_PMEM MEMREMAP_WB
-
#endif /* __KERNEL__ */
extern void native_io_delay(void);
diff --git a/arch/x86/include/asm/pmem.h b/arch/x86/include/asm/pmem.h
index a3a0df6..bb026c5 100644
--- a/arch/x86/include/asm/pmem.h
+++ b/arch/x86/include/asm/pmem.h
@@ -18,6 +18,8 @@
#include <asm/cpufeature.h>
#include <asm/special_insns.h>
+#define ARCH_MEMREMAP_PMEM MEMREMAP_WB
+
#ifdef CONFIG_ARCH_HAS_PMEM_API
/**
* arch_memcpy_to_pmem - copy data to persistent memory
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 114cf48..4baeb85 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -410,6 +410,7 @@ config ACPI_NFIT
tristate "ACPI NVDIMM Firmware Interface Table (NFIT)"
depends on PHYS_ADDR_T_64BIT
depends on BLK_DEV
+ depends on ARCH_HAS_MMIO_FLUSH
select LIBNVDIMM
help
Infrastructure to probe ACPI 6 compliant platforms for
diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index 7c2638f..56fff01 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1017,7 +1017,7 @@ static u64 read_blk_stat(struct nfit_blk *nfit_blk, unsigned int bw)
if (mmio->num_lines)
offset = to_interleave_offset(offset, mmio);
- return readq(mmio->base + offset);
+ return readq(mmio->addr.base + offset);
}
static void write_blk_ctl(struct nfit_blk *nfit_blk, unsigned int bw,
@@ -1042,11 +1042,11 @@ static void write_blk_ctl(struct nfit_blk *nfit_blk, unsigned int bw,
if (mmio->num_lines)
offset = to_interleave_offset(offset, mmio);
- writeq(cmd, mmio->base + offset);
+ writeq(cmd, mmio->addr.base + offset);
wmb_blk(nfit_blk);
if (nfit_blk->dimm_flags & ND_BLK_DCR_LATCH)
- readq(mmio->base + offset);
+ readq(mmio->addr.base + offset);
}
static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
@@ -1078,11 +1078,16 @@ static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
}
if (rw)
- memcpy_to_pmem(mmio->aperture + offset,
+ memcpy_to_pmem(mmio->addr.aperture + offset,
iobuf + copied, c);
- else
+ else {
+ if (nfit_blk->dimm_flags & ND_BLK_READ_FLUSH)
+ mmio_flush_range((void __force *)
+ mmio->addr.aperture + offset, c);
+
memcpy_from_pmem(iobuf + copied,
- mmio->aperture + offset, c);
+ mmio->addr.aperture + offset, c);
+ }
copied += c;
len -= c;
@@ -1129,7 +1134,10 @@ static void nfit_spa_mapping_release(struct kref *kref)
WARN_ON(!mutex_is_locked(&acpi_desc->spa_map_mutex));
dev_dbg(acpi_desc->dev, "%s: SPA%d\n", __func__, spa->range_index);
- iounmap(spa_map->iomem);
+ if (spa_map->type == SPA_MAP_APERTURE)
+ memunmap((void __force *)spa_map->addr.aperture);
+ else
+ iounmap(spa_map->addr.base);
release_mem_region(spa->address, spa->length);
list_del(&spa_map->list);
kfree(spa_map);
@@ -1175,7 +1183,7 @@ static void __iomem *__nfit_spa_map(struct acpi_nfit_desc *acpi_desc,
spa_map = find_spa_mapping(acpi_desc, spa);
if (spa_map) {
kref_get(&spa_map->kref);
- return spa_map->iomem;
+ return spa_map->addr.base;
}
spa_map = kzalloc(sizeof(*spa_map), GFP_KERNEL);
@@ -1191,20 +1199,19 @@ static void __iomem *__nfit_spa_map(struct acpi_nfit_desc *acpi_desc,
if (!res)
goto err_mem;
- if (type == SPA_MAP_APERTURE) {
- /*
- * TODO: memremap_pmem() support, but that requires cache
- * flushing when the aperture is moved.
- */
- spa_map->iomem = ioremap_wc(start, n);
- } else
- spa_map->iomem = ioremap_nocache(start, n);
+ spa_map->type = type;
+ if (type == SPA_MAP_APERTURE)
+ spa_map->addr.aperture = (void __pmem *)memremap(start, n,
+ ARCH_MEMREMAP_PMEM);
+ else
+ spa_map->addr.base = ioremap_nocache(start, n);
+
- if (!spa_map->iomem)
+ if (!spa_map->addr.base)
goto err_map;
list_add_tail(&spa_map->list, &acpi_desc->spa_maps);
- return spa_map->iomem;
+ return spa_map->addr.base;
err_map:
release_mem_region(start, n);
@@ -1267,7 +1274,7 @@ static int acpi_nfit_blk_get_flags(struct nvdimm_bus_descriptor *nd_desc,
nfit_blk->dimm_flags = flags.flags;
else if (rc == -ENOTTY) {
/* fall back to a conservative default */
- nfit_blk->dimm_flags = ND_BLK_DCR_LATCH;
+ nfit_blk->dimm_flags = ND_BLK_DCR_LATCH | ND_BLK_READ_FLUSH;
rc = 0;
} else
rc = -ENXIO;
@@ -1307,9 +1314,9 @@ static int acpi_nfit_blk_region_enable(struct nvdimm_bus *nvdimm_bus,
/* map block aperture memory */
nfit_blk->bdw_offset = nfit_mem->bdw->offset;
mmio = &nfit_blk->mmio[BDW];
- mmio->base = nfit_spa_map(acpi_desc, nfit_mem->spa_bdw,
+ mmio->addr.base = nfit_spa_map(acpi_desc, nfit_mem->spa_bdw,
SPA_MAP_APERTURE);
- if (!mmio->base) {
+ if (!mmio->addr.base) {
dev_dbg(dev, "%s: %s failed to map bdw\n", __func__,
nvdimm_name(nvdimm));
return -ENOMEM;
@@ -1330,9 +1337,9 @@ static int acpi_nfit_blk_region_enable(struct nvdimm_bus *nvdimm_bus,
nfit_blk->cmd_offset = nfit_mem->dcr->command_offset;
nfit_blk->stat_offset = nfit_mem->dcr->status_offset;
mmio = &nfit_blk->mmio[DCR];
- mmio->base = nfit_spa_map(acpi_desc, nfit_mem->spa_dcr,
+ mmio->addr.base = nfit_spa_map(acpi_desc, nfit_mem->spa_dcr,
SPA_MAP_CONTROL);
- if (!mmio->base) {
+ if (!mmio->addr.base) {
dev_dbg(dev, "%s: %s failed to map dcr\n", __func__,
nvdimm_name(nvdimm));
return -ENOMEM;
@@ -1399,7 +1406,7 @@ static void acpi_nfit_blk_region_disable(struct nvdimm_bus *nvdimm_bus,
for (i = 0; i < 2; i++) {
struct nfit_blk_mmio *mmio = &nfit_blk->mmio[i];
- if (mmio->base)
+ if (mmio->addr.base)
nfit_spa_unmap(acpi_desc, mmio->spa);
}
nd_blk_region_set_provider_data(ndbr, NULL);
diff --git a/drivers/acpi/nfit.h b/drivers/acpi/nfit.h
index f2c2bb7..7e74015 100644
--- a/drivers/acpi/nfit.h
+++ b/drivers/acpi/nfit.h
@@ -41,6 +41,7 @@ enum nfit_uuids {
};
enum {
+ ND_BLK_READ_FLUSH = 1,
ND_BLK_DCR_LATCH = 2,
};
@@ -117,12 +118,16 @@ enum nd_blk_mmio_selector {
DCR,
};
+struct nd_blk_addr {
+ union {
+ void __iomem *base;
+ void __pmem *aperture;
+ };
+};
+
struct nfit_blk {
struct nfit_blk_mmio {
- union {
- void __iomem *base;
- void __pmem *aperture;
- };
+ struct nd_blk_addr addr;
u64 size;
u64 base_offset;
u32 line_size;
@@ -149,7 +154,8 @@ struct nfit_spa_mapping {
struct acpi_nfit_system_address *spa;
struct list_head list;
struct kref kref;
- void __iomem *iomem;
+ enum spa_map_type type;
+ struct nd_blk_addr addr;
};
static inline struct nfit_spa_mapping *to_spa_map(struct kref *kref)
diff --git a/lib/Kconfig b/lib/Kconfig
index 3a2ef67..a938a39 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -531,4 +531,7 @@ config ARCH_HAS_SG_CHAIN
config ARCH_HAS_PMEM_API
bool
+config ARCH_HAS_MMIO_FLUSH
+ bool
+
endmenu
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index e667579..98f2881 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -1,8 +1,10 @@
ldflags-y += --wrap=ioremap_wc
+ldflags-y += --wrap=memremap
ldflags-y += --wrap=devm_ioremap_nocache
ldflags-y += --wrap=devm_memremap
ldflags-y += --wrap=ioremap_nocache
ldflags-y += --wrap=iounmap
+ldflags-y += --wrap=memunmap
ldflags-y += --wrap=__devm_request_region
ldflags-y += --wrap=__request_region
ldflags-y += --wrap=__release_region
diff --git a/tools/testing/nvdimm/test/iomap.c b/tools/testing/nvdimm/test/iomap.c
index ff1e004..179d228 100644
--- a/tools/testing/nvdimm/test/iomap.c
+++ b/tools/testing/nvdimm/test/iomap.c
@@ -89,12 +89,25 @@ void *__wrap_devm_memremap(struct device *dev, resource_size_t offset,
nfit_res = get_nfit_res(offset);
rcu_read_unlock();
if (nfit_res)
- return (void __iomem *) nfit_res->buf + offset
- - nfit_res->res->start;
+ return nfit_res->buf + offset - nfit_res->res->start;
return devm_memremap(dev, offset, size, flags);
}
EXPORT_SYMBOL(__wrap_devm_memremap);
+void *__wrap_memremap(resource_size_t offset, size_t size,
+ unsigned long flags)
+{
+ struct nfit_test_resource *nfit_res;
+
+ rcu_read_lock();
+ nfit_res = get_nfit_res(offset);
+ rcu_read_unlock();
+ if (nfit_res)
+ return nfit_res->buf + offset - nfit_res->res->start;
+ return memremap(offset, size, flags);
+}
+EXPORT_SYMBOL(__wrap_memremap);
+
void __iomem *__wrap_ioremap_nocache(resource_size_t offset, unsigned long size)
{
return __nfit_test_ioremap(offset, size, ioremap_nocache);
@@ -120,6 +133,19 @@ void __wrap_iounmap(volatile void __iomem *addr)
}
EXPORT_SYMBOL(__wrap_iounmap);
+void __wrap_memunmap(void *addr)
+{
+ struct nfit_test_resource *nfit_res;
+
+ rcu_read_lock();
+ nfit_res = get_nfit_res((unsigned long) addr);
+ rcu_read_unlock();
+ if (nfit_res)
+ return;
+ return memunmap(addr);
+}
+EXPORT_SYMBOL(__wrap_memunmap);
+
static struct resource *nfit_test_request_region(struct device *dev,
struct resource *parent, resource_size_t start,
resource_size_t n, const char *name, int flags)
diff --git a/tools/testing/nvdimm/test/nfit.c b/tools/testing/nvdimm/test/nfit.c
index 28dba91..021e6f9 100644
--- a/tools/testing/nvdimm/test/nfit.c
+++ b/tools/testing/nvdimm/test/nfit.c
@@ -1029,9 +1029,13 @@ static int nfit_test_blk_do_io(struct nd_blk_region *ndbr, resource_size_t dpa,
lane = nd_region_acquire_lane(nd_region);
if (rw)
- memcpy(mmio->base + dpa, iobuf, len);
- else
- memcpy(iobuf, mmio->base + dpa, len);
+ memcpy(mmio->addr.base + dpa, iobuf, len);
+ else {
+ memcpy(iobuf, mmio->addr.base + dpa, len);
+
+ /* give us some coverage of the mmio_flush_range() API */
+ mmio_flush_range(mmio->addr.base + dpa, len);
+ }
nd_region_release_lane(nd_region, lane);
return 0;
--
2.1.0
[RFC PATCH 0/7] 'struct page' driver for persistent memory
by Dan Williams
When we last left this debate [1] it was becoming clear that the
'page-less' approach left too many I/O scenarios off the table. The
page-less enabling is still useful for avoiding the overhead of struct
page where it is not needed, but in the end, page-backed persistent
memory seems to be a requirement.
With that assumption in place the next debate was where to allocate the
storage for the memmap array, or otherwise reduce the overhead of 'struct
page' with a fancier object like variable length pages.
This series takes the position of mapping persistent memory with
standard 'struct page' and pushes the policy decision of allocating the
storage for the memmap array, from RAM or PMEM, to userspace. It turns
out the best place to allocate the 64 bytes per 4K page will be
platform-specific.
If PMEM capacities are low then mapping in RAM is a good choice.
Otherwise, for very large capacities storing the memmap in PMEM might be
a better choice. Then again, PMEM might not have the performance
characteristics favorable to a high-rate-of-change object like 'struct
page'. The kernel can make a reasonable guess, but it seems we will need
to maintain the ability to override any default.
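For scale, 64 bytes per 4K page is about 1.6% of capacity, so a 1 TiB PMEM
device needs roughly 16 GiB of memmap storage: easy enough to host in RAM for
modest devices, but a strong argument for carving it out of PMEM at large
capacities.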
Outside of the new libnvdimm sysfs mechanisms to specify the memmap
allocation policy for a given PMEM device, the core of this
implementation is 'struct vmem_altmap'. 'vmem_altmap' alters the memory
hotplug code to optionally use a reserved PMEM-pfn range rather than
dynamic allocation for the memmap.
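As a rough mental model (the field and function names below are made up for
illustration and are not the series' actual definitions), the altmap just
describes a reservation at the front of the PMEM range and a cursor into it:

struct vmem_altmap_sketch {
	unsigned long base_pfn;	/* first pfn of the pmem range */
	unsigned long reserve;	/* pfns set aside to hold the memmap itself */
	unsigned long alloc;	/* reserved pfns handed out so far */
};

/* back the memmap with pfns from the reservation rather than dynamic pages */
static unsigned long altmap_alloc(struct vmem_altmap_sketch *a, unsigned long nr)
{
	if (a->alloc + nr > a->reserve)
		return ULONG_MAX;	/* reservation exhausted, caller falls back */
	a->alloc += nr;
	return a->base_pfn + a->alloc - nr;
}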
Only lightly tested so far to confirm valid pfn_to_page() and
page_address() conversions across a range of persistent memory specified
by 'memmap=nn!ss' (the kernel command-line option to simulate a PMEM
range).
[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-May/000748.html
---
Dan Williams (7):
x86, mm: ZONE_DEVICE for "device memory"
x86, mm: introduce struct vmem_altmap
x86, mm: arch_add_dev_memory()
mm: register_dev_memmap()
libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
libnvdimm, pfn: 'struct page' provider infrastructure
libnvdimm, pmem: 'struct page' for pmem
arch/powerpc/mm/init_64.c | 7 +
arch/x86/Kconfig | 19 ++
arch/x86/include/uapi/asm/e820.h | 2
arch/x86/kernel/Makefile | 2
arch/x86/kernel/pmem.c | 79 +--------
arch/x86/mm/init_64.c | 160 +++++++++++++-----
drivers/nvdimm/Kconfig | 26 +++
drivers/nvdimm/Makefile | 5 +
drivers/nvdimm/btt.c | 8 -
drivers/nvdimm/btt_devs.c | 172 +------------------
drivers/nvdimm/claim.c | 201 ++++++++++++++++++++++
drivers/nvdimm/e820.c | 86 ++++++++++
drivers/nvdimm/namespace_devs.c | 34 +++-
drivers/nvdimm/nd-core.h | 9 +
drivers/nvdimm/nd.h | 59 ++++++-
drivers/nvdimm/pfn.h | 35 ++++
drivers/nvdimm/pfn_devs.c | 334 +++++++++++++++++++++++++++++++++++++
drivers/nvdimm/pmem.c | 213 +++++++++++++++++++++++-
drivers/nvdimm/region.c | 2
drivers/nvdimm/region_devs.c | 19 ++
include/linux/kmap_pfn.h | 33 ++++
include/linux/memory_hotplug.h | 21 ++
include/linux/mm.h | 53 ++++++
include/linux/mmzone.h | 23 +++
mm/kmap_pfn.c | 195 ++++++++++++++++++++++
mm/memory_hotplug.c | 84 ++++++---
mm/page_alloc.c | 18 ++
mm/sparse-vmemmap.c | 60 ++++++-
mm/sparse.c | 44 +++--
tools/testing/nvdimm/Kbuild | 7 +
tools/testing/nvdimm/test/iomap.c | 13 +
31 files changed, 1673 insertions(+), 350 deletions(-)
create mode 100644 drivers/nvdimm/claim.c
create mode 100644 drivers/nvdimm/e820.c
create mode 100644 drivers/nvdimm/pfn.h
create mode 100644 drivers/nvdimm/pfn_devs.c
[PATCH] nvdimm: change to use generic kvfree()
by yalin wang
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
---
drivers/nvdimm/dimm_devs.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index c05eb80..651b8d1 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -241,10 +241,7 @@ void nvdimm_drvdata_release(struct kref *kref)
nvdimm_free_dpa(ndd, res);
nvdimm_bus_unlock(dev);
- if (ndd->data && is_vmalloc_addr(ndd->data))
- vfree(ndd->data);
- else
- kfree(ndd->data);
+ kvfree(ndd->data);
kfree(ndd);
put_device(dev);
}
--
1.9.1
[PATCH v5 0/7] dax: I/O path enhancements
by Ross Zwisler
The goal of this series is to enhance the DAX I/O path so that all operations
that store data (I/O writes, zeroing blocks, punching holes, etc.) properly
synchronize the stores to media using the PMEM API. This ensures that the
data DAX is writing is durable on media before the operation completes.
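The recurring pattern the series applies in fs/dax.c is "store through the
PMEM API, then make it durable". A minimal sketch (the wrapper function is
purely illustrative; memcpy_to_pmem()/wmb_pmem() already exist and clear_pmem()
is added by patch 5):

static void dax_store_sketch(void __pmem *addr, const void *buf,
			     size_t len, size_t zero_len)
{
	memcpy_to_pmem(addr, buf, len);		/* copy the new data */
	clear_pmem(addr + len, zero_len);	/* zero the rest of the block */
	wmb_pmem();				/* stores are durable after this */
}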
Patches 1-4 are a few random cleanups.
Changes from v4:
- rebased to libnvdimm-for-next branch:
https://git.kernel.org/cgit/linux/kernel/git/nvdimm/nvdimm.git/commit/?h=...
The nvdimm repository doesn't have the DAX PMD changes that are in the -mm
tree. I expect the merge will basically be these two hunks:
@@ -514,7 +528,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
unsigned long pmd_addr = address & PMD_MASK;
bool write = flags & FAULT_FLAG_WRITE;
long length;
- void *kaddr;
+ void __pmem *kaddr;
pgoff_t size, pgoff;
sector_t block, sector;
unsigned long pfn;
@@ -608,7 +622,8 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
if (buffer_unwritten(&bh) || buffer_new(&bh)) {
int i;
for (i = 0; i < PTRS_PER_PMD; i++)
- clear_page(kaddr + i * PAGE_SIZE);
+ clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
+ wmb_pmem();
count_vm_event(PGMAJFAULT);
mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
result |= VM_FAULT_MAJOR;
Ross Zwisler (7):
brd: make rd_size static
pmem, x86: move x86 PMEM API to new pmem.h header
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: clean up conditional pmem includes
pmem: add copy_from_iter_pmem() and clear_pmem()
dax: update I/O path to do proper PMEM flushing
pmem, dax: have direct_access use __pmem annotation
Documentation/filesystems/Locking | 3 +-
MAINTAINERS | 1 +
arch/powerpc/sysdev/axonram.c | 7 +-
arch/x86/include/asm/cacheflush.h | 71 -----------------
arch/x86/include/asm/pmem.h | 158 ++++++++++++++++++++++++++++++++++++++
drivers/block/brd.c | 6 +-
drivers/nvdimm/pmem.c | 4 +-
drivers/s390/block/dcssblk.c | 10 ++-
fs/block_dev.c | 2 +-
fs/dax.c | 63 +++++++++------
include/linux/blkdev.h | 8 +-
include/linux/pmem.h | 77 ++++++++++++++++---
12 files changed, 285 insertions(+), 125 deletions(-)
create mode 100644 arch/x86/include/asm/pmem.h
--
2.1.0
kexec, x86: Need a new e820 type support for kexec
by Toshi Kani
Hello,
ACPI 6.0 defines a new type in e820, AddressRangePersistentMemory (7), for
NVDIMM. On a system with NVDIMM, kexec displays the following error
message and falls back to treating the range as RANGE_RESERVED.
Unknown type (Persistent Memory) while parsing
/sys/firmware/memmap/34/type. Please report this as bug. Using
RANGE_RESERVED now.
This new type is defined in "arch/x86/include/uapi/asm/e820.h" in 4.2-rc1
as follows.
#define E820_PMEM 7
kexec needs to know this new type, but I think its build env includes
"/usr/include/asm/e820.h", which is provided by a distribution. On Fedora
22, kernel-headers-4.0.6-300.fc22.x86_64 is the latest kernel header
package and it will take a while for 4.2 headers.
How do we handle such a kernel header dependency in the kexec build env?
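One common way to cope with the header lag, sketched below purely as an
illustration (not an actual kexec-tools patch), is to define the constant
locally when the installed uapi header predates 4.2, so the memmap parser can
recognize type 7 explicitly instead of falling back to RANGE_RESERVED:

#ifndef E820_PMEM
#define E820_PMEM	7	/* ACPI 6.0 AddressRangePersistentMemory */
#endif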
Thanks,
-Toshi
[PATCH v2] libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
by Dan Williams
We currently register a platform device for e820 type-12 memory and
register an nvdimm bus beneath it. Registering the platform device
triggers the device-core machinery to probe for a driver, but that
search currently comes up empty. Building the nvdimm-bus registration
into the e820_pmem platform device registration in this way forces
libnvdimm to be built-in. Instead, convert the built-in portion of
CONFIG_X86_PMEM_LEGACY to simply register a platform device and move the
rest of the logic to the driver for e820_pmem, for the following
reasons:
1/ Letting e820_pmem support be a module allows building and testing
libnvdimm.ko changes without rebooting
2/ All the normal policy around modules can be applied to e820_pmem
(unbind to disable and/or blacklisting the module from loading by
default)
3/ Moving the driver to a generic location and converting it to scan
"iomem_resource" rather than "e820.map" means any other architecture can
take advantage of this simple nvdimm resource discovery mechanism by
registering a resource named "Persistent Memory (legacy)"
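To illustrate point 3, another architecture would only need to publish a
suitably named resource and register the platform device; a minimal sketch
(the addresses and init function below are made up for illustration):

static struct resource legacy_pmem_res = {
	.name  = "Persistent Memory (legacy)",
	.start = 0x100000000ULL,	/* example: 2 GiB range starting at 4 GiB */
	.end   = 0x17fffffffULL,
	.flags = IORESOURCE_MEM,
};

static int __init register_legacy_pmem(void)
{
	insert_resource(&iomem_resource, &legacy_pmem_res);
	/* the e820_pmem driver binds to this device and scans iomem_resource */
	return IS_ERR(platform_device_register_simple("e820_pmem", -1, NULL, 0)) ?
		-ENXIO : 0;
}
device_initcall(register_legacy_pmem);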
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
No code changes since v1, just a revised changelog.
arch/x86/Kconfig | 6 ++-
arch/x86/include/uapi/asm/e820.h | 2 -
arch/x86/kernel/Makefile | 2 -
arch/x86/kernel/pmem.c | 79 ++++-------------------------------
drivers/nvdimm/Makefile | 3 +
drivers/nvdimm/e820.c | 86 ++++++++++++++++++++++++++++++++++++++
tools/testing/nvdimm/Kbuild | 4 ++
7 files changed, 108 insertions(+), 74 deletions(-)
create mode 100644 drivers/nvdimm/e820.c
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b3a1a5d77d92..76c61154ed50 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1426,10 +1426,14 @@ config ILLEGAL_POINTER_VALUE
source "mm/Kconfig"
+config X86_PMEM_LEGACY_DEVICE
+ bool
+
config X86_PMEM_LEGACY
- bool "Support non-standard NVDIMMs and ADR protected memory"
+ tristate "Support non-standard NVDIMMs and ADR protected memory"
depends on PHYS_ADDR_T_64BIT
depends on BLK_DEV
+ select X86_PMEM_LEGACY_DEVICE
select LIBNVDIMM
help
Treat memory marked using the non-standard e820 type of 12 as used
diff --git a/arch/x86/include/uapi/asm/e820.h b/arch/x86/include/uapi/asm/e820.h
index 0f457e6eab18..9dafe59cf6e2 100644
--- a/arch/x86/include/uapi/asm/e820.h
+++ b/arch/x86/include/uapi/asm/e820.h
@@ -37,7 +37,7 @@
/*
* This is a non-standardized way to represent ADR or NVDIMM regions that
* persist over a reboot. The kernel will ignore their special capabilities
- * unless the CONFIG_X86_PMEM_LEGACY=y option is set.
+ * unless the CONFIG_X86_PMEM_LEGACY option is set.
*
* ( Note that older platforms also used 6 for the same type of memory,
* but newer versions switched to 12 as 6 was assigned differently. Some
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 0f15af41bd80..ac2bb7e28ba2 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -92,7 +92,7 @@ obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch_$(BITS).o
obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
obj-$(CONFIG_PARAVIRT_CLOCK) += pvclock.o
-obj-$(CONFIG_X86_PMEM_LEGACY) += pmem.o
+obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
obj-$(CONFIG_PCSPKR_PLATFORM) += pcspeaker.o
diff --git a/arch/x86/kernel/pmem.c b/arch/x86/kernel/pmem.c
index 64f90f53bb85..4f00b63d7ff3 100644
--- a/arch/x86/kernel/pmem.c
+++ b/arch/x86/kernel/pmem.c
@@ -3,80 +3,17 @@
* Copyright (c) 2015, Intel Corporation.
*/
#include <linux/platform_device.h>
-#include <linux/libnvdimm.h>
#include <linux/module.h>
-#include <asm/e820.h>
-
-static void e820_pmem_release(struct device *dev)
-{
- struct nvdimm_bus *nvdimm_bus = dev->platform_data;
-
- if (nvdimm_bus)
- nvdimm_bus_unregister(nvdimm_bus);
-}
-
-static struct platform_device e820_pmem = {
- .name = "e820_pmem",
- .id = -1,
- .dev = {
- .release = e820_pmem_release,
- },
-};
-
-static const struct attribute_group *e820_pmem_attribute_groups[] = {
- &nvdimm_bus_attribute_group,
- NULL,
-};
-
-static const struct attribute_group *e820_pmem_region_attribute_groups[] = {
- &nd_region_attribute_group,
- &nd_device_attribute_group,
- NULL,
-};
static __init int register_e820_pmem(void)
{
- static struct nvdimm_bus_descriptor nd_desc;
- struct device *dev = &e820_pmem.dev;
- struct nvdimm_bus *nvdimm_bus;
- int rc, i;
-
- rc = platform_device_register(&e820_pmem);
- if (rc)
- return rc;
-
- nd_desc.attr_groups = e820_pmem_attribute_groups;
- nd_desc.provider_name = "e820";
- nvdimm_bus = nvdimm_bus_register(dev, &nd_desc);
- if (!nvdimm_bus)
- goto err;
- dev->platform_data = nvdimm_bus;
-
- for (i = 0; i < e820.nr_map; i++) {
- struct e820entry *ei = &e820.map[i];
- struct resource res = {
- .flags = IORESOURCE_MEM,
- .start = ei->addr,
- .end = ei->addr + ei->size - 1,
- };
- struct nd_region_desc ndr_desc;
-
- if (ei->type != E820_PRAM)
- continue;
-
- memset(&ndr_desc, 0, sizeof(ndr_desc));
- ndr_desc.res = &res;
- ndr_desc.attr_groups = e820_pmem_region_attribute_groups;
- ndr_desc.numa_node = NUMA_NO_NODE;
- if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
- goto err;
- }
-
- return 0;
-
- err:
- dev_err(dev, "failed to register legacy persistent memory ranges\n");
- platform_device_unregister(&e820_pmem);
- return -ENXIO;
+ struct platform_device *pdev;
+
+ /*
+ * See drivers/nvdimm/e820.c for the implementation, this is
+ * simply here to trigger the module to load on demand.
+ */
+ pdev = platform_device_alloc("e820_pmem", -1);
+ return platform_device_add(pdev);
}
device_initcall(register_e820_pmem);
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 594bb97c867a..9bf15db52dee 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
+obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
nd_pmem-y := pmem.o
@@ -9,6 +10,8 @@ nd_btt-y := btt.o
nd_blk-y := blk.o
+nd_e820-y := e820.o
+
libnvdimm-y := core.o
libnvdimm-y += bus.o
libnvdimm-y += dimm_devs.o
diff --git a/drivers/nvdimm/e820.c b/drivers/nvdimm/e820.c
new file mode 100644
index 000000000000..1b5743ad92db
--- /dev/null
+++ b/drivers/nvdimm/e820.c
@@ -0,0 +1,86 @@
+/*
+ * Copyright (c) 2015, Christoph Hellwig.
+ * Copyright (c) 2015, Intel Corporation.
+ */
+#include <linux/platform_device.h>
+#include <linux/libnvdimm.h>
+#include <linux/module.h>
+
+static const struct attribute_group *e820_pmem_attribute_groups[] = {
+ &nvdimm_bus_attribute_group,
+ NULL,
+};
+
+static const struct attribute_group *e820_pmem_region_attribute_groups[] = {
+ &nd_region_attribute_group,
+ &nd_device_attribute_group,
+ NULL,
+};
+
+static int e820_pmem_remove(struct platform_device *pdev)
+{
+ struct nvdimm_bus *nvdimm_bus = platform_get_drvdata(pdev);
+
+ nvdimm_bus_unregister(nvdimm_bus);
+ return 0;
+}
+
+static int e820_pmem_probe(struct platform_device *pdev)
+{
+ static struct nvdimm_bus_descriptor nd_desc;
+ struct device *dev = &pdev->dev;
+ struct nvdimm_bus *nvdimm_bus;
+ struct resource *p;
+
+ nd_desc.attr_groups = e820_pmem_attribute_groups;
+ nd_desc.provider_name = "e820";
+ nvdimm_bus = nvdimm_bus_register(dev, &nd_desc);
+ if (!nvdimm_bus)
+ goto err;
+ platform_set_drvdata(pdev, nvdimm_bus);
+
+ for (p = iomem_resource.child; p ; p = p->sibling) {
+ struct nd_region_desc ndr_desc;
+
+ if (strncmp(p->name, "Persistent Memory (legacy)", 26) != 0)
+ continue;
+
+ memset(&ndr_desc, 0, sizeof(ndr_desc));
+ ndr_desc.res = p;
+ ndr_desc.attr_groups = e820_pmem_region_attribute_groups;
+ ndr_desc.numa_node = NUMA_NO_NODE;
+ if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
+ goto err;
+ }
+
+ return 0;
+
+ err:
+ nvdimm_bus_unregister(nvdimm_bus);
+ dev_err(dev, "failed to register legacy persistent memory ranges\n");
+ return -ENXIO;
+}
+
+static struct platform_driver e820_pmem_driver = {
+ .probe = e820_pmem_probe,
+ .remove = e820_pmem_remove,
+ .driver = {
+ .name = "e820_pmem",
+ },
+};
+
+static __init int e820_pmem_init(void)
+{
+ return platform_driver_register(&e820_pmem_driver);
+}
+
+static __exit void e820_pmem_exit(void)
+{
+ platform_driver_unregister(&e820_pmem_driver);
+}
+
+MODULE_ALIAS("platform:e820_pmem*");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
+module_init(e820_pmem_init);
+module_exit(e820_pmem_exit);
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index f56914c7929b..d7c136a96346 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -15,6 +15,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
+obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_ACPI_NFIT) += nfit.o
nfit-y := $(ACPI_SRC)/nfit.o
@@ -29,6 +30,9 @@ nd_btt-y += config_check.o
nd_blk-y := $(NVDIMM_SRC)/blk.o
nd_blk-y += config_check.o
+nd_e820-y := $(NVDIMM_SRC)/e820.o
+nd_e820-y += config_check.o
+
libnvdimm-y := $(NVDIMM_SRC)/core.o
libnvdimm-y += $(NVDIMM_SRC)/bus.o
libnvdimm-y += $(NVDIMM_SRC)/dimm_devs.o
[PATCH v4 0/7] dax: I/O path enhancements
by Ross Zwisler
The goal of this series is to enhance the DAX I/O path so that all operations
that store data (I/O writes, zeroing blocks, punching holes, etc.) properly
synchronize the stores to media using the PMEM API. This ensures that the data
DAX is writing is durable on media before the operation completes.
Patches 1-4 are a few random cleanups.
Changes from v3 (all in patch 5):
- moved <linux/uio.h> include from x86 pmem.h to linux/pmem.h (Christoph)
- made some local void* variables where appropriate to cut down on __force
casts from __pmem (Christoph)
- made a __iter_needs_pmem_wb() helper and added a TODO to move to
non-temporal stores (Christoph)
Ross Zwisler (7):
brd: make rd_size static
pmem, x86: move x86 PMEM API to new pmem.h header
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: clean up conditional pmem includes
pmem: add copy_from_iter_pmem() and clear_pmem()
dax: update I/O path to do proper PMEM flushing
pmem, dax: have direct_access use __pmem annotation
Documentation/filesystems/Locking | 3 +-
MAINTAINERS | 1 +
arch/powerpc/sysdev/axonram.c | 7 +-
arch/x86/include/asm/cacheflush.h | 71 -----------------
arch/x86/include/asm/pmem.h | 158 ++++++++++++++++++++++++++++++++++++++
drivers/block/brd.c | 6 +-
drivers/nvdimm/pmem.c | 4 +-
drivers/s390/block/dcssblk.c | 10 ++-
fs/block_dev.c | 2 +-
fs/dax.c | 68 +++++++++-------
include/linux/blkdev.h | 8 +-
include/linux/pmem.h | 79 +++++++++++++++----
12 files changed, 289 insertions(+), 128 deletions(-)
create mode 100644 arch/x86/include/asm/pmem.h
--
2.1.0