[PATCH AUTOSEL 5.10 34/51] arch/arc: add copy_user_page() to <asm/page.h> to fix build error on ARC
by Sasha Levin
From: Randy Dunlap <rdunlap(a)infradead.org>
[ Upstream commit 8a48c0a3360bf2bf4f40c980d0ec216e770e58ee ]
fs/dax.c uses copy_user_page() but ARC does not provide that interface,
resulting in a build error.
Provide copy_user_page() in <asm/page.h>.
../fs/dax.c: In function 'copy_cow_page_dax':
../fs/dax.c:702:2: error: implicit declaration of function 'copy_user_page'; did you mean 'copy_to_user_page'? [-Werror=implicit-function-declaration]
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Vineet Gupta <vgupta(a)synopsys.com>
Cc: linux-snps-arc(a)lists.infradead.org
Cc: Dan Williams <dan.j.williams(a)intel.com>
#Acked-by: Vineet Gupta <vgupta(a)synopsys.com> # v1
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: linux-nvdimm(a)lists.01.org
#Reviewed-by: Ira Weiny <ira.weiny(a)intel.com> # v2
Signed-off-by: Vineet Gupta <vgupta(a)synopsys.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arc/include/asm/page.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arc/include/asm/page.h b/arch/arc/include/asm/page.h
index b0dfed0f12be0..d9c264dc25fcb 100644
--- a/arch/arc/include/asm/page.h
+++ b/arch/arc/include/asm/page.h
@@ -10,6 +10,7 @@
#ifndef __ASSEMBLY__
#define clear_page(paddr) memset((paddr), 0, PAGE_SIZE)
+#define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
#define copy_page(to, from) memcpy((to), (from), PAGE_SIZE)
struct vm_area_struct;
--
2.27.0
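
The one-line fix defines copy_user_page() in terms of copy_page() one line before copy_page() itself is defined. That ordering is harmless because the preprocessor expands macros at the point of use, not at the point of definition. A minimal userspace sketch of the same ordering follows; the PAGE_SIZE value and the demo_copy() helper are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096 /* assumed value, for this sketch only */

/* Same order as in the patch: copy_user_page() is defined first. */
#define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
#define copy_page(to, from)	memcpy((to), (from), PAGE_SIZE)

/* Hypothetical helper: copy one page and report whether the copy matches. */
static int demo_copy(void)
{
	static unsigned char src[PAGE_SIZE], dst[PAGE_SIZE];

	memset(src, 0xab, sizeof(src));
	/* vaddr and pg are accepted but ignored by this definition */
	copy_user_page(dst, src, 0UL, NULL);
	return memcmp(dst, src, PAGE_SIZE) == 0;
}
```

By the time demo_copy() uses copy_user_page(), both macros are known, so the expansion resolves to a plain memcpy() of one page.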
1 year, 4 months
[ndctl RFC PATCH 0/5] Initial CXL support
by Vishal Verma
This is an RFC patchset to add a new utility and library to support
CXL devices. This comprehends the kernel's sysfs layout for CXL
devices, and implements a command submission harness for CXL mailbox
commands via ioctl()s defined by the cxl_mem driver.
These patches include:
- libcxl representation of cxl_mem devices
- A command submission harness through libcxl
- A 'cxl-list' command which displays information about a device
Missing pieces and next steps:
- A test/libcxl.c to exercise all library interfaces
- Testing 'vendor specific' commands exported by the QEMU
implementation[1]
- API documentation
The latest kernel patches can be found at [2].
An ndctl branch with these patches is also available at [3]
[1]: https://lore.kernel.org/qemu-devel/20210105165323.783725-1-ben.widawsky@i...
[2]: https://gitlab.com/bwidawsk/linux/-/commits/cxl-2.0v3
[3]: https://github.com/pmem/ndctl/tree/cxl-2.0v1
Vishal Verma (5):
cxl: add a cxl utility and libcxl library
cxl: add a local copy of the cxl_mem UAPI header
libcxl: add support for command query and submission
libcxl: add accessors for retrieving 'Identify' information
cxl/list: augment cxl-list with more data from the identify command
Documentation/cxl/cxl-list.txt | 65 +++
Documentation/cxl/cxl.txt | 34 ++
Documentation/cxl/human-option.txt | 8 +
Documentation/cxl/verbose-option.txt | 5 +
configure.ac | 3 +
Makefile.am | 4 +-
Makefile.am.in | 5 +
cxl/lib/private.h | 87 ++++
cxl/lib/libcxl.c | 714 +++++++++++++++++++++++++++
cxl/builtin.h | 8 +
cxl/cxl_mem.h | 176 +++++++
cxl/libcxl.h | 66 +++
util/filter.h | 2 +
util/json.h | 4 +
util/main.h | 3 +
cxl/cxl.c | 95 ++++
cxl/list.c | 138 ++++++
util/filter.c | 20 +
util/json.c | 46 ++
Documentation/cxl/Makefile.am | 58 +++
cxl/Makefile.am | 21 +
cxl/lib/Makefile.am | 32 ++
cxl/lib/libcxl.pc.in | 11 +
cxl/lib/libcxl.sym | 41 ++
24 files changed, 1644 insertions(+), 2 deletions(-)
create mode 100644 Documentation/cxl/cxl-list.txt
create mode 100644 Documentation/cxl/cxl.txt
create mode 100644 Documentation/cxl/human-option.txt
create mode 100644 Documentation/cxl/verbose-option.txt
create mode 100644 cxl/lib/private.h
create mode 100644 cxl/lib/libcxl.c
create mode 100644 cxl/builtin.h
create mode 100644 cxl/cxl_mem.h
create mode 100644 cxl/libcxl.h
create mode 100644 cxl/cxl.c
create mode 100644 cxl/list.c
create mode 100644 Documentation/cxl/Makefile.am
create mode 100644 cxl/Makefile.am
create mode 100644 cxl/lib/Makefile.am
create mode 100644 cxl/lib/libcxl.pc.in
create mode 100644 cxl/lib/libcxl.sym
--
2.29.2
1 year, 4 months
[RFC v2] nvfs: a filesystem for persistent memory
by Mikulas Patocka
Hi
I announce a new version of NVFS - a filesystem for persistent memory.
http://people.redhat.com/~mpatocka/nvfs/
git://leontynka.twibright.com/nvfs.git
Changes since the last release:
* I added a microjournal to the filesystem; it can hold up to 16 entries.
Each CPU has its own journal, so that there is no lock contention. The
journal is used to provide atomicity of rename() and extended attribute
replace.
(note that file creation or deletion doesn't use the journal, because
these operations can be deterministically cleaned up by fsck)
* I created a framework that can be used to verify the filesystem driver.
It logs all writes and memory barriers to a file, the entries in the
file are randomly reordered (to simulate reordering in the CPU
write-combining buffers), the sequence is cut at a random point (to
simulate a system crash) and the result is replayed on a filesystem
image.
With this framework, we can for example check that if a crash happens
during rename(), either old file or new file will be present in a
directory.
This framework helped to find a few bugs in sequencing the writes.
* If we map an executable image, we turn off the DAX flag on the inode
(because executables run 4% slower from persistent memory). There is
also a switch that can turn DAX always off or always on.
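
The verification framework described above (log writes and barriers, reorder, cut, replay) could be sketched roughly as follows. This is not the NVFS code: the log_op layout and all function names are hypothetical, and the sketch assumes that reordering only happens between two barriers, which the announcement does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum op_kind { OP_WRITE, OP_BARRIER };

struct log_op {
	enum op_kind kind;
	size_t offset;		/* byte offset in the image (OP_WRITE only) */
	unsigned char val;	/* byte to store (OP_WRITE only) */
};

/* Fisher-Yates shuffle of ops[0..n) */
static void shuffle(struct log_op *ops, size_t n)
{
	for (size_t i = n; i > 1; i--) {
		size_t j = (size_t)rand() % i;
		struct log_op tmp = ops[i - 1];

		ops[i - 1] = ops[j];
		ops[j] = tmp;
	}
}

/* Shuffle the writes inside each barrier-delimited run; barriers stay put. */
static void reorder_epochs(struct log_op *ops, size_t n)
{
	size_t start = 0;

	for (size_t i = 0; i <= n; i++) {
		if (i == n || ops[i].kind == OP_BARRIER) {
			shuffle(ops + start, i - start);
			start = i + 1;
		}
	}
}

/* Replay the first `cut` log entries onto the image. */
static void replay_prefix(const struct log_op *ops, size_t cut,
			  unsigned char *image, size_t image_len)
{
	for (size_t i = 0; i < cut; i++) {
		if (ops[i].kind != OP_WRITE)
			continue;
		assert(ops[i].offset < image_len);
		image[ops[i].offset] = ops[i].val;
	}
}

/* One simulated crash: reorder, cut at a random point, replay. */
static void simulate_crash(struct log_op *ops, size_t n, unsigned int seed,
			   unsigned char *image, size_t image_len)
{
	srand(seed);
	reorder_epochs(ops, n);
	replay_prefix(ops, (size_t)rand() % (n + 1), image, image_len);
}
```

A checker would run simulate_crash() with many seeds and verify an invariant on each resulting image, for example that a write issued after a barrier is never visible unless every write before that barrier is visible too.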
I'd like to ask about this piece of code in __kernel_read:
if (unlikely(!file->f_op->read_iter || file->f_op->read))
return warn_unsupported...
and __kernel_write:
if (unlikely(!file->f_op->write_iter || file->f_op->write))
return warn_unsupported...
- It exits with an error if both read_iter and read or write_iter and
write are present.
I found out that on NVFS, reading a file with the read method has 10%
better performance than the read_iter method. The benchmark just reads the
same 4k page over and over again - and the cost of creating and parsing
the kiocb and iov_iter structures is just that high.
So, I'd like to have both read and read_iter methods. Could the above
conditions be changed, so that they don't fail with an error if the "read"
or "write" method is present?
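
To make the question concrete, here is a small userspace model of the quoted condition next to a relaxed variant. The struct mock_fops type and both function names are hypothetical stand-ins; the relaxed check is only one possible reading of the request, not a proposed kernel patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the two relevant file_operations members. */
struct mock_fops {
	long (*read)(void *buf, size_t len);
	long (*read_iter)(void *iter);
};

/* Mirrors the quoted test: reject if read_iter is missing OR read is set. */
static bool current_check_rejects(const struct mock_fops *f)
{
	return !f->read_iter || f->read;
}

/* Hypothetical relaxed test: only require that read_iter exists. */
static bool relaxed_check_rejects(const struct mock_fops *f)
{
	return !f->read_iter;
}

/* Dummy methods so that the function pointers can be non-NULL. */
static long dummy_read(void *buf, size_t len)
{
	(void)buf;
	return (long)len;
}

static long dummy_read_iter(void *iter)
{
	(void)iter;
	return 0;
}
```

With both methods set, current_check_rejects() returns true while relaxed_check_rejects() returns false, which is the behaviour change being asked about.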
Mikulas
1 year, 4 months
[RFC PATCH v3 0/9] fsdax: introduce fs query to support reflink
by Shiyang Ruan
This patchset is an attempt to solve the problem of tracking shared pages
for fsdax.
Change from v2:
- Adjust the order of patches
- Divide the infrastructure and the drivers that use it
- Rebased to v5.10
Change from v1:
- Introduce ->block_lost() for block device
- Support mapped device
- Add 'not available' warning for realtime device in XFS
- Rebased to v5.10-rc1
This patchset moves owner tracking from dax_associate_entry() to the pmem
device driver by introducing a ->memory_failure() interface in struct
pagemap. This interface is called by memory_failure() in mm and
implemented by the pmem device. The pmem device then calls its
->corrupted_range() to find the filesystem in which the corrupted data is
located, and calls the filesystem handler to track the files or metadata
associated with this page. Finally, we are able to try to fix the
corrupted data in the filesystem and do other necessary processing, such
as killing the processes that are using the affected files.
The call trace is like this:
memory_failure()
pgmap->ops->memory_failure() => pmem_pgmap_memory_failure()
gendisk->fops->corrupted_range() => - pmem_corrupted_range()
- md_blk_corrupted_range()
sb->s_ops->corrupted_range() => xfs_fs_corrupted_range()
xfs_rmap_query_range()
xfs_currupt_helper()
* corrupted on metadata
try to recover data, call xfs_force_shutdown()
* corrupted on file data
try to recover data, call mf_dax_mapping_kill_procs()
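
The layering in the call trace above can be modelled in userspace as three chained callbacks. Everything in this sketch is hypothetical: the mock_* types, their fields, the signatures and the page size are illustrative only and do not reproduce the interfaces added by the patches.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PAGE_SIZE 4096ULL /* assumed page size for the sketch */

struct mock_sb {
	/* filesystem hook: returns 0 when the range was handled */
	int (*corrupted_range)(struct mock_sb *sb, unsigned long long off,
			       unsigned long long len);
	unsigned long long last_off;	/* recorded by the mock filesystem */
	int hits;
};

struct mock_disk {
	/* block-device hook: forwards to the filesystem on top of it */
	int (*corrupted_range)(struct mock_disk *disk, unsigned long long off,
			       unsigned long long len);
	struct mock_sb *sb;
};

struct mock_pgmap {
	/* pagemap hook: translates a pfn into a device byte range */
	int (*memory_failure)(struct mock_pgmap *pgmap, unsigned long pfn);
	unsigned long first_pfn;
	struct mock_disk *disk;
};

static int mock_fs_corrupted_range(struct mock_sb *sb, unsigned long long off,
				   unsigned long long len)
{
	(void)len;
	sb->last_off = off;	/* a real filesystem would query its rmap here */
	sb->hits++;
	return 0;
}

static int mock_disk_corrupted_range(struct mock_disk *disk,
				     unsigned long long off,
				     unsigned long long len)
{
	if (!disk->sb || !disk->sb->corrupted_range)
		return -1;	/* no filesystem: the owner cannot be found */
	return disk->sb->corrupted_range(disk->sb, off, len);
}

static int mock_pgmap_memory_failure(struct mock_pgmap *pgmap,
				     unsigned long pfn)
{
	unsigned long long off;

	assert(pfn >= pgmap->first_pfn);
	off = (pfn - pgmap->first_pfn) * MOCK_PAGE_SIZE;
	return pgmap->disk->corrupted_range(pgmap->disk, off, MOCK_PAGE_SIZE);
}

/* Entry point, standing in for memory_failure() in mm. */
static int mock_memory_failure(struct mock_pgmap *pgmap, unsigned long pfn)
{
	if (!pgmap->memory_failure)
		return -1;
	return pgmap->memory_failure(pgmap, pfn);
}
```

Each layer only knows the layer directly below it, which is what lets a mapped device sit between the pmem driver and the filesystem.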
The fsdax & reflink support for XFS is not contained in this patchset.
(Rebased on v5.10)
--
Shiyang Ruan (9):
pagemap: Introduce ->memory_failure()
blk: Introduce ->corrupted_range() for block device
fs: Introduce ->corrupted_range() for superblock
mm, fsdax: Refactor memory-failure handler for dax mapping
mm, pmem: Implement ->memory_failure() in pmem driver
pmem: Implement ->corrupted_range() for pmem driver
dm: Introduce ->rmap() to find bdev offset
md: Implement ->corrupted_range()
xfs: Implement ->corrupted_range() for XFS
block/genhd.c | 12 +++
drivers/md/dm-linear.c | 8 ++
drivers/md/dm.c | 66 +++++++++++++++
drivers/nvdimm/pmem.c | 51 ++++++++++++
fs/block_dev.c | 21 +++++
fs/dax.c | 24 +++---
fs/xfs/xfs_fsops.c | 10 +++
fs/xfs/xfs_mount.h | 2 +
fs/xfs/xfs_super.c | 93 +++++++++++++++++++++
include/linux/blkdev.h | 2 +
include/linux/dax.h | 5 +-
include/linux/device-mapper.h | 2 +
include/linux/fs.h | 2 +
include/linux/genhd.h | 8 ++
include/linux/memremap.h | 8 ++
include/linux/mm.h | 9 ++
mm/memory-failure.c | 150 +++++++++++++++++++---------------
17 files changed, 391 insertions(+), 82 deletions(-)
--
2.29.2
1 year, 4 months
[PATCH ndctl rebased 1/3] ndctl/namespace: Skip seed namespaces when processing all namespaces.
by Michal Suchanek
The seed namespaces are exposed by the kernel but most operations are
not valid on seed namespaces.
When processing all namespaces the user gets confusing errors from ndctl
trying to process seed namespaces. The kernel does not provide any way
to tell that a namespace is a seed namespace, but skipping namespaces
with zero size and a zero UUID is a good heuristic.
The user can still specify the namespace by name directly in case
processing it is desirable.
Fixes: #41
Link: https://patchwork.kernel.org/patch/11473645/
Reviewed-by: Santosh S <santosh(a)fossix.org>
Tested-by: Harish Sriram <harish(a)linux.ibm.com>
Signed-off-by: Michal Suchanek <msuchanek(a)suse.de>
---
ndctl/namespace.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/ndctl/namespace.c b/ndctl/namespace.c
index 0c8df9fa8b47..b9ffd21fe7bf 100644
--- a/ndctl/namespace.c
+++ b/ndctl/namespace.c
@@ -2207,9 +2207,19 @@ static int do_xaction_namespace(const char *namespace,
ndctl_namespace_foreach_safe(region, ndns, _n) {
ndns_name = ndctl_namespace_get_devname(ndns);
- if (strcmp(namespace, "all") != 0
- && strcmp(namespace, ndns_name) != 0)
- continue;
+ if (strcmp(namespace, "all") == 0) {
+ static const uuid_t zero_uuid;
+ uuid_t uuid;
+
+ ndctl_namespace_get_uuid(ndns, uuid);
+ if (!ndctl_namespace_get_size(ndns) &&
+ !memcmp(uuid, zero_uuid, sizeof(uuid_t)))
+ continue;
+ } else {
+ if (strcmp(namespace, ndns_name) != 0)
+ continue;
+ }
+
switch (action) {
case ACTION_DISABLE:
rc = ndctl_namespace_disable_safe(ndns);
--
2.26.2
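
The skip logic of this patch can be tried outside ndctl with a small sketch. The helper names looks_like_seed() and should_skip() are hypothetical, and a plain 16-byte array stands in for uuid_t so that the example does not depend on libuuid.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define UUID_LEN 16 /* a uuid_t is 16 bytes */

/* Heuristic from the patch: zero size and an all-zero UUID. */
static bool looks_like_seed(unsigned long long size,
			    const unsigned char uuid[UUID_LEN])
{
	static const unsigned char zero_uuid[UUID_LEN];

	assert(uuid != NULL);
	return size == 0 && memcmp(uuid, zero_uuid, UUID_LEN) == 0;
}

/* Mirrors the patched loop condition: true when the entry is skipped. */
static bool should_skip(const char *requested, const char *devname,
			unsigned long long size,
			const unsigned char uuid[UUID_LEN])
{
	if (strcmp(requested, "all") == 0)
		return looks_like_seed(size, uuid);
	return strcmp(requested, devname) != 0;
}
```

Note that a seed namespace requested by its own name is still processed, matching the statement that the user can specify it directly.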
1 year, 4 months
[PATCH v2] libnvdimm/pmem: remove unused header.
by Jianpeng Ma
Commit a8b456d01cd6 ("bdi: remove BDI_CAP_SYNCHRONOUS_IO") forgot to
remove the related header file.
Fixes: a8b456d01cd6 ("bdi: remove BDI_CAP_SYNCHRONOUS_IO")
Signed-off-by: Jianpeng Ma <jianpeng.ma(a)intel.com>
---
drivers/nvdimm/pmem.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 875076b0ea6c..f33bdae626ba 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -23,7 +23,6 @@
#include <linux/uio.h>
#include <linux/dax.h>
#include <linux/nd.h>
-#include <linux/backing-dev.h>
#include <linux/mm.h>
#include <asm/cacheflush.h>
#include "pmem.h"
--
2.29.2
1 year, 4 months
[PATCH] ACPI: NFIT: Fix flexible_array.cocci warnings
by Dan Williams
Julia and 0day report:
Zero-length and one-element arrays are deprecated, see
Documentation/process/deprecated.rst
Flexible-array members should be used instead.
However, a straight conversion to flexible arrays yields:
drivers/acpi/nfit/core.c:2276:4: error: flexible array member in a struct with no named members
drivers/acpi/nfit/core.c:2287:4: error: flexible array member in a struct with no named members
Instead, just use plain arrays that are not embedded as flexible arrays.
Cc: Denis Efremov <efremov(a)linux.com>
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Julia Lawall <julia.lawall(a)inria.fr>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
drivers/acpi/nfit/core.c | 75 +++++++++++++++++-----------------------------
1 file changed, 28 insertions(+), 47 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index b11b08a60684..8c5dde628405 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2269,40 +2269,24 @@ static const struct attribute_group *acpi_nfit_region_attribute_groups[] = {
/* enough info to uniquely specify an interleave set */
struct nfit_set_info {
- struct nfit_set_info_map {
- u64 region_offset;
- u32 serial_number;
- u32 pad;
- } mapping[0];
+ u64 region_offset;
+ u32 serial_number;
+ u32 pad;
};
struct nfit_set_info2 {
- struct nfit_set_info_map2 {
- u64 region_offset;
- u32 serial_number;
- u16 vendor_id;
- u16 manufacturing_date;
- u8 manufacturing_location;
- u8 reserved[31];
- } mapping[0];
+ u64 region_offset;
+ u32 serial_number;
+ u16 vendor_id;
+ u16 manufacturing_date;
+ u8 manufacturing_location;
+ u8 reserved[31];
};
-static size_t sizeof_nfit_set_info(int num_mappings)
-{
- return sizeof(struct nfit_set_info)
- + num_mappings * sizeof(struct nfit_set_info_map);
-}
-
-static size_t sizeof_nfit_set_info2(int num_mappings)
-{
- return sizeof(struct nfit_set_info2)
- + num_mappings * sizeof(struct nfit_set_info_map2);
-}
-
static int cmp_map_compat(const void *m0, const void *m1)
{
- const struct nfit_set_info_map *map0 = m0;
- const struct nfit_set_info_map *map1 = m1;
+ const struct nfit_set_info *map0 = m0;
+ const struct nfit_set_info *map1 = m1;
return memcmp(&map0->region_offset, &map1->region_offset,
sizeof(u64));
@@ -2310,8 +2294,8 @@ static int cmp_map_compat(const void *m0, const void *m1)
static int cmp_map(const void *m0, const void *m1)
{
- const struct nfit_set_info_map *map0 = m0;
- const struct nfit_set_info_map *map1 = m1;
+ const struct nfit_set_info *map0 = m0;
+ const struct nfit_set_info *map1 = m1;
if (map0->region_offset < map1->region_offset)
return -1;
@@ -2322,8 +2306,8 @@ static int cmp_map(const void *m0, const void *m1)
static int cmp_map2(const void *m0, const void *m1)
{
- const struct nfit_set_info_map2 *map0 = m0;
- const struct nfit_set_info_map2 *map1 = m1;
+ const struct nfit_set_info2 *map0 = m0;
+ const struct nfit_set_info2 *map1 = m1;
if (map0->region_offset < map1->region_offset)
return -1;
@@ -2361,22 +2345,22 @@ static int acpi_nfit_init_interleave_set(struct acpi_nfit_desc *acpi_desc,
return -ENOMEM;
import_guid(&nd_set->type_guid, spa->range_guid);
- info = devm_kzalloc(dev, sizeof_nfit_set_info(nr), GFP_KERNEL);
+ info = devm_kcalloc(dev, nr, sizeof(*info), GFP_KERNEL);
if (!info)
return -ENOMEM;
- info2 = devm_kzalloc(dev, sizeof_nfit_set_info2(nr), GFP_KERNEL);
+ info2 = devm_kcalloc(dev, nr, sizeof(*info2), GFP_KERNEL);
if (!info2)
return -ENOMEM;
for (i = 0; i < nr; i++) {
struct nd_mapping_desc *mapping = &ndr_desc->mapping[i];
- struct nfit_set_info_map *map = &info->mapping[i];
- struct nfit_set_info_map2 *map2 = &info2->mapping[i];
struct nvdimm *nvdimm = mapping->nvdimm;
struct nfit_mem *nfit_mem = nvdimm_provider_data(nvdimm);
- struct acpi_nfit_memory_map *memdev = memdev_from_spa(acpi_desc,
- spa->range_index, i);
+ struct nfit_set_info *map = &info[i];
+ struct nfit_set_info2 *map2 = &info2[i];
+ struct acpi_nfit_memory_map *memdev =
+ memdev_from_spa(acpi_desc, spa->range_index, i);
struct acpi_nfit_control_region *dcr = nfit_mem->dcr;
if (!memdev || !nfit_mem->dcr) {
@@ -2395,23 +2379,20 @@ static int acpi_nfit_init_interleave_set(struct acpi_nfit_desc *acpi_desc,
}
/* v1.1 namespaces */
- sort(&info->mapping[0], nr, sizeof(struct nfit_set_info_map),
- cmp_map, NULL);
- nd_set->cookie1 = nd_fletcher64(info, sizeof_nfit_set_info(nr), 0);
+ sort(info, nr, sizeof(*info), cmp_map, NULL);
+ nd_set->cookie1 = nd_fletcher64(info, sizeof(*info) * nr, 0);
/* v1.2 namespaces */
- sort(&info2->mapping[0], nr, sizeof(struct nfit_set_info_map2),
- cmp_map2, NULL);
- nd_set->cookie2 = nd_fletcher64(info2, sizeof_nfit_set_info2(nr), 0);
+ sort(info2, nr, sizeof(*info2), cmp_map2, NULL);
+ nd_set->cookie2 = nd_fletcher64(info2, sizeof(*info2) * nr, 0);
/* support v1.1 namespaces created with the wrong sort order */
- sort(&info->mapping[0], nr, sizeof(struct nfit_set_info_map),
- cmp_map_compat, NULL);
- nd_set->altcookie = nd_fletcher64(info, sizeof_nfit_set_info(nr), 0);
+ sort(info, nr, sizeof(*info), cmp_map_compat, NULL);
+ nd_set->altcookie = nd_fletcher64(info, sizeof(*info) * nr, 0);
/* record the result of the sort for the mapping position */
for (i = 0; i < nr; i++) {
- struct nfit_set_info_map2 *map2 = &info2->mapping[i];
+ struct nfit_set_info2 *map2 = &info2[i];
int j;
for (j = 0; j < nr; j++) {
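
The pattern the patch converts to, a plain array of structs that is allocated with a calloc-style call and sorted in place, looks like this in userspace. The struct set_info type and build_sorted() are hypothetical stand-ins, with calloc() and qsort() used in place of devm_kcalloc() and the kernel's sort().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-in for struct nfit_set_info after the patch. */
struct set_info {
	uint64_t region_offset;
	uint32_t serial_number;
	uint32_t pad;
};

/* Same three-way comparison as cmp_map() in the patch. */
static int cmp_map(const void *m0, const void *m1)
{
	const struct set_info *map0 = m0;
	const struct set_info *map1 = m1;

	if (map0->region_offset < map1->region_offset)
		return -1;
	if (map0->region_offset > map1->region_offset)
		return 1;
	return 0;
}

/* Allocate nr zeroed entries, fill them in reverse order, then sort. */
static struct set_info *build_sorted(size_t nr)
{
	struct set_info *info = calloc(nr, sizeof(*info));

	if (!info)
		return NULL;
	for (size_t i = 0; i < nr; i++) {
		info[i].region_offset = (uint64_t)(nr - i) * 0x1000;
		info[i].serial_number = (uint32_t)i;
	}
	qsort(info, nr, sizeof(*info), cmp_map);
	return info;
}
```

Because the element is now the struct itself, sizeof(*info) * nr gives the byte count directly and no size helper is needed. The caller owns the returned array and releases it with free().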
1 year, 4 months
[PATCH] x86/mm: Fix leak of pmd ptlock
by Dan Williams
Commit 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
introduced a new location where a pmd was released, but neglected to run
the pmd page destructor. In fact, this happened previously for a
different pmd release path and was fixed by commit:
c283610e44ec ("x86, mm: do not leak page->ptl for pmd page tables").
This issue was hidden until recently because the failure mode is silent,
but commit:
b2b29d6d0119 ("mm: account PMD tables like PTE tables")
...turns the failure mode into this signature:
BUG: Bad page state in process lt-pmem-ns pfn:15943d
page:000000007262ed7b refcount:0 mapcount:-1024 mapping:0000000000000000 index:0x0 pfn:0x15943d
flags: 0xaffff800000000()
raw: 00affff800000000 dead000000000100 0000000000000000 0000000000000000
raw: 0000000000000000 ffff913a029bcc08 00000000fffffbff 0000000000000000
page dumped because: nonzero mapcount
[..]
dump_stack+0x8b/0xb0
bad_page.cold+0x63/0x94
free_pcp_prepare+0x224/0x270
free_unref_page+0x18/0xd0
pud_free_pmd_page+0x146/0x160
ioremap_pud_range+0xe3/0x350
ioremap_page_range+0x108/0x160
__ioremap_caller.constprop.0+0x174/0x2b0
? memremap+0x7a/0x110
memremap+0x7a/0x110
devm_memremap+0x53/0xa0
pmem_attach_disk+0x4ed/0x530 [nd_pmem]
? __devm_release_region+0x52/0x80
nvdimm_bus_probe+0x85/0x210 [libnvdimm]
Given this is a repeat occurrence it seemed prudent to look for other
places where this destructor might be missing and whether a better
helper is needed. try_to_free_pmd_page() looks like a candidate, but
testing with setting up and tearing down pmd mappings via the dax unit
tests is thus far not triggering the failure. As for a better helper
pmd_free() is close, but it is a messy fit due to requiring an @mm arg.
Also, ___pmd_free_tlb() wants to call paravirt_tlb_remove_table()
instead of free_page(), so open-coded pgtable_pmd_page_dtor() seems the
best way forward for now.
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Cc: <stable(a)vger.kernel.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: x86(a)kernel.org
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Co-debugged-by: Matthew Wilcox <willy(a)infradead.org>
Tested-by: Yi Zhang <yi.zhang(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
arch/x86/mm/pgtable.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index dfd82f51ba66..f6a9e2e36642 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -829,6 +829,8 @@ int pud_free_pmd_page(pud_t *pud, unsigned long addr)
}
free_page((unsigned long)pmd_sv);
+
+ pgtable_pmd_page_dtor(virt_to_page(pmd));
free_page((unsigned long)pmd);
return 1;
1 year, 4 months
[PATCH] arm64: add pmem module for kernel update
by Zhuling
Category: feature
Bugzilla: NA
CVE: NA
Use reserved memory to create a pmem device that stores the process
information dumped before a kernel update.
To use this feature, declare the region on the kernel command line with
"pmemmem=pmem_size:pmem_phystart"
(e.g. pmemmem=100M:0x202000000000).
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
---
arch/arm64/kernel/setup.c | 5 +++
arch/arm64/mm/init.c | 90 +++++++++++++++++++++++++++++++++++++++
drivers/nvdimm/Kconfig | 11 +++++
drivers/nvdimm/Makefile | 3 ++
drivers/nvdimm/kup_pmem.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/ioport.h | 1 +
include/linux/mm.h | 4 ++
lib/Kconfig | 6 +++
8 files changed, 226 insertions(+)
create mode 100644 drivers/nvdimm/kup_pmem.c
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 133257f..0bd9429 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -237,6 +237,11 @@ static void __init request_standard_resources(void)
if (kernel_data.start >= res->start &&
kernel_data.end <= res->end)
request_resource(res, &kernel_data);
+#ifdef CONFIG_KUP_PMEM_MEMORY
+ if (pmem_res.end)
+ insert_resource(&iomem_resource, &pmem_res);
+#endif
+
#ifdef CONFIG_KEXEC_CORE
/* Userspace will find "Crash kernel" region in /proc/iomem. */
if (crashk_res.end && crashk_res.start >= res->start &&
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 0955406..9d1395e 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -62,6 +62,18 @@ EXPORT_SYMBOL(memstart_addr);
phys_addr_t arm64_dma_phys_limit __ro_after_init;
static phys_addr_t arm64_dma32_phys_limit __ro_after_init;
+#ifdef CONFIG_KUP_PMEM_MEMORY
+static unsigned long long pmem_size, pmem_phystart;
+
+struct resource pmem_res = {
+ .name = "Kpmem Dev",
+ .start = 0,
+ .end = 0,
+ .flags = IORESOURCE_MEM,
+ .desc = IORES_DESC_KPMEM_DEV
+};
+#endif
+
#ifdef CONFIG_KEXEC_CORE
/*
* reserve_crashkernel() - reserves memory for crash kernel
@@ -123,6 +135,80 @@ static void __init reserve_crashkernel(void)
}
#endif /* CONFIG_KEXEC_CORE */
+#ifdef CONFIG_KUP_PMEM_MEMORY
+/*
+ * reserve_pmem() - reserves memory for pmem
+ *
+ * This function reserves memory area given in "pmemmem=" kernel command
+ * line parameter. The memory reserved is used by pmem restore progress
+ * when kernel update.
+ */
+static int __init parse_pmem(char *par)
+{
+ char *cur = par;
+
+ if (!par)
+ return 0;
+
+ pmem_size = 0;
+ pmem_phystart = 0;
+
+ pmem_size = memparse(par, &cur);
+ if (par == cur) {
+ pr_warn("pmem: memory value expected\n");
+ return -EINVAL;
+ }
+
+ if (*cur == ':')
+ pmem_phystart = memparse(cur+1, &cur);
+ else if (*cur != ' ' && *cur != '\0') {
+ pr_warn("pmem: unrecognized char %c\n", *cur);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+early_param("pmemmem", parse_pmem);
+
+static void __init reserve_pmem(void)
+{
+ if (!pmem_size || !pmem_phystart) {
+ return;
+ }
+
+ pmem_size = PAGE_ALIGN(pmem_size);
+
+ if (!memblock_is_region_memory(pmem_phystart, pmem_size)) {
+ pr_warn("cannot reserve pmem: region is not memory!\n");
+ return;
+ }
+
+ if (memblock_is_region_reserved(pmem_phystart, pmem_size)) {
+ pr_warn("cannot reserve pmem: region overlaps reserved memory!\n");
+ return;
+ }
+
+ if (!IS_ALIGNED(pmem_phystart, SZ_2M)) {
+ pr_warn("cannot reserve pmem: base address is not 2MB aligned\n");
+ return;
+ }
+ memblock_reserve(pmem_phystart, pmem_size);
+ memblock_remove(pmem_phystart, pmem_size);
+ pr_info("pmem reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+ pmem_phystart, pmem_phystart + pmem_size, pmem_size >> 20);
+
+ pmem_res.start = pmem_phystart;
+ pmem_res.end = pmem_phystart + pmem_size - 1;
+}
+#else
+static void __init reserve_pmem(void)
+{
+}
+static void __init reserve_pmem_pages(void)
+{
+}
+#endif /*CONFIG_KUP_PMEM_MEMORY*/
+
#ifdef CONFIG_CRASH_DUMP
static int __init early_init_dt_scan_elfcorehdr(unsigned long node,
const char *uname, int depth, void *data)
@@ -390,6 +476,10 @@ void __init arm64_memblock_init(void)
reserve_elfcorehdr();
+#ifdef CONFIG_KUP_PMEM_MEMORY
+ reserve_pmem();
+#endif
+
high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
dma_contiguous_reserve(arm64_dma32_phys_limit);
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index b7d1eb3..7f5fa22 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -119,6 +119,17 @@ config NVDIMM_KEYS
depends on ENCRYPTED_KEYS
depends on (LIBNVDIMM=ENCRYPTED_KEYS) || LIBNVDIMM=m
+config KUP_PMEM
+ tristate "Persistent memory for kernel update"
+ depends on LIBNVDIMM
+ depends on KUP_PMEM_MEMORY
+ default LIBNVDIMM
+ help
+ Allows regions of persistent memory to be described in the
+ device-tree.
+
+ Select Y if unsure.
+
config NVDIMM_TEST_BUILD
tristate "Build the unit test core"
depends on m
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 29203f3..39fabc3 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_ND_BLK) += nd_blk.o
obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
+obj-$(CONFIG_KUP_PMEM) += nd_kup_pmem.o
nd_pmem-y := pmem.o
@@ -15,6 +16,8 @@ nd_blk-y := blk.o
nd_e820-y := e820.o
+nd_kup_pmem-y := kup_pmem.o
+
libnvdimm-y := core.o
libnvdimm-y += bus.o
libnvdimm-y += dimm_devs.o
diff --git a/drivers/nvdimm/kup_pmem.c b/drivers/nvdimm/kup_pmem.c
new file mode 100644
index 0000000..eadf95a
--- /dev/null
+++ b/drivers/nvdimm/kup_pmem.c
@@ -0,0 +1,106 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2020. All rights reserved.
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2. See the file COPYING for more details.
+ *
+ * kup_pmem.c - kernel update support code.
+ * create a pmem device to store the processes information that is dumped
+ * when we want to kernel update.
+ */
+
+#include <linux/platform_device.h>
+#include <linux/memory_hotplug.h>
+#include <linux/libnvdimm.h>
+#include <linux/module.h>
+#include <asm/io.h>
+
+static const struct attribute_group *kup_pmem_attribute_groups[] = {
+ &nvdimm_bus_attribute_group,
+ NULL,
+};
+
+static const struct attribute_group *kup_pmem_region_attribute_groups[] = {
+ &nd_region_attribute_group,
+ &nd_device_attribute_group,
+ NULL,
+};
+
+static int kup_pmem_remove(struct platform_device *pdev)
+{
+ struct nvdimm_bus *nvdimm_bus = platform_get_drvdata(pdev);
+
+ nvdimm_bus_unregister(nvdimm_bus);
+
+ return 0;
+}
+
+static int kup_register_one(struct resource *res, void *data)
+{
+ struct nd_region_desc ndr_desc;
+ struct nvdimm_bus *nvdimm_bus = data;
+
+ memset(&ndr_desc, 0, sizeof(ndr_desc));
+ ndr_desc.res = res;
+ ndr_desc.attr_groups = kup_pmem_region_attribute_groups;
+ ndr_desc.numa_node = NUMA_NO_NODE;
+ set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
+ if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
+ return -ENXIO;
+ return 0;
+}
+
+static int kup_pmem_probe(struct platform_device *pdev)
+{
+ static struct nvdimm_bus_descriptor nd_desc;
+ struct device *dev = &pdev->dev;
+ struct nvdimm_bus *nvdimm_bus;
+ int rc = -ENXIO;
+
+ nd_desc.attr_groups = kup_pmem_attribute_groups;
+ nd_desc.provider_name = "kup_pmem";
+ nd_desc.module = THIS_MODULE;
+ nvdimm_bus = nvdimm_bus_register(dev, &nd_desc);
+ if (!nvdimm_bus)
+ goto err;
+ platform_set_drvdata(pdev, nvdimm_bus);
+
+ rc = walk_iomem_res_desc(IORES_DESC_KPMEM_DEV,
+ IORESOURCE_MEM, 0, -1, nvdimm_bus, kup_register_one);
+ if (rc)
+ goto err;
+
+ return 0;
+err:
+ nvdimm_bus_unregister(nvdimm_bus);
+ dev_err(dev, "kup_pmem: failed to register legacy persistent memory ranges\n");
+ return rc;
+}
+
+static struct platform_driver kup_pmem_driver = {
+ .probe = kup_pmem_probe,
+ .remove = kup_pmem_remove,
+ .driver = {
+ .name = "kup_pmem",
+ },
+};
+static struct platform_device *pdev;
+
+static __init int register_kup_pmem(void)
+{
+ platform_driver_register(&kup_pmem_driver);
+ pdev = platform_device_alloc("kup_pmem", -1);
+
+ return platform_device_add(pdev);
+}
+
+static __exit void unregister_kup_pmem(void)
+{
+ platform_device_del(pdev);
+ platform_driver_unregister(&kup_pmem_driver);
+}
+
+module_init(register_kup_pmem);
+module_exit(unregister_kup_pmem);
+MODULE_ALIAS("platform:kup_pmem*");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 5135d4b..bba36f2 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -139,6 +139,7 @@ enum {
IORES_DESC_DEVICE_PRIVATE_MEMORY = 6,
IORES_DESC_RESERVED = 7,
IORES_DESC_SOFT_RESERVED = 8,
+ IORES_DESC_KPMEM_DEV = 9,
};
/*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index db6ae4d..2b2a94c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -45,6 +45,10 @@ extern int sysctl_page_lock_unfairness;
void init_mm_internals(void);
+#ifdef CONFIG_KUP_PMEM_MEMORY
+extern struct resource pmem_res;
+#endif
+
#ifndef CONFIG_NEED_MULTIPLE_NODES /* Don't use mapnrs, do it properly */
extern unsigned long max_mapnr;
diff --git a/lib/Kconfig b/lib/Kconfig
index b46a9fd..ff5677c 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -689,3 +689,9 @@ config GENERIC_LIB_UCMPDI2
config PLDMFW
bool
default n
+
+config KUP_PMEM_MEMORY
+ bool "reserve memory for kup pmem to store image"
+ default y
+ help
+ Say y here to enable this feature
--
2.9.5
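
The "pmemmem=pmem_size:pmem_phystart" syntax handled by parse_pmem() in this patch can be exercised in userspace with the sketch below. parse_size() is only an approximation of the kernel's memparse() that handles the K, M and G suffixes, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Approximation of memparse(): a number plus an optional K/M/G suffix. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/* Parse "size:start"; returns 0 on success, -1 on malformed input. */
static int parse_pmemmem(const char *arg, unsigned long long *size,
			 unsigned long long *start)
{
	char *cur;

	assert(arg && size && start);
	*start = 0;
	*size = parse_size(arg, &cur);
	if (cur == arg)
		return -1;	/* no number at all */
	if (*cur == ':')
		*start = parse_size(cur + 1, &cur);
	else if (*cur != ' ' && *cur != '\0')
		return -1;	/* unexpected character after the size */
	return 0;
}
```

For the example in the commit message, "100M:0x202000000000" yields a size of 100 << 20 bytes and a start address of 0x202000000000.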
1 year, 4 months
[PATCH v3] arch/arc: add copy_user_page() to <asm/page.h> to fix build error on ARC
by Randy Dunlap
fs/dax.c uses copy_user_page() but ARC does not provide that interface,
resulting in a build error.
Provide copy_user_page() in <asm/page.h>.
../fs/dax.c: In function 'copy_cow_page_dax':
../fs/dax.c:702:2: error: implicit declaration of function 'copy_user_page'; did you mean 'copy_to_user_page'? [-Werror=implicit-function-declaration]
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Vineet Gupta <vgupta(a)synopsys.com>
Cc: linux-snps-arc(a)lists.infradead.org
Cc: Dan Williams <dan.j.williams(a)intel.com>
#Acked-by: Vineet Gupta <vgupta(a)synopsys.com> # v1
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: linux-nvdimm(a)lists.01.org
#Reviewed-by: Ira Weiny <ira.weiny(a)intel.com> # v2
---
v2: rebase, add more Cc:
v3: add copy_user_page() to arch/arc/include/asm/page.h
arch/arc/include/asm/page.h | 1 +
--- lnx-511-rc1.orig/arch/arc/include/asm/page.h
+++ lnx-511-rc1/arch/arc/include/asm/page.h
@@ -10,6 +10,7 @@
#ifndef __ASSEMBLY__
#define clear_page(paddr) memset((paddr), 0, PAGE_SIZE)
+#define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
#define copy_page(to, from) memcpy((to), (from), PAGE_SIZE)
struct vm_area_struct;
1 year, 4 months