[RFC PATCH 0/8] dax: Add a dax-rmap tree to support reflink
by Shiyang Ruan
This patchset is a try to resolve the shared 'page cache' problem for
fsdax.
In order to track multiple mappings and indexes on one page, I
introduced a dax-rmap rb-tree to manage the relationship. A dax entry
will be associated more than once if is shared. At the second time we
associate this entry, we create this rb-tree and store its root in
page->private(not used in fsdax). Insert (->mapping, ->index) when
dax_associate_entry() and delete it when dax_disassociate_entry().
We can iterate the dax-rmap rb-tree before any other operations on
mappings of files. Such as memory-failure and rmap.
Same as before, I borrowed and made some changes on Goldwyn's patchsets.
These patches makes up for the lack of CoW mechanism in fsdax.
The rests are dax & reflink support for xfs.
(Rebased to 5.7-rc2)
Shiyang Ruan (8):
fs/dax: Introduce dax-rmap btree for reflink
mm: add dax-rmap for memory-failure and rmap
fs/dax: Introduce dax_copy_edges() for COW
fs/dax: copy data before write
fs/dax: replace mmap entry in case of CoW
fs/dax: dedup file range to use a compare function
fs/xfs: handle CoW for fsdax write() path
fs/xfs: support dedupe for fsdax
fs/dax.c | 343 +++++++++++++++++++++++++++++++++++++----
fs/ocfs2/file.c | 2 +-
fs/read_write.c | 11 +-
fs/xfs/xfs_bmap_util.c | 6 +-
fs/xfs/xfs_file.c | 10 +-
fs/xfs/xfs_iomap.c | 3 +-
fs/xfs/xfs_iops.c | 11 +-
fs/xfs/xfs_reflink.c | 79 ++++++----
include/linux/dax.h | 11 ++
include/linux/fs.h | 9 +-
mm/memory-failure.c | 63 ++++++--
mm/rmap.c | 54 +++++--
12 files changed, 498 insertions(+), 104 deletions(-)
--
2.26.2
2 years
[ndctl PATCH v5 0/6] Add support for reporting papr nvdimm health
by Vaibhav Jain
Changes since v4 [1]:
* Updated proposed changes to remove usage of term 'SCM' due to
ambiguity with 'PMEM' and 'NVDIMM'. [ Dan Williams ]
* Replaced the usage of term 'SCM' with 'PMEM' in most contexts.
[ Aneesh ]
* Updates to various newly introduced identifiers in 'papr.c'
removing the 'SCM' prefix from their names.
* Renamed NVDIMM_FAMILY_PAPR_SCM to NVDIMM_FAMILY_PAPR
* Renamed PAPR_SCM_PDSM_HEALTH PAPR_PDSM_HEALTH
* Renamed 'papr_scm.c' to 'papr.c'
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
[1] https://lore.kernel.org/linux-nvdimm/20200527044737.40615-1-vaibhav@linux...
---
This patch-set proposes changes to libndctl to add support for reporting
health for nvdimms that support the PAPR standard[2]. The standard defines
machenism (HCALL) through which a guest kernel can query and fetch health
and performance stats of an nvdimm attached to the hypervisor[3]. Until
now 'ndctl' was unable to report these stats for papr_scm dimms on PPC64
guests due to absence of ACPI/NFIT, a limitation which this patch-set tries
to address.
The patch-set introduces support for the new PAPR PDSM family
defined at [4] & [5] via a new dimm-ops named
'papr_dimm_ops'. Infrastructure to probe and distinguish papr-scm
dimms from other dimm families that may support ACPI/NFIT is
implemented by updating the 'struct ndctl_dimm' initialization
routines to bifurcate based on the nvdimm type. We also introduce two
new dimm-ops member for handling initialization of dimm specific data
for specific DSM families.
These changes coupled with proposed kernel changes located at Ref[1] should
provide a way for the user to retrieve NVDIMM health status using ndtcl for
pseries guests. Below is a sample output using proposed kernel + ndctl
changes:
# ndctl list -DH
[
{
"dev":"nmem0",
"flag_smart_event":true,
"health":{
"health_state":"fatal",
"shutdown_state":"dirty"
}
}
]
Structure of the patchset
=========================
We start with a re-factoring patch that splits the 'add_dimm()' function
into two functions one that take care of allocating and initializing
'struct ndctl_dimm' and another that takes care of initializing nfit
specific dimm attributes.
Patch-2 introduces probe function of papr nvdimms and assigning
'papr_dimm_ops' defined in 'papr.c' to 'dimm->ops' if
needed. The patch also code to parse the dimm flags specific to
papr nvdimms
Patch-3 introduces new dimm ops 'dimm_init()' & 'dimm_uninit()' to handle
DSM family specific initialization of 'struct ndctl_dimm'.
Patches-4,5 implements scaffolding to add support for PAPR PDSM
requests and pull in their definitions from the kernel.
Finally Patch-6 add support for issuing and handling the result of
'struct ndctl_cmd' to request dimm health stats from papr_scm kernel module
and returning appropriate health status to libndctl for reporting.
References
==========
[2] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] "Hypercall Op-codes (hcalls)"
https://github.com/torvalds/linux/blob/master/Documentation/powerpc/papr_...
[4] "powerpc/papr_scm: Add support for reporting nvdimm health"
https://lore.kernel.org/linux-nvdimm/20200529214719.223344-1-vaibhav@linu...
[5] "ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods"
https://lore.kernel.org/linux-nvdimm/20200529214719.223344-5-vaibhav@linu...
Vaibhav Jain (6):
libndctl: Refactor out add_dimm() to handle NFIT specific init
libncdtl: Add initial support for NVDIMM_FAMILY_PAPR nvdimm family
libndctl: Introduce new dimm-ops dimm_init() & dimm_uninit()
libndctl,papr_scm: Add definitions for PAPR nvdimm specific methods
papr: Add scaffolding to issue and handle PDSM requests
libndctl,papr_scm: Implement support for PAPR_PDSM_HEALTH
ndctl/lib/Makefile.am | 1 +
ndctl/lib/libndctl.c | 260 ++++++++++++++++++++++++++----------
ndctl/lib/papr.c | 301 ++++++++++++++++++++++++++++++++++++++++++
ndctl/lib/papr_pdsm.h | 175 ++++++++++++++++++++++++
ndctl/lib/private.h | 9 ++
ndctl/libndctl.h | 1 +
ndctl/ndctl.h | 1 +
7 files changed, 678 insertions(+), 70 deletions(-)
create mode 100644 ndctl/lib/papr.c
create mode 100644 ndctl/lib/papr_pdsm.h
--
2.26.2
2 years
[PATCH] drivers/dax/bus: Use kobj_to_dev() API
by Shuai He
Use kobj_to_dev() API instead of container_of().
Signed-off-by: Shuai He <hexiaoshuaishuai(a)gmail.com>
---
drivers/dax/bus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index df238c8..24625d2 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -331,7 +331,7 @@ static DEVICE_ATTR_RO(numa_node);
static umode_t dev_dax_visible(struct kobject *kobj, struct attribute *a, int n)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct dev_dax *dev_dax = to_dev_dax(dev);
if (a == &dev_attr_target_node.attr && dev_dax_target_node(dev_dax) < 0)
--
2.7.4
2 years
[PATCH] Documentation: fixes to the maintainer-entry-profile template
by Randy Dunlap
From: Randy Dunlap <rdunlap(a)infradead.org>
Do some wordsmithing and copy editing on the maintainer-entry-profile
profile (template, guide):
- fix punctuation
- fix some wording
- use "-rc" consistently
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: linux-nvdimm(a)lists.01.org
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: linux-doc(a)vger.kernel.org
---
Documentation/maintainer/maintainer-entry-profile.rst | 12 +++++-----
1 file changed, 6 insertions(+), 6 deletions(-)
--- linux-next-20200526.orig/Documentation/maintainer/maintainer-entry-profile.rst
+++ linux-next-20200526/Documentation/maintainer/maintainer-entry-profile.rst
@@ -7,7 +7,7 @@ The Maintainer Entry Profile supplements
(submitting-patches, submitting drivers...) with
subsystem/device-driver-local customs as well as details about the patch
submission life-cycle. A contributor uses this document to level set
-their expectations and avoid common mistakes, maintainers may use these
+their expectations and avoid common mistakes; maintainers may use these
profiles to look across subsystems for opportunities to converge on
common practices.
@@ -26,7 +26,7 @@ Example questions to consider:
- Does the subsystem have a patchwork instance? Are patchwork state
changes notified?
- Any bots or CI infrastructure that watches the list, or automated
- testing feedback that the subsystem gates acceptance?
+ testing feedback that the subsystem uses to gate acceptance?
- Git branches that are pulled into -next?
- What branch should contributors submit against?
- Links to any other Maintainer Entry Profiles? For example a
@@ -54,8 +54,8 @@ One of the common misunderstandings of s
sent at any time before the merge window closes and can still be
considered for the next -rc1. The reality is that most patches need to
be settled in soaking in linux-next in advance of the merge window
-opening. Clarify for the submitter the key dates (in terms rc release
-week) that patches might considered for merging and when patches need to
+opening. Clarify for the submitter the key dates (in terms of -rc release
+week) that patches might be considered for merging and when patches need to
wait for the next -rc. At a minimum:
- Last -rc for new feature submissions:
@@ -70,8 +70,8 @@ wait for the next -rc. At a minimum:
- Last -rc to merge features: Deadline for merge decisions
Indicate to contributors the point at which an as yet un-applied patch
set will need to wait for the NEXT+1 merge window. Of course there is no
- obligation to ever except any given patchset, but if the review has not
- concluded by this point the expectation the contributor should wait and
+ obligation to ever accept any given patchset, but if the review has not
+ concluded by this point the expectation is the contributor should wait and
resubmit for the following merge window.
Optional:
2 years
[PATCH v8 0/5] powerpc/papr_scm: Add support for reporting nvdimm health
by Vaibhav Jain
Changes since v7 [1]:
* Addressed various review comments from Aneesh, Ira and Mpe.
* Removed the 'payload_offset' field from 'struct nd_pdsm_cmd_pkg' and
replaced it with some reserved fields [ Aneesh ].
* Updated the doc and description for patch that fetches dimm health
information from PHYP clarifying bit-ordering [ Mpe and Ira ].
* Updated the patch title & description for patch exporting
'seq_buf_printf'. [ Christoph Hellwig ]
* Fix types of various newly introduced vars in papr_scm.c [ Ira ].
* Fixed a typo in 'papr_scm_pdsm.h' [ Ira ]
[1] https://lore.kernel.org/linux-nvdimm/20200519190058.257981-1-vaibhav@linu...
---
The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3]. Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.
To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3]. This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.
These changes coupled with proposed ndtcl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.
Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:
# ndctl list -DH
[
{
"dev":"nmem0",
"health":{
"health_state":"fatal",
"shutdown_state":"dirty"
}
}
]
Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:
# ndctl list -d nmem0 -H
[
{
"dev":"nmem0",
"health":{
"health_state":"ok",
"shutdown_state":"clean"
}
}
]
PAPR NVDIMM-Specific-Methods(PDSM)
==================================
PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_scm_pdsm.h'. Support for
more PDSMs will be added in future.
Structure of the patch-set
==========================
The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.
Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.
Fourth patches deal with implementing support for servicing PDSM
commands in papr_scm module.
Finally the last patch implements support for servicing PDSM
'PAPR_SCM_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.
References:
[2] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v8
---
Vaibhav Jain (5):
powerpc: Document details on H_SCM_HEALTH hcall
seq_buf: Export seq_buf_printf
powerpc/papr_scm: Fetch nvdimm health information from PHYP
ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH
Documentation/ABI/testing/sysfs-bus-papr-scm | 27 ++
Documentation/powerpc/papr_hcalls.rst | 45 ++-
arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 175 +++++++++
arch/powerpc/platforms/pseries/papr_scm.c | 363 +++++++++++++++++-
include/uapi/linux/ndctl.h | 1 +
lib/seq_buf.c | 1 +
6 files changed, 599 insertions(+), 13 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-scm
create mode 100644 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
--
2.26.2
2 years
[5.4-stable PATCH 0/7] libnvdimm: Cross-arch compatible namespace
alignment
by Dan Williams
Hello stable team,
These patches have been shipping in mainline since v5.7-rc1 with no
reported issues. They address long standing problems in libnvdimm's
handling of namespace provisioning relative to alignment constraints
including crashes trying to even load the driver on some PowerPC
configurations.
I did fold one build fix [1] into "libnvdimm/region: Introduce an 'align'
attribute" so as to not convey the bisection breakage to -stable.
Please consider them for v5.4-stable. They do pass the latest
version of the ndctl unit tests.
[1]: 04ff4863e126 libnvdimm/region: Fix build error
---
Original cover letter for the series:
Aneesh reports that PowerPC requires 16MiB alignment for the address
range passed to devm_memremap_pages(), and Jeff reports that it is
possible to create a misaligned namespace which blocks future namespace
creation in that region. Both of these issues require namespace
alignment to be managed at the region level rather than padding at the
namespace level which has been a broken approach to date.
Introduce memremap_compat_align() to indicate the hard requirements of
an arch's memremap_pages() implementation. Use the maximum known
memremap_compat_align() to set the default namespace alignment for
libnvdimm. Consult that alignment when allocating free space. Finally,
allow the default region alignment to be overridden to maintain the same
namespace creation capability as previous kernels (modulo dax operation
not being supported with a non-zero start_pad).
---
Aneesh Kumar K.V (1):
libnvdimm/pfn_dev: Don't clear device memmap area during generic namespace probe
Dan Williams (6):
mm/memremap_pages: Kill unused __devm_memremap_pages()
mm/memremap_pages: Introduce memremap_compat_align()
libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock valid
libnvdimm/namespace: Enforce memremap_compat_align()
libnvdimm/region: Introduce NDD_LABELING
libnvdimm/region: Introduce an 'align' attribute
arch/powerpc/Kconfig | 1
arch/powerpc/mm/ioremap.c | 21 +++++
arch/powerpc/platforms/pseries/papr_scm.c | 2
drivers/acpi/nfit/core.c | 4 +
drivers/nvdimm/dimm.c | 2
drivers/nvdimm/dimm_devs.c | 95 +++++++++++++++++----
drivers/nvdimm/namespace_devs.c | 28 ++++++
drivers/nvdimm/nd.h | 3 -
drivers/nvdimm/pfn.h | 12 +++
drivers/nvdimm/pfn_devs.c | 48 +++++++++--
drivers/nvdimm/region_devs.c | 130 ++++++++++++++++++++++++++---
include/linux/io.h | 2
include/linux/libnvdimm.h | 2
include/linux/memremap.h | 8 ++
include/linux/mmzone.h | 1
lib/Kconfig | 3 +
mm/memremap.c | 23 +++++
17 files changed, 335 insertions(+), 50 deletions(-)
base-commit: 1cdaf895c99d319c0007d0b62818cf85fc4b087f
2 years
[PATCH v4 0/8] Support new pmem flush and sync instructions for POWER
by Aneesh Kumar K.V
This patch series enables the usage os new pmem flush and sync instructions on POWER
architecture. POWER10 introduces two new variants of dcbf instructions (dcbstps and dcbfps)
that can be used to write modified locations back to persistent storage. Additionally,
POWER10 also introduce phwsync and plwsync which can be used to establish order of these
writes to persistent storage.
This series exposes these instructions to the rest of the kernel. The existing
dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
synchronization with OpenCAPI-hosted persistent storage. Hence the new instructions
are added as a variant of the old ones that old hardware won't differentiate.
On POWER10, pmem devices will be represented by a different device tree compat
strings. This ensures that older kernels won't initialize pmem devices on POWER10.
Changes from V3:
* Add new compat string to be used for the device.
* Use arch_pmem_flush_barrier() in dm-writecache.
Aneesh Kumar K.V (8):
powerpc/pmem: Restrict papr_scm to P8 and above.
powerpc/pmem: Add new instructions for persistent storage and sync
powerpc/pmem: Add flush routines using new pmem store and sync
instruction
libnvdimm/nvdimm/flush: Allow architecture to override the flush
barrier
powerpc/pmem/of_pmem: Update of_pmem to use the new barrier
instruction.
powerpc/pmem: Avoid the barrier in flush routines
powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem
flush functions.
powerpc/pmem: Initialize pmem device on newer hardware
arch/powerpc/include/asm/cacheflush.h | 9 +++++
arch/powerpc/include/asm/ppc-opcode.h | 12 ++++++
arch/powerpc/lib/pmem.c | 46 +++++++++++++++++++++--
arch/powerpc/platforms/pseries/papr_scm.c | 14 +++++++
arch/powerpc/platforms/pseries/pmem.c | 6 +++
drivers/md/dm-writecache.c | 2 +-
drivers/nvdimm/of_pmem.c | 1 +
drivers/nvdimm/region_devs.c | 8 ++--
include/linux/libnvdimm.h | 4 ++
9 files changed, 93 insertions(+), 9 deletions(-)
--
2.26.2
2 years
[PATCH v3 1/7] powerpc/pmem: Restrict papr_scm to P8 and above.
by Aneesh Kumar K.V
The PAPR based virtualized persistent memory devices are only supported on
POWER9 and above. In the followup patch, the kernel will switch the persistent
memory cache flush functions to use a new `dcbf` variant instruction. The new
instructions even though added in ISA 3.1 works even on P8 and P9 because these
are implemented as a variant of existing `dcbf` and `hwsync` and on P8 and
above behaves as such.
Considering these devices are only supported on P8 and above, update the driver
to prevent a P7-compat guest from using persistent memory devices.
We don't update of_pmem driver with the same condition, because, on bare-metal,
the firmware enables pmem support only on P9 and above. There the kernel depends
on OPAL firmware to restrict exposing persistent memory related device tree
entries on older hardware. of_pmem.ko is written without any arch dependency and
we don't want to add ppc64 specific cpu feature check in of_pmem driver.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
---
arch/powerpc/platforms/pseries/pmem.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/pmem.c b/arch/powerpc/platforms/pseries/pmem.c
index f860a897a9e0..2347e1038f58 100644
--- a/arch/powerpc/platforms/pseries/pmem.c
+++ b/arch/powerpc/platforms/pseries/pmem.c
@@ -147,6 +147,12 @@ const struct of_device_id drc_pmem_match[] = {
static int pseries_pmem_init(void)
{
+ /*
+ * Only supported on POWER8 and above.
+ */
+ if (!cpu_has_feature(CPU_FTR_ARCH_207S))
+ return 0;
+
pmem_node = of_find_node_by_type(NULL, "ibm,persistent-memory");
if (!pmem_node)
return 0;
--
2.26.2
2 years, 1 month
[PATCH v9 0/5] powerpc/papr_scm: Add support for reporting nvdimm health
by Vaibhav Jain
Changes since v8 [1]:
* Updated proposed changes to remove usage of term 'SCM' due to
ambiguity with 'PMEM' and 'NVDIMM'. [ Dan Williams ]
* Replaced the usage of term 'SCM' with 'PMEM' in most contexts.
[ Aneesh ]
* Renamed NVDIMM health defines from PAPR_SCM_DIMM_* to PAPR_PMEM_* .
* Updates to various newly introduced identifiers in 'papr_scm.c'
removing the 'SCM' prefix from their names.
* Renamed NVDIMM_FAMILY_PAPR_SCM to NVDIMM_FAMILY_PAPR
* Renamed PAPR_SCM_PDSM_HEALTH PAPR_PDSM_HEALTH
* Renamed uapi header 'papr_scm_pdsm.h' to 'papr_pdsm.h'.
* Renamed sysfs ABI document from sysfs-bus-papr-scm to
sysfs-bus-papr-pmem.
* No behavioural changes from v8 [1].
[1] https://lore.kernel.org/linux-nvdimm/20200527041244.37821-1-vaibhav@linux...
---
The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3]. Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.
To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3]. This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.
These changes coupled with proposed ndtcl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.
Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:
# ndctl list -DH
[
{
"dev":"nmem0",
"health":{
"health_state":"fatal",
"shutdown_state":"dirty"
}
}
]
Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:
# ndctl list -d nmem0 -H
[
{
"dev":"nmem0",
"health":{
"health_state":"ok",
"shutdown_state":"clean"
}
}
]
PAPR NVDIMM-Specific-Methods(PDSM)
==================================
PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in future.
Structure of the patch-set
==========================
The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.
Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.
Fourth patches deal with implementing support for servicing PDSM
commands in papr_scm module.
Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.
References:
[2] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v9
---
Vaibhav Jain (5):
powerpc: Document details on H_SCM_HEALTH hcall
seq_buf: Export seq_buf_printf
powerpc/papr_scm: Fetch nvdimm health information from PHYP
ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
Documentation/ABI/testing/sysfs-bus-papr-pmem | 27 ++
Documentation/powerpc/papr_hcalls.rst | 46 ++-
arch/powerpc/include/uapi/asm/papr_pdsm.h | 175 +++++++++
arch/powerpc/platforms/pseries/papr_scm.c | 363 +++++++++++++++++-
include/uapi/linux/ndctl.h | 1 +
lib/seq_buf.c | 1 +
6 files changed, 600 insertions(+), 13 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
--
2.26.2
2 years, 1 month
[PATCH v4 0/2] Renovate memcpy_mcsafe with copy_mc_to_{user, kernel}
by Dan Williams
Changes since v3 [1]:
- Drop extern for new function declarations (Michael)
- Rename memcpy_mcsafe_64.S to copy_mc_64.S instead of copy_mc.S and
related fixups (Michael)
- Add a new symlink
(tools/testing/selftests/powerpc/copyloops/copy_mc_64.S) to the new
copy_mc_64.S to fix selftest build breakage (Michael)
- Drop one instance of copy_safe() that survived from v2 of the patchset
(Vivek)
- Fix 32-bit x86 build breakage (kbuild robot)
- Kill off rather than rename tools/arch/x86/include/asm/mcsafe_test.h
since perf is no longer burden with dealing with the copy_mc_generic()
implementation.
- Build success notification received for revised set (kbuild robot)
[1]: http://lore.kernel.org/r/158992635164.403910.2616621400995359522.stgit@dw...
---
The primary motivation to go touch memcpy_mcsafe() is that the existing
benefit of doing slow "handle with care" copies is obviated on newer
CPUs. With that concern lifted it also obviates the need to continue to
update the MCA-recovery capability detection code currently gated by
"mcsafe_key". Now the old "mcsafe_key" opt-in to perform the copy with
concerns for recovery fragility can instead be made an opt-out from the
default fast copy implementation (enable_copy_mc_fragile()).
The discussion with Linus on the first iteration of this patch
identified that memcpy_mcsafe() was misnamed relative to its usage. The
new names copy_mc_to_user() and copy_mc_to_kernel() clearly indicate the
intended use case and lets the architecture organize the implementation
accordingly.
For both powerpc and x86 a copy_mc_generic() implementation is added as
the backend for these interfaces.
Patches are relative to tip/master.
---
Dan Williams (2):
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user,kernel}()
x86/copy_mc: Introduce copy_mc_generic()
arch/powerpc/Kconfig | 2
arch/powerpc/include/asm/string.h | 2
arch/powerpc/include/asm/uaccess.h | 40 +++--
arch/powerpc/lib/Makefile | 2
arch/powerpc/lib/copy_mc_64.S | 4
arch/x86/Kconfig | 2
arch/x86/Kconfig.debug | 2
arch/x86/include/asm/copy_mc_test.h | 75 +++++++++
arch/x86/include/asm/mcsafe_test.h | 75 ---------
arch/x86/include/asm/string_64.h | 32 ----
arch/x86/include/asm/uaccess.h | 21 +++
arch/x86/include/asm/uaccess_64.h | 20 --
arch/x86/kernel/cpu/mce/core.c | 8 -
arch/x86/kernel/quirks.c | 9 -
arch/x86/lib/Makefile | 1
arch/x86/lib/copy_mc.c | 64 ++++++++
arch/x86/lib/copy_mc_64.S | 165 ++++++++++++++++++++
arch/x86/lib/memcpy_64.S | 115 --------------
arch/x86/lib/usercopy_64.c | 21 ---
drivers/md/dm-writecache.c | 15 +-
drivers/nvdimm/claim.c | 2
drivers/nvdimm/pmem.c | 6 -
include/linux/string.h | 9 -
include/linux/uaccess.h | 9 +
include/linux/uio.h | 10 +
lib/Kconfig | 7 +
lib/iov_iter.c | 43 +++--
tools/arch/x86/include/asm/mcsafe_test.h | 13 --
tools/arch/x86/lib/memcpy_64.S | 115 --------------
tools/objtool/check.c | 5 -
tools/perf/bench/Build | 1
tools/perf/bench/mem-memcpy-x86-64-lib.c | 24 ---
tools/testing/nvdimm/test/nfit.c | 48 +++---
.../testing/selftests/powerpc/copyloops/.gitignore | 2
tools/testing/selftests/powerpc/copyloops/Makefile | 6 -
.../selftests/powerpc/copyloops/copy_mc_64.S | 1
.../selftests/powerpc/copyloops/memcpy_mcsafe_64.S | 1
37 files changed, 451 insertions(+), 526 deletions(-)
rename arch/powerpc/lib/{memcpy_mcsafe_64.S => copy_mc_64.S} (98%)
create mode 100644 arch/x86/include/asm/copy_mc_test.h
delete mode 100644 arch/x86/include/asm/mcsafe_test.h
create mode 100644 arch/x86/lib/copy_mc.c
create mode 100644 arch/x86/lib/copy_mc_64.S
delete mode 100644 tools/arch/x86/include/asm/mcsafe_test.h
delete mode 100644 tools/perf/bench/mem-memcpy-x86-64-lib.c
create mode 120000 tools/testing/selftests/powerpc/copyloops/copy_mc_64.S
delete mode 120000 tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S
base-commit: bba413deb1065f1291cb1f366247513f11215520
2 years, 1 month