[PATCH v3 1/2] nfit, mce: only handle uncorrectable machine checks
by Vishal Verma
The mce handler for 'nfit' devices is called for memory errors on a
Non-Volatile DIMM, and adds the error location to a 'badblocks' list.
This list is used by the various NVDIMM drivers to avoid consuming known
poison locations during IO.
The mce handler gets called for both corrected and uncorrectable errors.
Until now, both kinds of errors have been added to the badblocks list.
However, corrected memory errors indicate that the problem has already
been fixed by hardware, and the resulting interrupt is merely a
notification to Linux. As far as future accesses to that location are
concerned, it is perfectly fine to use, and thus doesn't need to be
included in the above badblocks list.
Add a check in the nfit mce handler to filter out corrected mce events,
and only process uncorrectable errors.
Reported-by: Omar Avelar <omar.avelar(a)intel.com>
Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Cc: stable(a)vger.kernel.org
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
arch/x86/include/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 3 ++-
drivers/acpi/nfit/mce.c | 4 ++--
3 files changed, 5 insertions(+), 3 deletions(-)
v3: Unchanged from v2
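For readers following along, the filter this patch adds can be modelled in
plain userspace C. This is only a sketch: struct mce_model and both model_*
helpers are hypothetical stand-ins, not kernel interfaces, and the
"uncorrected" check is an assumption about the part of mce_is_correctable()
that the hunk below does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct mce. */
struct mce_model {
	bool memory_error; /* what mce_is_memory_error() would report */
	bool uncorrected;  /* assumed: models an "uncorrected" status bit */
	bool deferred;     /* models an AMD deferred error */
};

/* Models mce_is_correctable(): deferred or uncorrected means not correctable. */
static bool model_is_correctable(const struct mce_model *m)
{
	assert(m != NULL);
	if (m->deferred)
		return false;
	if (m->uncorrected)
		return false;
	return true;
}

/* True when the handler would go on to record the location in badblocks. */
static bool model_should_track(const struct mce_model *m)
{
	assert(m != NULL);
	return m->memory_error && !model_is_correctable(m);
}
```

With this model, a corrected memory error is skipped, while an uncorrected
or deferred memory error is still tracked.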
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 3a17107594c8..3111b3cee2ee 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -216,6 +216,7 @@ static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *s
int mce_available(struct cpuinfo_x86 *c);
bool mce_is_memory_error(struct mce *m);
+bool mce_is_correctable(struct mce *m);
DECLARE_PER_CPU(unsigned, mce_exception_count);
DECLARE_PER_CPU(unsigned, mce_poll_count);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 953b3ce92dcc..27015948bc41 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -534,7 +534,7 @@ bool mce_is_memory_error(struct mce *m)
}
EXPORT_SYMBOL_GPL(mce_is_memory_error);
-static bool mce_is_correctable(struct mce *m)
+bool mce_is_correctable(struct mce *m)
{
if (m->cpuvendor == X86_VENDOR_AMD && m->status & MCI_STATUS_DEFERRED)
return false;
@@ -544,6 +544,7 @@ static bool mce_is_correctable(struct mce *m)
return true;
}
+EXPORT_SYMBOL_GPL(mce_is_correctable);
static bool cec_add_mce(struct mce *m)
{
diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index e9626bf6ca29..7a51707f87e9 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -25,8 +25,8 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
struct acpi_nfit_desc *acpi_desc;
struct nfit_spa *nfit_spa;
- /* We only care about memory errors */
- if (!mce_is_memory_error(mce))
+ /* We only care about uncorrectable memory errors */
+ if (!mce_is_memory_error(mce) || mce_is_correctable(mce))
return NOTIFY_DONE;
/*
--
2.17.1
1 year, 11 months
[PATCH 0/8] Introduce a device-dax bus-based device-model
by Dan Williams
Prompted by the review of "[PATCH 0/9] Allow persistent memory to be
used like normal RAM" [1], introduce a new bus / device-driver-model
for device-dax.
Currently device-dax instances result from attaching an nvdimm namespace
device to the dax_pmem driver. These instances are registered with the
/sys/class/dax sub-system. With the expectation that platforms will
describe performance differentiated memory [2] for ranges other than
persistent memory (pmem), a new device-model is needed.
Arrange for dax_pmem to be one of potentially several drivers that know
how to discover differentiated memory and register a device instance on
the dax bus. The expectation is that, by default, this device is
consumed by the typical device-dax driver that will expose the range
through a /dev/daxX.Y character device. Optionally other drivers can
consume the dax device instance. For example, the kmem driver [1] can
attach to device-dax device instance to hot-add the related memory range
to the core page-allocator.
Going forward, provider drivers outside of dax_pmem can be created to
register other memories with unique performance properties.
Since /sys/class/dax is a released ABI, a compat driver is provided so
that distros can opt-in to the new bus based ABI. The /sys/class/dax
interface is then deprecated and scheduled to be removed.
[1]: https://lkml.org/lkml/2018/10/23/9
[2]: Section 5.2.27 Heterogeneous Memory Attribute Table (HMAT)
http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf
---
Dan Williams (8):
device-dax: Kill dax_region ida
device-dax: Kill dax_region base
device-dax: Remove multi-resource infrastructure
device-dax: Start defining a dax bus model
device-dax: Introduce bus + driver model
device-dax: Move resource pinning+mapping into the common driver
device-dax: Add support for a dax override driver
device-dax: Add /sys/class/dax backwards compatibility
Documentation/ABI/obsolete/sysfs-class-dax | 22 +
drivers/dax/Kconfig | 12 +
drivers/dax/Makefile | 5
drivers/dax/bus.c | 449 ++++++++++++++++++++++++++++
drivers/dax/bus.h | 60 ++++
drivers/dax/dax-private.h | 30 +-
drivers/dax/dax.h | 18 -
drivers/dax/device-dax.h | 25 --
drivers/dax/device.c | 365 +++++------------------
drivers/dax/pmem.c | 161 ----------
drivers/dax/pmem/Makefile | 7
drivers/dax/pmem/compat.c | 73 +++++
drivers/dax/pmem/core.c | 69 ++++
drivers/dax/pmem/pmem.c | 40 ++
drivers/dax/super.c | 41 ++-
tools/testing/nvdimm/Kbuild | 7
tools/testing/nvdimm/dax-dev.c | 16 -
17 files changed, 880 insertions(+), 520 deletions(-)
create mode 100644 Documentation/ABI/obsolete/sysfs-class-dax
create mode 100644 drivers/dax/bus.c
create mode 100644 drivers/dax/bus.h
delete mode 100644 drivers/dax/dax.h
delete mode 100644 drivers/dax/device-dax.h
delete mode 100644 drivers/dax/pmem.c
create mode 100644 drivers/dax/pmem/Makefile
create mode 100644 drivers/dax/pmem/compat.c
create mode 100644 drivers/dax/pmem/core.c
create mode 100644 drivers/dax/pmem/pmem.c
2 years
[PATCH V2 1/1] device-dax: check for vma range while dax_mmap.
by Zhang Yi
This patch prevents a user from mapping an illegal vma range that is larger
than a dax device's physical resource.
When qemu maps the dax device as the backend device for a virtual nvdimm,
the v-nvdimm label area is defined at the end of the mapped range. Using an
illegal size that exceeds the range of the device dax triggers a
fault in qemu.
Signed-off-by: Zhang Yi <yi.z.zhang(a)linux.intel.com>
---
drivers/dax/device.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
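As a rough illustration of the bound being enforced, the check can be
written as a standalone userspace function. This is a sketch, not the patch
itself: mapping_fits() and MODEL_PAGE_SHIFT are hypothetical names, 4 KiB
pages are assumed, and the comparison is arranged to avoid integer overflow,
which the patch's own "size > resource_size(res)" form does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Returns true when the byte range described by the mapping,
 * [pgoff << PAGE_SHIFT, (pgoff << PAGE_SHIFT) + (vm_end - vm_start)),
 * lies entirely inside a resource of res_size bytes.
 */
static bool mapping_fits(uint64_t vm_start, uint64_t vm_end,
			 uint64_t vm_pgoff, uint64_t res_size)
{
	uint64_t len, off;

	assert(vm_end >= vm_start);
	len = vm_end - vm_start;
	off = vm_pgoff << MODEL_PAGE_SHIFT;

	/* Equivalent to off + len <= res_size, without overflowing. */
	return off <= res_size && len <= res_size - off;
}
```

For an 8 KiB resource, an 8 KiB mapping at page offset 0 fits, but the same
mapping at page offset 1 would run 4 KiB past the end and is rejected.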
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 108c37f..6fe8c30 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -177,6 +177,33 @@ static const struct attribute_group *dax_attribute_groups[] = {
NULL,
};
+static int check_vma_range(struct dev_dax *dev_dax, struct vm_area_struct *vma,
+ const char *func)
+{
+ struct device *dev = &dev_dax->dev;
+ struct resource *res;
+ unsigned long size;
+ int ret, i;
+
+ if (!dax_alive(dev_dax->dax_dev))
+ return -ENXIO;
+
+ size = vma->vm_end - vma->vm_start + (vma->vm_pgoff << PAGE_SHIFT);
+ ret = -EINVAL;
+ for (i = 0; i < dev_dax->num_resources; i++) {
+ res = &dev_dax->res[i];
+ if (size > resource_size(res)) {
+ dev_info_ratelimited(dev,
+ "%s: %s: fail, vma range overflow\n",
+ current->comm, func);
+ ret = -EINVAL;
+ continue;
+ } else
+ return 0;
+ }
+ return ret;
+}
+
static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
const char *func)
{
@@ -469,6 +496,8 @@ static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
*/
id = dax_read_lock();
rc = check_vma(dev_dax, vma, __func__);
+ if (!rc)
+ rc = check_vma_range(dev_dax, vma, __func__);
dax_read_unlock(id);
if (rc)
return rc;
--
2.7.4
2 years, 1 month
Snapshot target and DAX-capable devices
by Jan Kara
Hi,
I've been analyzing why fstest generic/081 fails when the backing device is
capable of DAX. The problem boils down to the failure of:
lvm vgcreate -f vg0 /dev/pmem0
lvm lvcreate -L 128M -n lv0 vg0
lvm lvcreate -s -L 4M -n snap0 vg0/lv0
The last command fails like:
device-mapper: reload ioctl on (253:0) failed: Invalid argument
Failed to lock logical volume vg0/lv0.
Aborting. Manual intervention required.
And the core of the problem is that volume vg0/lv0 is originally of
DM_TYPE_DAX_BIO_BASED type but when the snapshot gets created, we try to
switch it to DM_TYPE_BIO_BASED because now the device stops supporting DAX.
The problem seems to be introduced by Ross' commit dbc626597 "dm: prevent
DAX mounts if not supported".
The question is whether / how this should be fixed. The current inability
to create snapshots of DAX-capable devices looks weird and the cryptic
failure makes it even worse (it took me quite a while to understand what is
failing and why). OTOH I see the rationale behind Ross' change as well.
Honza
--
Jan Kara <jack(a)suse.com>
SUSE Labs, CR
2 years, 1 month
[PATCH 0/9] Allow persistent memory to be used like normal RAM
by Dave Hansen
Persistent memory is cool. But, currently, you have to rewrite
your applications to use it. Wouldn't it be cool if you could
just have it show up in your system like normal RAM and get to
it like a slow blob of memory? Well... have I got the patch
series for you!
This series adds a new "driver" to which pmem devices can be
attached. Once attached, the memory "owned" by the device is
hot-added to the kernel and managed like any other memory. On
systems with an HMAT (a new ACPI table), each socket (roughly)
will have a separate NUMA node for its persistent memory so
this newly-added memory can be selected by its unique NUMA
node.
This is highly RFC, and I really want the feedback from the
nvdimm/pmem folks about whether this is a viable long-term
perversion of their code and device mode. It's insufficiently
documented and probably not bisectable either.
Todo:
1. The device re-binding hacks are ham-fisted at best. We
need a better way of doing this, especially so the kmem
driver does not get in the way of normal pmem devices.
2. When the device has no proper node, we default it to
NUMA node 0. Is that OK?
3. We muck with the 'struct resource' code quite a bit. It
definitely needs a once-over from folks more familiar
with it than I.
4. Is there a better way to do this than starting with a
copy of pmem.c?
Here's how I set up a system to test this thing:
1. Boot qemu with lots of memory: "-m 4096", for instance
2. Reserve 512MB of physical memory. Reserving a spot at 2GB
physical seems to work: memmap=512M!0x0000000080000000
This will end up looking like a pmem device at boot.
3. When booted, convert fsdax device to "device dax":
ndctl create-namespace -fe namespace0.0 -m dax
4. In the background, the kmem driver will probably bind to the
new device.
5. Now, online the new memory sections. Perhaps:
grep ^MemTotal /proc/meminfo
for f in `grep -vl online /sys/devices/system/memory/*/state`; do
echo $f: `cat $f`
echo online > $f
grep ^MemTotal /proc/meminfo
done
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Ross Zwisler <zwisler(a)kernel.org>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: linux-nvdimm(a)lists.01.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-mm(a)kvack.org
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Fengguang Wu <fengguang.wu(a)intel.com>
2 years, 1 month
[RFC v2 00/14] kunit: introduce KUnit, the Linux kernel unit testing framework
by Brendan Higgins
This patch set proposes KUnit, a lightweight unit testing and mocking
framework for the Linux kernel.
Unlike Autotest and kselftest, KUnit is a true unit testing framework;
it does not require installing the kernel on a test machine or in a VM
and does not require tests to be written in userspace running on a host
kernel. Additionally, KUnit is fast: From invocation to completion KUnit
can run several dozen tests in under a second. Currently, the entire
KUnit test suite for KUnit runs in under a second from the initial
invocation (build time excluded).
KUnit is heavily inspired by JUnit, Python's unittest.mock, and
Googletest/Googlemock for C++. KUnit provides facilities for defining
unit test cases, grouping related test cases into test suites, providing
common infrastructure for running tests, mocking, spying, and much more.
## What's so special about unit testing?
A unit test is supposed to test a single unit of code in isolation,
hence the name. There should be no dependencies outside the control of
the test; this means no external dependencies, which makes tests orders
of magnitude faster. Likewise, since there are no external dependencies,
there are no hoops to jump through to run the tests. Additionally, this
makes unit tests deterministic: a failing unit test always indicates a
problem. Finally, because unit tests necessarily have finer granularity,
they are able to test all code paths easily, solving the classic problem
of difficulty in exercising error handling code.
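To make the point about error handling concrete, here is a small sketch in
plain userspace C. It does not use the KUnit API; buf_init(), struct buf,
alloc_fn, and failing_alloc() are all hypothetical names invented for this
illustration. The unit under test takes its allocator as a parameter, so a
test can substitute one that always fails and reach the error path
deterministically.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* The one external dependency of the unit, passed in so a test controls it. */
typedef void *(*alloc_fn)(size_t size);

struct buf {
	char *data;
	size_t len;
};

/* Unit under test: returns 0 on success, -ENOMEM if allocation fails. */
static int buf_init(struct buf *b, size_t len, alloc_fn alloc)
{
	assert(b != NULL && alloc != NULL);

	b->data = alloc(len);
	if (!b->data) {
		b->len = 0;
		return -ENOMEM;
	}
	b->len = len;
	return 0;
}

/* Test double: an allocator that always fails. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

Calling buf_init(&b, 16, failing_alloc) exercises the -ENOMEM branch on
every run, with no need to provoke a real out-of-memory condition.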
## Is KUnit trying to replace other testing frameworks for the kernel?
No. Most existing tests for the Linux kernel are end-to-end tests, which
have their place. A well tested system has lots of unit tests, a
reasonable number of integration tests, and some end-to-end tests. KUnit
is just trying to address the unit test space which is currently not
being addressed.
## More information on KUnit
There is a bunch of documentation near the end of this patch set that
describes how to use KUnit and best practices for writing unit tests.
For convenience I am hosting the compiled docs here:
https://google.github.io/kunit-docs/third_party/kernel/docs/
## Changes Since Last Version
- Updated patchset to apply cleanly on 4.19.
- Stripped down patchset to focus on just the core features (I dropped
mocking, spying, and the MMIO stuff for now; you can find these
patches here: https://kunit-review.googlesource.com/c/linux/+/1132),
as suggested by Rob.
- Cleaned up some of the commit messages and tweaked commit order a
bit based on suggestions.
--
2.19.1.568.g152ad8e336-goog
2 years, 1 month
[ndctl PATCH] ndctl: recover from failed namespace creation
by oceanhehy@gmail.com
From: Ocean He <hehy1(a)lenovo.com>
When namespace creation fails, the consumed namespace (seed or 0th
idle) and the pfn/dax seed block the next namespace creation. A recovery
step is needed to handle this type of failure.
A symptom example (section size is 128MB) based on kernel 4.19-rc2 and
ndctl v62:
# ndctl create-namespace -r region1 -s 100m -t pmem -m fsdax
{
"dev":"namespace1.0",
"mode":"fsdax",
"map":"dev",
"size":"96.00 MiB (100.66 MB)",
"uuid":"ef9a0556-a610-40b5-8c71-43991765a2cc",
"raw_uuid":"177b22e2-b7e8-482f-a063-2b8de876d979",
"sector_size":512,
"blockdev":"pmem1",
"numa_node":1
}
# ndctl create-namespace -r region1 -s 100m -t pmem -m fsdax
libndctl: ndctl_pfn_enable: pfn1.1: failed to enable
Error: namespace1.1: failed to enable
failed to create namespace: No such device or address
# ndctl destroy-namespace namespace1.0 -f
destroyed 1 namespace
# ndctl create-namespace -r region1 -s 128m -t pmem -m fsdax
failed to create namespace: Device or resource busy
Signed-off-by: Ocean He <hehy1(a)lenovo.com>
---
Additional information:
A kernel patch to fix this has been reviewed by Dan Williams, and he prefers
to handle it in ndctl directly.
https://www.spinics.net/lists/kernel/msg2901465.html
ndctl/namespace.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/ndctl/namespace.c b/ndctl/namespace.c
index 510553c..76ee2ed 100644
--- a/ndctl/namespace.c
+++ b/ndctl/namespace.c
@@ -393,6 +393,8 @@ static int setup_namespace(struct ndctl_region *region,
try(ndctl_pfn, set_align, pfn, p->align);
try(ndctl_pfn, set_namespace, pfn, ndns);
rc = ndctl_pfn_enable(pfn);
+ if (rc)
+ ndctl_pfn_set_namespace(pfn, NULL);
} else if (p->mode == NDCTL_NS_MODE_DAX) {
struct ndctl_dax *dax = ndctl_region_get_dax_seed(region);
@@ -402,6 +404,8 @@ static int setup_namespace(struct ndctl_region *region,
try(ndctl_dax, set_align, dax, p->align);
try(ndctl_dax, set_namespace, dax, ndns);
rc = ndctl_dax_enable(dax);
+ if (rc)
+ ndctl_dax_set_namespace(dax, NULL);
} else if (p->mode == NDCTL_NS_MODE_SAFE) {
struct ndctl_btt *btt = ndctl_region_get_btt_seed(region);
@@ -783,7 +787,13 @@ static int namespace_create(struct ndctl_region *region)
return -ENODEV;
}
- return setup_namespace(region, ndns, &p);
+ rc = setup_namespace(region, ndns, &p);
+ if (rc) {
+ ndctl_namespace_set_enforce_mode(ndns, NDCTL_NS_MODE_RAW);
+ ndctl_namespace_delete(ndns);
+ }
+
+ return rc;
}
static int zero_info_block(struct ndctl_namespace *ndns)
--
1.8.3.1
2 years, 1 month
[RFC PATCH] kvm: Use huge pages for DAX-backed files
by Barret Rhoden
This change allows KVM to map DAX-backed files made of huge pages with
huge mappings in the EPT/TDP.
DAX pages are not PageTransCompound. The existing check is trying to
determine if the mapping for the pfn is a huge mapping or not. For
non-DAX maps, e.g. hugetlbfs, that means checking PageTransCompound.
For DAX, we can check the page table itself. Actually, we might always
be able to walk the page table, even for PageTransCompound pages, but
it's probably a little slower.
Note that KVM already faulted in the page (or huge page) in the host's
page table, and we hold the KVM mmu spinlock (grabbed before checking
the mmu seq). Based on the other comments about not worrying about a
pmd split, we might be able to safely walk the page table without
holding the mm sem.
This patch relies on kvm_is_reserved_pfn() being false for DAX pages,
which I've hacked up for testing this code. That change should
eventually happen:
https://lore.kernel.org/lkml/20181022084659.GA84523@tiger-server/
Another issue is that kvm_mmu_zap_collapsible_spte() also uses
PageTransCompoundMap() to detect huge pages, but we don't have a way to
get the HVA easily. Can we just aggressively zap DAX pages there?
Alternatively, is there a better way to track at the struct page level
whether or not a page is huge-mapped? Maybe the DAX huge pages mark
themselves as TransCompound or something similar, and we don't need to
special case DAX/ZONE_DEVICE pages.
Signed-off-by: Barret Rhoden <brho(a)google.com>
---
arch/x86/kvm/mmu.c | 71 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 70 insertions(+), 1 deletion(-)
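The walk in pgd_mapping_size() below can be pictured with a small userspace
model. This is a sketch under stated assumptions: struct level_model and
model_mapping_size() are hypothetical, each array element stands for the
entry found at one page-table level (top level first), and the per-level
sizes are supplied by the caller rather than taken from kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the entry found at one page-table level. */
struct level_model {
	bool present;
	bool huge;     /* entry maps memory directly instead of a lower table */
	uint64_t size; /* bytes mapped by one entry at this level */
};

/*
 * Walk from the top level down. Returns 0 if any level is not present,
 * the size of the first huge entry found, or the size of the last level
 * (the base page) when no huge entry is encountered.
 */
static uint64_t model_mapping_size(const struct level_model *walk, size_t n)
{
	size_t i;

	assert(walk != NULL && n > 0);
	for (i = 0; i < n; i++) {
		if (!walk[i].present)
			return 0;
		if (walk[i].huge || i == n - 1)
			return walk[i].size;
	}
	return 0;
}
```

A caller in the spirit of pfn_is_pmd_mapped() would then compare the result
against the PMD-level size and treat only an exact match as a huge mapping
it can use.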
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index cf5f572f2305..9f3e0f83a2dd 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3152,6 +3152,75 @@ static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
return -EFAULT;
}
+static unsigned long pgd_mapping_size(struct mm_struct *mm, unsigned long addr)
+{
+ pgd_t *pgd;
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ pgd = pgd_offset(mm, addr);
+ if (!pgd_present(*pgd))
+ return 0;
+
+ p4d = p4d_offset(pgd, addr);
+ if (!p4d_present(*p4d))
+ return 0;
+ if (p4d_huge(*p4d))
+ return P4D_SIZE;
+
+ pud = pud_offset(p4d, addr);
+ if (!pud_present(*pud))
+ return 0;
+ if (pud_huge(*pud))
+ return PUD_SIZE;
+
+ pmd = pmd_offset(pud, addr);
+ if (!pmd_present(*pmd))
+ return 0;
+ if (pmd_huge(*pmd))
+ return PMD_SIZE;
+
+ pte = pte_offset_map(pmd, addr);
+ if (!pte_present(*pte))
+ return 0;
+ return PAGE_SIZE;
+}
+
+static bool pfn_is_pmd_mapped(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn)
+{
+ struct page *page = pfn_to_page(pfn);
+ unsigned long hva, map_sz;
+
+ if (!is_zone_device_page(page))
+ return PageTransCompoundMap(page);
+
+ /*
+ * DAX pages do not use compound pages. The page should have already
+ * been mapped into the host-side page table during try_async_pf(), so
+ * we can check the page tables directly.
+ */
+ hva = gfn_to_hva(kvm, gfn);
+ if (kvm_is_error_hva(hva))
+ return false;
+
+ /*
+ * Our caller grabbed the KVM mmu_lock with a successful
+ * mmu_notifier_retry, so we're safe to walk the page table.
+ */
+ map_sz = pgd_mapping_size(current->mm, hva);
+ switch (map_sz) {
+ case PMD_SIZE:
+ return true;
+ case P4D_SIZE:
+ case PUD_SIZE:
+ printk_once(KERN_INFO "KVM THP promo found a very large page");
+ return false;
+ }
+ return false;
+}
+
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
gfn_t *gfnp, kvm_pfn_t *pfnp,
int *levelp)
@@ -3168,7 +3237,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
*/
if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) &&
level == PT_PAGE_TABLE_LEVEL &&
- PageTransCompoundMap(pfn_to_page(pfn)) &&
+ pfn_is_pmd_mapped(vcpu->kvm, gfn, pfn) &&
!mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) {
unsigned long mask;
/*
--
2.19.1.568.g152ad8e336-goog
2 years, 2 months
[ndctl PATCH v13 0/5] ndctl, monitor: add ndctl monitor daemon
by QI Fuli
This is the v13 patch for ndctl monitor, a tiny daemon to monitor
the smart events of NVDIMMs. Since NVDIMM does not have a
feature like mirroring, if it breaks down, the data will be
impossible to restore. The ndctl monitor daemon catches the smart
event notifications from firmware and writes them to a logfile,
so users can replace an NVDIMM before it is completely broken.
Signed-off-by: QI Fuli <qi.fuli(a)jp.fujitsu.com>
---
Change log since v12:
- Fixing log_fn() for removing output new line
- Fixing hard code default configuration file path
- Fixing RPM spec file for configuration file and systemd unit file
- Fixing man page
Change log since v11:
- Adding log_standard()
- Adding [-u | --human] option
- Fixing man page
- Refactoring unit test
- Updating configuration file and systemd unit file to RPM spec file
Change log since v10:
- Adding unit test
- Adding fflush to log_file()
Change log since v9:
- Replacing ndctl_cmd_smart_get_event_flags() with
ndctl_dimm_get_event_flags()
- Adding ndctl_dimm_get_health() api
- Adding ndctl_dimm_get_flags() api
- Adding ndctl_dimm_is_flag_supported api
- Adding manpage
Change log since v8:
- Adding ndctl_cmd_smart_get_event_flags() api
- Adding monitor_filter_arg to the union in util_filter_ctx
- Removing is_dir()
- Replacing malloc + vsprintf with vasprintf() in log_file() and log_syslog()
- Adding parse_monitor_event()
- Refactoring util_dimm_event_filter()
- Adding event_flags to monitor
- Refactoring dimm_event_to_json()
- Adding check_dimm_supported_threshold_alarms()
- Fixing fail token
Change log since v7:
- Replacing logreport() with log_file() and log_syslog()
- Refactoring read_config_file()
- Replacing set_confile() with parse_config()
- Fixing the ndctl/ndct.conf file
Change log since v6:
- Changing License to GPL-2.0
- Adding event object to output notification
- Adding [--dimm-event] option to filter notification by event type
- Rewriting read_config_file()
- Replacing monitor_dimm_event() with monitor_event()
- Renaming some variables
Change log since v5:
- Fixing systemd unit file cannot be installed bug
- Adding license to ./util/abspath.c
Change log since v4:
- Adding OPTION_FILENAME to make sure filename is correct
- Adding configuration file
- Adding [--config-file] option to override the default configuration
- Making some options support multiple space-separated arguments
- Making systemctl enable ndctl-monitor.service command work
- Making systemctl restart ndctl-monitor.service command work
- Making the directory of systemd unit file to be configurable
- Changing log_file() and log_syslog() to logreport()
- Changing date format in notification to nanoseconds since epoch
- Changing select() to epoll()
- Adding filter_bus() and filter_region()
Change log since v3:
- Removing create-monitor, show-monitor, list-monitor, destroy-monitor
- Adding [--daemon] option to run ndctl monitor as a daemon
- Using systemd to manage ndctl monitor daemon
- Replacing filter_monitor_dimm() with filter_dimm()
Change log since v2:
- Changing the interface of daemon to the ndctl command line
- Changing the name of daemon from "nvdimmd" to "monitor"
- Removing the config file, unit_file, nvdimmd dir
- Removing nvdimmd_test program
- Adding ndctl/monitor.c
Change log since v1:
- Adding a config file(/etc/nvdimmd/nvdimmd.conf)
- Using struct log_ctx instead of syslog()
- Using log_syslog() to save the notify messages to syslog
- Using log_file() to save the notify messages to special file
- Adding LOG_NOTICE level to log_priority
- Using automake instead of Makefile
- Adding a new util file(nvdimmd/util.c) including helper functions
needed for nvdimm daemon
- Adding nvdimmd_test program
QI Fuli (5):
ndctl, monitor: add a new command - monitor
ndctl, monitor: add main ndctl monitor configuration file
ndctl, monitor: add the unit file of systemd for ndctl-monitor service
ndctl, documentation: add man page for monitor
ndctl, test: add a new unit test for monitor
.gitignore | 1 +
Documentation/ndctl/Makefile.am | 3 +-
Documentation/ndctl/ndctl-monitor.txt | 108 +++++
autogen.sh | 3 +-
builtin.h | 1 +
configure.ac | 23 +
ndctl.spec.in | 3 +
ndctl/Makefile.am | 12 +-
ndctl/lib/libndctl.c | 82 ++++
ndctl/lib/libndctl.sym | 4 +
ndctl/libndctl.h | 10 +
ndctl/monitor.c | 650 ++++++++++++++++++++++++++
ndctl/monitor.conf | 41 ++
ndctl/ndctl-monitor.service | 7 +
ndctl/ndctl.c | 1 +
test/Makefile.am | 14 +-
test/list-smart-dimm.c | 117 +++++
test/monitor.sh | 176 +++++++
util/filter.h | 9 +
19 files changed, 1260 insertions(+), 5 deletions(-)
create mode 100644 Documentation/ndctl/ndctl-monitor.txt
create mode 100644 ndctl/monitor.c
create mode 100644 ndctl/monitor.conf
create mode 100644 ndctl/ndctl-monitor.service
create mode 100644 test/list-smart-dimm.c
create mode 100755 test/monitor.sh
--
2.18.0
2 years, 2 months
Problems with VM_MIXEDMAP removal from /proc/<pid>/smaps
by Jan Kara
Hello,
commit e1fb4a086495 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
removed the VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
meantime a certain customer of ours started poking into /proc/<pid>/smaps
and looking at the VMA flags there. If VM_MIXEDMAP is missing among the VMA
flags, the application just fails to start, complaining that DAX support is
missing in the kernel. The question now is how do we go about this?
Strictly speaking, this is a userspace visible regression (as much as I
think that an application poking into VMA flags at this level is just too
bold). Is there any precedent in handling similar issues with smaps, which
really exposes a lot of information that is dependent on kernel
implementation details?
I have attached a patch that is an obvious "fix" for the issue - just fake
VM_MIXEDMAP flag in smaps. But I'm open to other suggestions...
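For context, the kind of check such an application might perform can be
sketched as follows. This is a guess at the pattern, not the customer's
code: vmflags_has() is a hypothetical helper, and it relies on smaps
reporting VM_MIXEDMAP as the two-letter code "mm" on the "VmFlags:" line.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if `line` is a "VmFlags:" line from /proc/<pid>/smaps and
 * contains `flag` as a whole whitespace-separated token.
 */
static bool vmflags_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	const char *p;
	size_t flen;

	assert(line != NULL && flag != NULL);
	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	flen = strlen(flag);
	if (flen == 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t tlen;

		p += strspn(p, " \t\n");   /* skip separators */
		tlen = strcspn(p, " \t\n"); /* length of the next token */
		if (tlen == flen && strncmp(p, flag, flen) == 0)
			return true;
		p += tlen;
	}
	return false;
}
```

An application written this way would treat a DAX mapping whose VmFlags
line lacks "mm" as "no DAX support", which is exactly the failure described
above once the flag is no longer set.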
Honza
--
Jan Kara <jack(a)suse.com>
SUSE Labs, CR
2 years, 2 months