[PATCH] dax: Fix inefficiency in dax_writeback_mapping_range()
by Jan Kara
dax_writeback_mapping_range() fails to update iteration index when
searching radix tree for entries needing cache flushing. Thus each
pagevec worth of entries is searched starting from the start which is
inefficient and prone to livelocks. Update index properly.
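The effect of the missing update can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the fs/dax.c code: lookup_batch() is a hypothetical stand-in for the pagevec lookup, and BATCH, scan_range() and the sorted tagged[] array are assumptions made purely for illustration.

```c
#include <limits.h>
#include <stddef.h>

#define BATCH 4	/* hypothetical batch size, stands in for one pagevec */

/*
 * Hypothetical stand-in for the tagged lookup: copy up to BATCH entries of
 * the sorted array tagged[] that fall within [start, end] into out[].
 */
static size_t lookup_batch(const unsigned long *tagged, size_t n,
			   unsigned long start, unsigned long end,
			   unsigned long *out)
{
	size_t found = 0;

	for (size_t i = 0; i < n && found < BATCH; i++)
		if (tagged[i] >= start && tagged[i] <= end)
			out[found++] = tagged[i];
	return found;
}

/* Visit every tagged index in [start, end]; returns the number visited. */
static size_t scan_range(const unsigned long *tagged, size_t n,
			 unsigned long start, unsigned long end)
{
	unsigned long batch[BATCH];
	size_t visited = 0, nr;

	while ((nr = lookup_batch(tagged, n, start, end, batch)) > 0) {
		visited += nr;
		if (batch[nr - 1] == ULONG_MAX)
			break;	/* nothing can follow the last possible index */
		/* The step the patch adds: resume after the last entry seen. */
		start = batch[nr - 1] + 1;
	}
	return visited;
}
```

In this simplified model, dropping the "start = ..." line would make every lookup begin at the original start again, so the same batch would be returned repeatedly.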
CC: stable(a)vger.kernel.org
Fixes: 9973c98ecfda3a1dfcab981665b5f1e39bcde64a
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/dax.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/dax.c b/fs/dax.c
index 2a6889b3585f..9187f3b07f3e 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -859,6 +859,7 @@ int dax_writeback_mapping_range(struct address_space *mapping,
if (ret < 0)
goto out;
}
+ start_index = indices[pvec.nr - 1] + 1;
}
out:
put_dax(dax_dev);
--
2.12.3
[PATCH v3 00/14] pmem: stop abusing __copy_user_nocache(), and other reworks
by Dan Williams
Changes since v2 [1]:
1/ Address the concerns from "[NAK] copy_from_iter_ops()" [2]. The
copy_from_iter_ops approach is replaced with a new set of _flushcache
memcpy and user-copy helpers (Al)
2/ Use _flushcache as the suffix for the new cache managing copy helpers
rather than _writethrough (Ingo and Toshi)
3/ Keep asm/pmem.h instead of moving the helpers to
drivers/nvdimm/$arch.c (another side effect of Al's feedback)
[1]: https://lkml.org/lkml/2017/4/21/823
[2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
---
A few months back, in the course of reviewing the memcpy_nocache()
proposal from Brian, Linus proposed that the pmem specific
memcpy_to_pmem() routine be moved to be implemented at the driver level
[3]:
"Quite frankly, the whole 'memcpy_nocache()' idea or (ab-)using
copy_user_nocache() just needs to die. It's idiotic.
As you point out, it's also fundamentally buggy crap.
Throw it away. There is no possible way this is ever valid or
portable. We're not going to lie and claim that it is.
If some driver ends up using 'movnt' by hand, that is up to that
*driver*. But no way in hell should we care about this one whit in
the sense of <linux/uaccess.h>."
This feedback also dovetails with another fs/dax.c design wart of being
hard coded to assume the backing device is pmem. We call the pmem
specific copy, clear, and flush routines even if the backing device
driver is one of the other 3 dax drivers (axonram, dccssblk, or brd).
There is no reason to spend cpu cycles flushing the cache after writing
to brd, for example, since it is using volatile memory for storage.
Moreover, the pmem driver might be fronting a volatile memory range
published by the ACPI NFIT, or the platform might have arranged to flush
cpu caches on power fail. This latter capability is a feature that has
appeared in embedded storage appliances (pre-ACPI-NFIT nvdimm
platforms).
Now, the comment about completely avoiding uaccess.h is augmented by
Al's recent assertion:
"And for !@#!@# sake, comments like this
+ * On x86_64 __copy_from_user_nocache() uses non-temporal stores
+ * for the bulk of the transfer, but we need to manually flush
+ * if the transfer is unaligned. A cached memory copy is used
+ * when destination or size is not naturally aligned. That is:
+ * - Require 8-byte alignment when size is 8 bytes or larger.
+ * - Require 4-byte alignment when size is 4 bytes.
mean only one thing: this should live in arch/x86/lib/usercopy_64.c,
right next to the actual function that does copying. NOT in
drivers/nvdimm/x86.c. At the very least it needs a comment in usercopy_64.c
with dire warnings along the lines of "don't touch that code without
looking into <filename>:pmem_from_user().."
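The alignment rule in the quoted comment can be restated as a small predicate. This is a simplified reading of that comment only, not the code in arch/x86/lib/usercopy_64.c; the name needs_manual_flush() and the treatment of zero-length copies are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified reading of the quoted comment (hypothetical helper): returns
 * true when part of a copy of 'size' bytes to 'dest' may go through the
 * cached path and would therefore need a manual flush afterwards.
 */
static bool needs_manual_flush(uintptr_t dest, size_t size)
{
	if (size == 0)
		return false;	/* assumption: nothing copied, nothing to flush */

	/* Sizes below 8 bytes: only an aligned 4-byte copy bypasses the cache. */
	if (size < 8)
		return size != 4 || (dest % 4) != 0;

	/* 8 bytes or larger: destination and size must both be 8-byte aligned. */
	return (dest % 8) != 0 || (size % 8) != 0;
}
```

Under this reading, an aligned 64-byte copy needs no extra flush, while a 12-byte copy to the same address does because of its 4-byte tail.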
So, this series proceeds to keep all the usercopy code centralized. The
change set:
1/ Moves what was previously named "the pmem api" out of the global
namespace and into the libnvdimm sub-system that needs to be
concerned with architecture specific persistent memory considerations.
2/ Arranges for dax to stop abusing __copy_user_nocache() and implements
formal _flushcache helpers that use 'movnt' on x86_64.
3/ Makes filesystem-dax cache maintenance optional by arranging for dax
to call driver specific copy and flush operations only if the driver
publishes them.
4/ Allows filesystem-dax cache management to be controlled by the block
device write-cache queue flag. The pmem driver is updated to clear
that flag by default when pmem is driving volatile memory. In the future
this same path may be used to detect platforms that have a
cpu-cache-flush-on-fail capability. That said, an administrator has the
option to force this behavior by writing to the $bdev/queue/write_cache
attribute in sysfs.
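Items 3/ and 4/ can be pictured with a small user-space model. This is only a sketch of the idea: struct demo_dax_ops, struct demo_dax_dev, demo_dax_write() and the write_cache member are hypothetical names, not the interfaces introduced by the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-driver operations; a NULL member means "not published". */
struct demo_dax_ops {
	void (*copy)(void *dst, const void *src, size_t len);
	void (*flush)(void *addr, size_t len);
};

struct demo_dax_dev {
	const struct demo_dax_ops *ops;
	bool write_cache;	/* models the block queue write-cache flag */
};

static size_t demo_flush_calls;	/* counts flushes, for demonstration only */

static void demo_pmem_flush(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	demo_flush_calls++;	/* a real driver would write back cache lines */
}

/* A pmem-like driver publishes a flush; a ramdisk-like driver publishes nothing. */
static const struct demo_dax_ops demo_pmem_ops = { NULL, demo_pmem_flush };
static const struct demo_dax_ops demo_ramdisk_ops = { NULL, NULL };

static void demo_dax_write(struct demo_dax_dev *dev, void *dst,
			   const void *src, size_t len)
{
	/* Use the driver's copy routine only if it published one. */
	if (dev->ops && dev->ops->copy)
		dev->ops->copy(dst, src, len);
	else
		memcpy(dst, src, len);

	/* Flush only when the flag is set and the driver offers a flush. */
	if (dev->write_cache && dev->ops && dev->ops->flush)
		dev->ops->flush(dst, len);
}
```

With this shape, a device backed by volatile memory skips the flush either because its driver publishes no flush operation or because the flag has been cleared.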
[3]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
This series is based on v4.12-rc4 and passes the current ndctl
regression suite.
---
Dan Williams (14):
x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations
dm: add ->copy_from_iter() dax operation support
filesystem-dax: convert to dax_copy_from_iter()
dax, pmem: introduce an optional 'flush' dax_operation
dm: add ->flush() dax operation support
filesystem-dax: convert to dax_flush()
x86, dax: replace clear_pmem() with open coded memset + dax_ops->flush
x86, dax, libnvdimm: move wb_cache_pmem() to libnvdimm
x86, libnvdimm, pmem: move arch_invalidate_pmem() to libnvdimm
pmem: remove global pmem api
libnvdimm, pmem: fix persistence warning
libnvdimm, nfit: enable support for volatile ranges
filesystem-dax: gate calls to dax_flush() on QUEUE_FLAG_WC
libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile region
MAINTAINERS | 1
arch/x86/Kconfig | 1
arch/x86/include/asm/pmem.h | 81 ---------------------
arch/x86/include/asm/string_64.h | 5 +
arch/x86/include/asm/uaccess_64.h | 12 +++
arch/x86/lib/usercopy_64.c | 129 ++++++++++++++++++++++++++++++++++
drivers/acpi/nfit/core.c | 15 +++-
drivers/dax/super.c | 24 ++++++
drivers/md/dm-linear.c | 30 ++++++++
drivers/md/dm-stripe.c | 40 ++++++++++
drivers/md/dm.c | 45 ++++++++++++
drivers/nvdimm/bus.c | 8 +-
drivers/nvdimm/claim.c | 6 +-
drivers/nvdimm/core.c | 2 -
drivers/nvdimm/dax_devs.c | 2 -
drivers/nvdimm/dimm_devs.c | 10 ++-
drivers/nvdimm/namespace_devs.c | 14 +---
drivers/nvdimm/nd-core.h | 9 ++
drivers/nvdimm/pfn_devs.c | 4 +
drivers/nvdimm/pmem.c | 32 +++++++-
drivers/nvdimm/pmem.h | 13 +++
drivers/nvdimm/region_devs.c | 43 +++++++----
fs/dax.c | 11 ++-
include/linux/dax.h | 9 ++
include/linux/device-mapper.h | 6 ++
include/linux/libnvdimm.h | 2 +
include/linux/pmem.h | 142 -------------------------------------
include/linux/string.h | 6 ++
include/linux/uio.h | 15 ++++
lib/Kconfig | 3 +
lib/iov_iter.c | 22 ++++++
31 files changed, 464 insertions(+), 278 deletions(-)
delete mode 100644 include/linux/pmem.h
[RFC PATCH] daxctl: new utilities to enable / disable a file for static dax access
by Dan Williams
'daxctl enable' and 'daxctl disable' are wrapper utilities around the
new daxctl() syscall that pins / guarantees a given block-map for a
file.
The "dax.sh" unit test is extended to run its tests against the target
file in daxfile mode.
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Makefile.am.in | 1
configure.ac | 8 ++
daxctl/Makefile.am | 14 +++
daxctl/dax.h | 8 ++
daxctl/daxctl.c | 4 +
daxctl/daxfile.c | 211 ++++++++++++++++++++++++++++++++++++++++++++++
daxctl/daxoff.in | 3 +
daxctl/daxon.in | 3 +
ndctl/lib/libndctl-ars.c | 6 -
test/Makefile.am | 4 +
test/dax-pmd.c | 77 +++++++++++++++++
test/dax.sh | 24 ++++-
util/size.h | 8 ++
13 files changed, 360 insertions(+), 11 deletions(-)
create mode 100644 daxctl/dax.h
create mode 100644 daxctl/daxfile.c
create mode 100644 daxctl/daxoff.in
create mode 100644 daxctl/daxon.in
diff --git a/Makefile.am.in b/Makefile.am.in
index 9cb8d4a055c7..28cb487b233d 100644
--- a/Makefile.am.in
+++ b/Makefile.am.in
@@ -11,6 +11,7 @@ AM_CPPFLAGS = \
-DNDCTL_MAN_PATH=\""$(mandir)"\" \
-I${top_srcdir}/ndctl/lib \
-I${top_srcdir}/ndctl \
+ -I${top_srcdir}/daxctl \
-I${top_srcdir}/ \
$(KMOD_CFLAGS) \
$(UDEV_CFLAGS) \
diff --git a/configure.ac b/configure.ac
index e79623ac1d82..0b8c68899374 100644
--- a/configure.ac
+++ b/configure.ac
@@ -123,6 +123,14 @@ AS_IF([test "x$enable_local" = "xyes"], [], [
]
)
+AS_IF([test "x$enable_local" = "xyes"], [], [
+ AC_CHECK_HEADER([linux/dax.h], [
+ AC_DEFINE([HAVE_DAX_H], [1],
+ [Define to 1 if you have <linux/dax.h>.])
+ ], [])
+ ]
+)
+
# when building against kernel headers check version specific features
AC_MSG_CHECKING([for ARS support])
AC_LANG(C)
diff --git a/daxctl/Makefile.am b/daxctl/Makefile.am
index fe467d030c38..8245b0888faf 100644
--- a/daxctl/Makefile.am
+++ b/daxctl/Makefile.am
@@ -2,9 +2,23 @@ include $(top_srcdir)/Makefile.am.in
bin_PROGRAMS = daxctl
+bin_SCRIPTS = daxon daxoff
+CLEANFILES += $(bin_SCRIPTS)
+EXTRA_DIST += daxon.in daxoff.in
+
+do_subst = sed -e 's,BINDIR,$(bindir),g'
+
+daxon: daxon.in
+ $(AM_V_GEN) $(do_subst) < $< > $@ && chmod +x $@
+
+daxoff: daxoff.in
+ $(AM_V_GEN) $(do_subst) < $< > $@ && chmod +x $@
+
daxctl_SOURCES =\
daxctl.c \
+ daxfile.c \
list.c \
+ ../util/log.c \
../util/json.c
daxctl_LDADD =\
diff --git a/daxctl/dax.h b/daxctl/dax.h
new file mode 100644
index 000000000000..1b5f87500c6c
--- /dev/null
+++ b/daxctl/dax.h
@@ -0,0 +1,8 @@
+#ifndef _LINUX_DAX_H
+#define _LINUX_DAX_H
+
+#define DAXCTL_F_GET (1 << 0)
+#define DAXCTL_F_DAX (1 << 1)
+#define DAXCTL_F_STATIC (1 << 2)
+
+#endif /* _LINUX_DAX_H */
diff --git a/daxctl/daxctl.c b/daxctl/daxctl.c
index 91a4600e262f..67b5a4cbce9a 100644
--- a/daxctl/daxctl.c
+++ b/daxctl/daxctl.c
@@ -67,10 +67,14 @@ static int cmd_help(int argc, const char **argv, void *ctx)
}
int cmd_list(int argc, const char **argv, void *ctx);
+int cmd_enable(int argc, const char **argv, void *ctx);
+int cmd_disable(int argc, const char **argv, void *ctx);
static struct cmd_struct commands[] = {
{ "version", cmd_version },
{ "list", cmd_list },
+ { "enable", cmd_enable },
+ { "disable", cmd_disable },
{ "help", cmd_help },
};
diff --git a/daxctl/daxfile.c b/daxctl/daxfile.c
new file mode 100644
index 000000000000..f8a18b973615
--- /dev/null
+++ b/daxctl/daxfile.c
@@ -0,0 +1,211 @@
+/*
+ * Copyright (c) 2015-2017 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+#include <stdio.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <assert.h>
+#include <limits.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <util/log.h>
+#include <util/size.h>
+#include <sys/syscall.h>
+#include <util/parse-options.h>
+#include <ccan/endian/endian.h>
+#include <ccan/short_types/short_types.h>
+
+#ifdef HAVE_DAX_H
+#include <linux/dax.h>
+#else
+#include <dax.h>
+#endif
+
+static struct parameters {
+ bool verbose;
+ bool check;
+ bool static_mode;
+ const char *align;
+} param;
+
+#define BASE_OPTIONS() \
+OPT_BOOLEAN('v', "verbose", &param.verbose, "enable extra logging"), \
+OPT_BOOLEAN('s', "static", &param.static_mode, \
+ "toggle / check <daxfile> static-dax capability")
+
+#define ENABLE_OPTIONS() \
+OPT_BOOLEAN('c', "check", &param.check, \
+ "check if <daxfile> is enabled for dax"), \
+OPT_STRING('a', "align", &param.align, "align", \
+ "specify expected minimum alignment of allocated extents")
+
+static const struct option enable_options[] = {
+ BASE_OPTIONS(),
+ ENABLE_OPTIONS(),
+ OPT_END(),
+};
+
+static const struct option disable_options[] = {
+ BASE_OPTIONS(),
+ OPT_END(),
+};
+
+struct dax_ctl {
+ struct log_ctx ctx;
+ const char *path;
+ struct stat stat;
+ int fd;
+};
+
+#ifdef __NR_daxctl
+#define daxctl(path, flags, align) \
+ syscall(__NR_daxctl, (path), (flags), (align))
+#else
+static int daxctl(const char *path, int flags, int align)
+{
+ errno = ENOTTY;
+ return -1;
+}
+#endif
+
+static bool enable_checks(struct dax_ctl *ctl)
+{
+ int fd;
+ struct stat st;
+
+ fd = open(ctl->path, O_RDONLY);
+ if (fd < 0) {
+ err(ctl, "failed to open %s (%s)\n", ctl->path, strerror(errno));
+ goto err;
+ }
+ ctl->fd = fd;
+
+ if (fstat(fd, &st) < 0) {
+ err(ctl, "failed to stat %s (%s)\n", ctl->path, strerror(errno));
+ goto err;
+ }
+
+ if (!S_ISREG(st.st_mode)) {
+ err(ctl, "error: %s not a regular file\n", ctl->path);
+ goto err;
+ }
+
+ /* test for holes by LBT */
+ if (st.st_blocks * 512 < st.st_size) {
+ err(ctl, "error: %s appears to be a sparse file\n", ctl->path);
+ goto err;
+ }
+
+ close(fd);
+ ctl->fd = -1;
+ return true;
+err:
+ if (fd >= 0)
+ close(fd);
+ ctl->fd = -1;
+ return false;
+}
+
+int cmd_enable(int argc, const char **argv, void *ctx)
+{
+ const char * const u[] = {
+ "daxctl enable <daxfile> [<options>]",
+ NULL
+ };
+ struct dax_ctl _ctl, *ctl = &_ctl;
+ int i, rc, flags = DAXCTL_F_STATIC;
+ unsigned long long align = 0; /* default to PAGE_SIZE */
+
+ argc = parse_options(argc, argv, enable_options, u, 0);
+ for (i = 1; i < argc; i++)
+ error("unknown parameter \"%s\"\n", argv[i]);
+
+ if (argc != 1) {
+ error("missing 'daxfile' to register\n");
+ usage_with_options(u, enable_options);
+ }
+
+ log_init(&ctl->ctx, "enable", "DAXCTL_ENABLE_LOGLEVEL");
+ if (param.verbose)
+ ctl->ctx.log_priority = LOG_DEBUG;
+ else
+ ctl->ctx.log_priority = LOG_INFO;
+
+ if (param.align) {
+ align = parse_size64(param.align);
+ if (align == ULLONG_MAX) {
+ error("could not parse --align parameter '%s'\n",
+ param.align);
+ return EXIT_FAILURE;
+ }
+ if (align > SZ_1G || !is_power_of_2(align)) {
+ error("invalid --align parameter '%s'\n",
+ param.align);
+ return EXIT_FAILURE;
+ }
+ }
+
+ ctl->path = argv[0];
+
+ if (param.check)
+ flags |= DAXCTL_F_GET;
+ else if (!enable_checks(ctl))
+ return EXIT_FAILURE;
+
+ rc = daxctl(ctl->path, flags, align);
+ if (rc < 0) {
+ err(ctl, "failed to %s daxfile: %s (%s)\n",
+ param.check ? "check" : "register", ctl->path,
+ strerror(errno));
+ return EXIT_FAILURE;
+ }
+
+ if (param.check && rc != DAXCTL_F_STATIC) {
+ dbg(ctl, "static-dax disabled for: %s\n", ctl->path);
+ return EXIT_FAILURE;
+ }
+ return EXIT_SUCCESS;
+}
+
+int cmd_disable(int argc, const char **argv, void *ctx)
+{
+ const char * const u[] = {
+ "daxctl disable <daxfile> [<options>]",
+ NULL
+ };
+ struct dax_ctl _ctl, *ctl = &_ctl;
+ int i, rc;
+
+ argc = parse_options(argc, argv, disable_options, u, 0);
+ for (i = 1; i < argc; i++)
+ error("unknown parameter \"%s\"\n", argv[i]);
+
+ if (argc != 1) {
+ error("missing 'daxfile' to unregister\n");
+ usage_with_options(u, disable_options);
+ }
+
+ log_init(&ctl->ctx, "disable", "DAXCTL_DISABLE_LOGLEVEL");
+ ctl->path = argv[0];
+
+ rc = daxctl(ctl->path, 0, 0);
+ if (rc < 0) {
+ err(ctl, "failed to unregister daxfile: %s (%s)\n",
+ ctl->path, strerror(errno));
+ return EXIT_FAILURE;
+ }
+ return EXIT_SUCCESS;
+}
diff --git a/daxctl/daxoff.in b/daxctl/daxoff.in
new file mode 100644
index 000000000000..a3e33364fd2c
--- /dev/null
+++ b/daxctl/daxoff.in
@@ -0,0 +1,3 @@
+#!/bin/sh
+
+BINDIR/daxctl disable $@
diff --git a/daxctl/daxon.in b/daxctl/daxon.in
new file mode 100644
index 000000000000..2380c32bed1a
--- /dev/null
+++ b/daxctl/daxon.in
@@ -0,0 +1,3 @@
+#!/bin/sh
+
+BINDIR/daxctl enable $@
diff --git a/ndctl/lib/libndctl-ars.c b/ndctl/lib/libndctl-ars.c
index 9b1a0cb6e1d6..1e463cf347a5 100644
--- a/ndctl/lib/libndctl-ars.c
+++ b/ndctl/lib/libndctl-ars.c
@@ -11,6 +11,7 @@
* more details.
*/
#include <stdlib.h>
+#include <util/size.h>
#include <ndctl/libndctl.h>
#include "libndctl-private.h"
@@ -44,11 +45,6 @@ NDCTL_EXPORT struct ndctl_cmd *ndctl_bus_cmd_new_ars_cap(struct ndctl_bus *bus,
}
#ifdef HAVE_NDCTL_CLEAR_ERROR
-static bool is_power_of_2(unsigned int v)
-{
- return v && ((v & (v - 1)) == 0);
-}
-
static bool validate_clear_error(struct ndctl_cmd *ars_cap)
{
if (!is_power_of_2(ars_cap->ars_cap->clear_err_unit))
diff --git a/test/Makefile.am b/test/Makefile.am
index 9353a34326c1..44a92e8d4864 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -75,7 +75,9 @@ parent_uuid_LDADD = $(LIBNDCTL_LIB) $(UUID_LIBS) $(KMOD_LIBS)
dax_dev_SOURCES = dax-dev.c $(testcore)
dax_dev_LDADD = $(LIBNDCTL_LIB) $(KMOD_LIBS)
-dax_pmd_SOURCES = dax-pmd.c
+dax_pmd_SOURCES = dax-pmd.c \
+ $(testcore)
+
mmap_SOURCES = mmap.c
dax_errors_SOURCES = dax-errors.c
daxdev_errors_SOURCES = daxdev-errors.c \
diff --git a/test/dax-pmd.c b/test/dax-pmd.c
index 6276913a0fda..264b28631e2a 100644
--- a/test/dax-pmd.c
+++ b/test/dax-pmd.c
@@ -24,7 +24,25 @@
#include <linux/fs.h>
#include <test.h>
#include <util/size.h>
+#include <sys/syscall.h>
#include <linux/fiemap.h>
+#include <linux/version.h>
+#ifdef HAVE_DAX_H
+#include <linux/dax.h>
+#else
+#include <dax.h>
+#endif
+
+#ifdef __NR_daxctl
+#define daxctl(path, flags, align) \
+ syscall(__NR_daxctl, (path), (flags), (align))
+#else
+static int daxctl(const char *path, int flags, int align)
+{
+ errno = ENOTTY;
+ return -1;
+}
+#endif
#define NUM_EXTENTS 5
#define fail() fprintf(stderr, "%s: failed at: %d\n", __func__, __LINE__)
@@ -185,8 +203,59 @@ static int test_pmd(int fd)
return rc;
}
+static int test_daxfile(char *daxfile)
+{
+ int fd, rc;
+
+ rc = daxctl(daxfile, DAXCTL_F_GET, 0);
+ if (rc < 0) {
+ fprintf(stderr, "%s: failed to retrieve dax flags: %s\n",
+ __func__, strerror(errno));
+ return -errno;
+ }
+ if (rc != 0) {
+ fprintf(stderr, "%s: expected dax flags %d got %d\n",
+ __func__, 0, rc);
+ return -ENXIO;
+ }
+
+ rc = daxctl(daxfile, DAXCTL_F_STATIC, 0);
+ if (rc < 0) {
+ fprintf(stderr, "%s: failed to set static dax: %s\n",
+ __func__, strerror(errno));
+ return -errno;
+ }
+
+ rc = daxctl(daxfile, DAXCTL_F_GET, 0);
+ if (rc < 0) {
+ fprintf(stderr, "%s: failed to retrieve dax flags: %s\n",
+ __func__, strerror(errno));
+ return -errno;
+ }
+ if (rc != DAXCTL_F_STATIC) {
+ fprintf(stderr, "%s: expected dax flags %d got %d\n",
+ __func__, DAXCTL_F_STATIC, rc);
+ return -ENXIO;
+ }
+
+ fd = open(daxfile, O_RDWR);
+ rc = test_pmd(fd);
+ if (rc)
+ return rc;
+
+ rc = daxctl(daxfile, 0, 0);
+ if (rc < 0) {
+ fprintf(stderr, "%s: failed to clear static dax: %s\n",
+ __func__, strerror(errno));
+ return -errno;
+ }
+
+ return 0;
+}
+
int __attribute__((weak)) main(int argc, char *argv[])
{
+ struct ndctl_test *test = ndctl_test_new(0);
int fd, rc;
if (argc < 1)
@@ -196,5 +265,11 @@ int __attribute__((weak)) main(int argc, char *argv[])
rc = test_pmd(fd);
if (fd >= 0)
close(fd);
- return rc;
+ if (rc)
+ return rc;
+
+ /* try the same test with a daxfile */
+ if (!ndctl_test_attempt(test, KERNEL_VERSION(4, 14, 0)))
+ return 0;
+ return test_daxfile(argv[1]);
}
diff --git a/test/dax.sh b/test/dax.sh
index e1f3c8f6ff79..4478b9a0295b 100755
--- a/test/dax.sh
+++ b/test/dax.sh
@@ -29,6 +29,22 @@ err() {
exit $rc
}
+alloc_file() {
+ # Note, we use dd here to get guarantees that the filesystem
+ # allocates the extents. ext4 is fine to use fallocate like so:
+ #
+ # fallocate -l 1GiB $MNT/$FILE
+ #
+ # ...but that does not work for xfs. xfs does support the -z
+ # option to allocate unwritten extents, like so:
+ #
+ # fallocate -z -l 1GiB $MNT/$FILE
+ #
+ # ...but that does not appear to work for ext4.
+
+ dd if=/dev/zero bs=1G count=1 of=$MNT/$FILE
+}
+
set -e
mkdir -p $MNT
trap 'err $LINENO' ERR
@@ -39,7 +55,7 @@ eval $(echo $json | sed -e "$json2var")
mkfs.ext4 /dev/$blockdev
mount /dev/$blockdev $MNT -o dax
-fallocate -l 1GiB $MNT/$FILE
+alloc_file
./dax-pmd $MNT/$FILE
umount $MNT
@@ -51,7 +67,7 @@ eval $(echo $json | sed -e "$json2var")
#note the blockdev returned from ndctl create-namespace lacks the /dev prefix
mkfs.ext4 /dev/$blockdev
mount /dev/$blockdev $MNT -o dax
-fallocate -l 1GiB $MNT/$FILE
+alloc_file
./dax-pmd $MNT/$FILE
umount $MNT
@@ -61,7 +77,7 @@ eval $(echo $json | sed -e "$json2var")
mkfs.xfs -f /dev/$blockdev
mount /dev/$blockdev $MNT -o dax
-fallocate -l 1GiB $MNT/$FILE
+alloc_file
./dax-pmd $MNT/$FILE
umount $MNT
@@ -72,7 +88,7 @@ eval $(echo $json | sed -e "$json2var")
mkfs.xfs -f /dev/$blockdev
mount /dev/$blockdev $MNT -o dax
-fallocate -l 1GiB $MNT/$FILE
+alloc_file
./dax-pmd $MNT/$FILE
umount $MNT
diff --git a/util/size.h b/util/size.h
index 3c27079fc2b8..f81a73ce884c 100644
--- a/util/size.h
+++ b/util/size.h
@@ -13,9 +13,12 @@
#ifndef _NDCTL_SIZE_H_
#define _NDCTL_SIZE_H_
+#include <stdbool.h>
#define SZ_1K 0x00000400
#define SZ_4K 0x00001000
+#define SZ_32K 0x00008000
+#define SZ_64K 0x00010000
#define SZ_1M 0x00100000
#define SZ_2M 0x00200000
#define SZ_4M 0x00400000
@@ -27,6 +30,11 @@
unsigned long long parse_size64(const char *str);
unsigned long long __parse_size64(const char *str, unsigned long long *units);
+static inline bool is_power_of_2(unsigned int v)
+{
+ return v && ((v & (v - 1)) == 0);
+}
+
#define ALIGN(x, a) ((((unsigned long long) x) + (a - 1)) & ~(a - 1))
#define BITS_PER_LONG (sizeof(unsigned long) * 8)
#define HPAGE_SIZE (2 << 20)
[PATCH v2 0/2] NVDIMM memory error notification support
by Toshi Kani
ACPI 6.2 defines a new ACPI notification value for the NVDIMM Root Device.
This allows the BIOS to inform the OS that a new uncorrectable memory error
was detected at run time.
This patch set adds support for this notification, which starts ARS and
updates the badblocks information to prevent the OS from consuming the new
error.
v2:
- Set flags Bit[1] for Start ARS. (Dan Williams)
Link: http://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
---
Toshi Kani (2):
1/2 acpi/nfit: Add support of NVDIMM memory error notification in ACPI 6.2
2/2 acpi/nfit: Issue Start ARS to retrieve existing records
---
drivers/acpi/nfit/core.c | 40 ++++++++++++++++++++++++++++++++--------
drivers/acpi/nfit/mce.c | 2 +-
drivers/acpi/nfit/nfit.h | 4 +++-
include/uapi/linux/ndctl.h | 1 +
4 files changed, 37 insertions(+), 10 deletions(-)
[PATCH v2 0/2] mm: force enable thp for dax
by Dan Williams
Changes since v1 [1]:
1/ Fix the transparent_hugepage_enabled() rewrite to be functionally
equivalent to the old state (Ross)
2/ Add a note as to why we are including fs.h in huge_mm.h so that we
remember to clean this up if vma_is_dax() is ever moved, or we add a
VM_* flag for this case. (prompted by Kirill's feedback).
3/ Add some ack and review tags.
[1]: https://www.spinics.net/lists/linux-mm/msg128852.html
---
Hi Andrew,
Please consider taking these 2 patches for 4.13. I spent some time
debugging why a user's device-dax configuration was always failing and
it turned out that their thp policy was set to 'never'. DAX should be
exempt from the policy since it is statically allocated and does not
suffer from any of the potentially negative side effects of thp. More
details in patch 2.
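The intended policy can be summarized with a small decision function. This is a simplified model for illustration, not the kernel's transparent_hugepage_enabled(); the enum, struct demo_vma and demo_thp_enabled() are hypothetical names, and only the 'always', 'madvise' and 'never' settings are modeled.

```c
#include <stdbool.h>

/* Hypothetical model of the system-wide thp policy setting. */
enum demo_thp_policy {
	DEMO_THP_ALWAYS,
	DEMO_THP_MADVISE,
	DEMO_THP_NEVER,
};

/* Hypothetical, reduced view of a mapping. */
struct demo_vma {
	bool is_dax;		/* mapping is backed by DAX */
	bool madvised_hugepage;	/* mapping opted in via madvise */
};

static bool demo_thp_enabled(enum demo_thp_policy policy,
			     const struct demo_vma *vma)
{
	/* DAX is statically allocated, so it is exempt from the policy. */
	if (vma->is_dax)
		return true;

	switch (policy) {
	case DEMO_THP_ALWAYS:
		return true;
	case DEMO_THP_MADVISE:
		return vma->madvised_hugepage;
	case DEMO_THP_NEVER:
		break;
	}
	return false;
}
```

In this model a device-dax mapping gets huge pages even when the policy is 'never', which corresponds to the failing configuration described above.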
---
Dan Williams (2):
mm: improve readability of transparent_hugepage_enabled()
mm: always enable thp for dax mappings
include/linux/dax.h | 5 -----
include/linux/fs.h | 6 ++++++
include/linux/huge_mm.h | 37 ++++++++++++++++++++++++++-----------
3 files changed, 32 insertions(+), 16 deletions(-)
Fwd: Fw: Panic when make check for ndctl
by Yasunori Goto
Hello,
I tried "make check" of the ndctl command on a kernel built from the libnvdimm-for-next branch (*),
but the kernel seems unstable when the nfit_test.ko module is loaded.
(*) git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-for-next
The newest commit is f5705aa8cfed
Here are the console logs from when the kernel panicked; I'll attach the kernel config file.
Does anyone know how to solve this? Am I doing something wrong?
-------
CentOS Linux 7 (Core)
Kernel 4.12.0-rc1+ on an x86_64
goto-guest login: [ 75.737769] random: crng init done
[ 189.205125] nfit_test_iomap: loading out-of-tree module taints kernel.
[ 189.586917] nfit_test nfit_test.0: found a zero length table '0' parsing nfit
[ 191.644655] nfit_test nfit_test.0: failed to evaluate _FIT
[ 191.692708] nfit_test nfit_test.1: nmem4 flags: save_fail restore_fail flush_fail not_armed
[ 191.694838] nfit_test nfit_test.1: nmem5 flags: map_fail
[ 192.734002] nd_pmem namespace6.0: region6 read-only, marking pmem6 read-only
[ 192.743577] pmem7: detected capacity change from 0 to 4194304
[ 192.747323] pmem6: detected capacity change from 0 to 33554432
CentOS Linux 7 (Core)
Kernel 4.12.0-rc1+ on an x86_64
goto-guest login: [ 342.168046] pmem5: detected capacity change from 0 to 67108864
[ 342.253861] pmem5: detected capacity change from 0 to 63590400
[ 342.332442] pmem5: detected capacity change from 0 to 67108864
[ 342.376596] nd_pmem btt5.0: No existing arenas
[ 342.382494] pmem5s: detected capacity change from 0 to 65961984
[ 342.461783] pmem5: detected capacity change from 0 to 67108864
[ 342.542996] ndblk1.0: detected capacity change from 0 to 33554432
[ 342.614674] nd_blk btt1.0: No existing arenas
[ 342.629830] ndblk1.0s: detected capacity change from 0 to 32440320
[ 342.850342] pmem5: detected capacity change from 0 to 67108864
[ 342.935050] pmem5: detected capacity change from 0 to 63590400
[ 343.059219] pfn5.1: bad offset: 0x35b000 dax disabled align: 0x200000
[ 343.062980] pmem5: detected capacity change from 0 to 67108864
[ 343.240452] list_del corruption. next->prev should be ffff92f06f5ba940, but was (null)
[ 343.247955] ------------[ cut here ]------------
[ 343.249663] WARNING: CPU: 2 PID: 15846 at lib/list_debug.c:56 __list_del_entry_valid+0x6c/0xa0
[ 343.252757] Modules linked in: dax_pmem(O) device_dax(O) nd_pmem(O) nd_blk(O) nd_btt(O) nfit_test(O-) nfit(O) libnvdimm(O) nfit_test_iomap(O) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd
ppdev auth_rpcgss nfs_acl lockd parport_pc parport virtio_balloon grace sg pcspkr sunrpc i2c_piix4 i2c_core acpi_cpufreq ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi virtio_net
virtio_console virtio_blk ata_piix libata virtio_pci virtio_ring crc32c_intel serio_raw virtio floppy
[ 343.267337] CPU: 2 PID: 15846 Comm: modprobe Tainted: G O 4.12.0-rc1+ #3
[ 343.269805] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 343.271694] task: ffff92f068bed500 task.stack: ffff9ff05d4ac000
[ 343.273626] RIP: 0010:__list_del_entry_valid+0x6c/0xa0
[ 343.275271] RSP: 0018:ffff9ff05d4afd38 EFLAGS: 00010046
[ 343.276969] RAX: 0000000000000054 RBX: ffff9ff042d09000 RCX: 0000000000000000
[ 343.279206] RDX: 0000000000000000 RSI: ffff92f06f28e0a8 RDI: ffff92f06f28e0a8
[ 343.281545] RBP: ffff9ff05d4afd38 R08: 00000000fffffffe R09: 0000000000000236
[ 343.283737] R10: 0000000000000005 R11: 0000000000000235 R12: ffff9ff05d4afd70
[ 343.285799] R13: ffff9ff042d09000 R14: 0000000000000000 R15: ffff92f066ed3c00
[ 343.287859] FS: 00007f9a9c471740(0000) GS:ffff92f06f280000(0000) knlGS:0000000000000000
[ 343.290175] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 343.291852] CR2: 00007f9a9b914eb0 CR3: 00000003b58cd000 CR4: 00000000000006e0
[ 343.293975] Call Trace:
[ 343.294751] release_nodes+0x76/0x260
[ 343.295881] devres_release_all+0x3c/0x50
[ 343.297070] device_release_driver_internal+0x159/0x200
[ 343.298598] device_release_driver+0x12/0x20
[ 343.299919] bus_remove_device+0xfd/0x170
[ 343.301133] device_del+0x1e8/0x330
[ 343.302180] ? __getnstimeofday64+0x3c/0xd0
[ 343.303412] platform_device_del+0x28/0x90
[ 343.304626] platform_device_unregister+0x12/0x30
[ 343.306065] nfit_test_exit+0x2a/0x93b [nfit_test]
[ 343.307477] SyS_delete_module+0x171/0x250
[ 343.308716] do_syscall_64+0x67/0x150
[ 343.309832] entry_SYSCALL64_slow_path+0x25/0x25
[ 343.311204] RIP: 0033:0x7f9a9b945c27
[ 343.312263] RSP: 002b:00007ffef08b9728 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 343.314468] RAX: ffffffffffffffda RBX: 0000000001888490 RCX: 00007f9a9b945c27
[ 343.316535] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000018884f8
[ 343.318615] RBP: 0000000000000000 R08: 00007f9a9bc09060 R09: 00007f9a9b9b6000
[ 343.320682] R10: 00007ffef08b94b0 R11: 0000000000000206 R12: 0000000000000000
[ 343.322758] R13: 0000000000000001 R14: 00000000018884f8 R15: 0000000000000000
[ 343.324771] Code: 48 89 c2 48 89 fe 31 c0 48 c7 c7 48 6a a7 b9 e8 fa 37 e0 ff 0f ff 31 c0 5d c3 48 89 fe 31 c0 48 c7 c7 f8 6a a7 b9 e8 e3 37 e0 ff <0f> ff 31 c0 5d c3 48 89 fe 31 c0 48 c7 c7 b8 6a
a7 b9 e8 cc 37 [ 343.329870] ---[ end trace 7b42652be22fe7db ]---
[ 343.331207] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 343.333310] IP: __list_del_entry_valid+0x29/0xa0
[ 343.334618] PGD 3b5a8b067 [ 343.334619] P4D 3b5a8b067 [ 343.335365] PUD 3b5ad1067 [ 343.336109] PMD 0 [ 343.336856] [ 343.337886] Oops: 0000 [#1] SMP
[ 343.338809] Modules linked in: dax_pmem(O) device_dax(O) nd_pmem(O) nd_blk(O) nd_btt(O) nfit_test(O-) nfit(O) libnvdimm(O) nfit_test_iomap(O) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd
ppdev auth_rpcgss nfs_acl lockd parport_pc parport virtio_balloon grace sg pcspkr sunrpc i2c_piix4 i2c_core acpi_cpufreq ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi virtio_net
virtio_console virtio_blk ata_piix libata virtio_pci virtio_ring crc32c_intel serio_raw virtio floppy
[ 343.350897] CPU: 2 PID: 15846 Comm: modprobe Tainted: G W O 4.12.0-rc1+ #3
[ 343.352954] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 343.354403] task: ffff92f068bed500 task.stack: ffff9ff05d4ac000
[ 343.355876] RIP: 0010:__list_del_entry_valid+0x29/0xa0
[ 343.357213] RSP: 0018:ffff9ff05d4afd38 EFLAGS: 00010007
[ 343.358515] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000200
[ 343.360275] RDX: 0000000000000000 RSI: ffff9ff05d4afd70 RDI: ffff9ff042d09000
[ 343.362026] RBP: ffff9ff05d4afd38 R08: 00000000fffffffe R09: ffff9ff042d09000
[ 343.363830] R10: 0000000000000005 R11: ffff92f06f5ba920 R12: ffff9ff05d4afd70
[ 343.365498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff92f066ed3c00
[ 343.367137] FS: 00007f9a9c471740(0000) GS:ffff92f06f280000(0000) knlGS:0000000000000000
[ 343.369017] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 343.370362] CR2: 0000000000000000 CR3: 00000003b58cd000 CR4: 00000000000006e0
[ 343.372108] Call Trace:
[ 343.372700] release_nodes+0x76/0x260
[ 343.373560] devres_release_all+0x3c/0x50
[ 343.374532] device_release_driver_internal+0x159/0x200
[ 343.375771] device_release_driver+0x12/0x20
[ 343.376834] bus_remove_device+0xfd/0x170
[ 343.377799] device_del+0x1e8/0x330
[ 343.378670] ? __getnstimeofday64+0x3c/0xd0
[ 343.379657] platform_device_del+0x28/0x90
[ 343.380687] platform_device_unregister+0x12/0x30
[ 343.381796] nfit_test_exit+0x2a/0x93b [nfit_test]
[ 343.382926] SyS_delete_module+0x171/0x250
[ 343.383904] do_syscall_64+0x67/0x150
[ 343.384779] entry_SYSCALL64_slow_path+0x25/0x25
[ 343.385870] RIP: 0033:0x7f9a9b945c27
[ 343.386716] RSP: 002b:00007ffef08b9728 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 343.388484] RAX: ffffffffffffffda RBX: 0000000001888490 RCX: 00007f9a9b945c27
[ 343.390217] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000018884f8
[ 343.391875] RBP: 0000000000000000 R08: 00007f9a9bc09060 R09: 00007f9a9b9b6000
[ 343.393619] R10: 00007ffef08b94b0 R11: 0000000000000206 R12: 0000000000000000
[ 343.395291] R13: 0000000000000001 R14: 00000000018884f8 R15: 0000000000000000
[ 343.397010] Code: 00 00 55 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 08 48 89 e5 48 39 c8 74 27 48 b9 00 02 00 00 00 00 ad de 48 39 ca 74 60 <48> 8b 12 48 39 d7 75 41 48 8b 50 08 48 39 d7 75
21 b8 01 00 00 [ 343.401444] RIP: __list_del_entry_valid+0x29/0xa0 RSP: ffff9ff05d4afd38
[ 343.402972] CR2: 0000000000000000
[ 343.403774] ---[ end trace 7b42652be22fe7dc ]---
[ 343.404872] Kernel panic - not syncing: Fatal exception
[ 343.438096] Kernel Offset: 0x38000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 343.440605] ---[ end Kernel panic - not syncing: Fatal exception
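For readers unfamiliar with the "list_del corruption" warning in the log above, the kind of consistency check behind it can be sketched in user space. This is an illustrative model, not the code in lib/list_debug.c; struct demo_list_head and demo_list_del_entry_valid() are hypothetical names, and the explicit NULL test is an addition made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked list node, modeled on a circular list. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/*
 * Returns true if 'entry' can be unlinked safely, i.e. both neighbours
 * still point back at it.
 */
static bool demo_list_del_entry_valid(const struct demo_list_head *entry)
{
	const struct demo_list_head *prev = entry->prev;
	const struct demo_list_head *next = entry->next;

	if (!prev || !next)
		return false;	/* addition in this sketch: reject NULL links */
	if (prev->next != entry)
		return false;	/* "prev->next should be entry" */
	if (next->prev != entry)
		return false;	/* "next->prev should be entry", as in the log */
	return true;
}
```

A failure of the last test corresponds to the message "next->prev should be ..., but was (null)": the neighbour's back pointer no longer refers to the entry being removed.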
--
Yasunori Goto <y-goto(a)jp.fujitsu.com>
[PATCH] ndctl, completion: show idle namespaces for 'destroy'
by Vishal Verma
The canonical way to destroy a namespace is to first disable it, then
destroy it. Thus typically, namespaces about to be destroyed will be
'idle'. Enable idle namespaces for completion of destroy-namespace.
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
contrib/ndctl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/contrib/ndctl b/contrib/ndctl
index 076423f..fd560cb 100755
--- a/contrib/ndctl
+++ b/contrib/ndctl
@@ -203,7 +203,7 @@ __ndctl_comp_non_option_args()
disable-namespace)
;&
destroy-namespace)
- opts="$(__ndctl_get_ns) all"
+ opts="$(__ndctl_get_ns -i) all"
;;
check-namespace)
opts="$(__ndctl_get_ns -i) all"
--
2.9.3
[PATCH 0/9] libnvdimm, label: namespace specification v1.2 support
by Dan Williams
The recently released UEFI 2.7 Specification [1], includes an updated
version (v1.2) of the NVDIMM Namespace Label Specification that was previously
published on pmem.io [2] (v1.1).
In the process of moving to a UEFI standard definition, the v1.2 update
adds several features for improved cross-OS and pre-OS (EFI driver)
compatibility and safety. The major highlights include:
1/ Support for an "address abstraction guid" so that implementations can
uniquely identify personalities layered on top of a namespace. A
standard address abstraction definition example is the BTT (Block
Translation Table for sector atomicity) layout. A private / local
abstraction definition example is the Linux device-DAX personality.
2/ Checksums for individual label slots
3/ Additional safety and self-consistency properties like an updated
interleave-set-cookie algorithm and recording the NFIT address-type-guid
in the namespace.
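Item 2/ can be illustrated with a per-slot checksum routine. The cover letter does not name the algorithm; the sketch below assumes a Fletcher64-style sum computed with the checksum field zeroed, and struct demo_label is a hypothetical, heavily reduced layout rather than the v1.2 label format. Words are summed in host byte order for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fletcher64-style sum over 32-bit words (assumed algorithm, host order). */
static uint64_t demo_fletcher64(const void *addr, size_t len)
{
	const unsigned char *p = addr;
	uint32_t lo32 = 0;
	uint64_t hi32 = 0;

	for (size_t i = 0; i + sizeof(uint32_t) <= len; i += sizeof(uint32_t)) {
		uint32_t word;

		memcpy(&word, p + i, sizeof(word));
		lo32 += word;
		hi32 += lo32;
	}
	return hi32 << 32 | lo32;
}

/* Hypothetical reduced label slot; the real label has many more fields. */
struct demo_label {
	uint32_t slot;
	uint32_t flags;
	uint64_t checksum;
};

/* Compute the checksum with the checksum field itself treated as zero. */
static uint64_t demo_label_checksum(const struct demo_label *label)
{
	struct demo_label tmp = *label;

	tmp.checksum = 0;
	return demo_fletcher64(&tmp, sizeof(tmp));
}

static bool demo_label_valid(const struct demo_label *label)
{
	return label->checksum == demo_label_checksum(label);
}
```

A reader of a label slot would recompute the sum and reject the slot on mismatch, which allows a torn or corrupted slot to be detected individually.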
UEFI mandates that these labels be accessed through new ACPI methods
_LSI, _LSR, and _LSW (Label Storage {Info,Read,Write}), however support
for those is saved for a later patch series once the ACPICA enabling for
ACPI 6.2 lands in an immutable form in the acpi tree.
These updates pass a run through the nvdimm unit tests and an updated
version of the tests targeting the address-abstraction guid. This set is
based on the 'uuid-types' branch of git.infradead.org/users/hch/uuid.git
which includes Christoph's and Andy's revamp of the kernel's uuid + guid
helper routines.
[1]: http://www.uefi.org/sites/default/files/resources/UEFI_Spec_2_7.pdf
[2]: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
---
Dan Williams (9):
libnvdimm, label: add v1.2 nvdimm label definitions
libnvdimm, label: add v1.2 interleave-set-cookie algorithm
libnvdimm, label: honor the lba size specified in v1.2 labels
libnvdimm, label: populate the type_guid property for v1.2 namespaces
libnvdimm, label: populate 'isetcookie' for blk-aperture namespaces
libnvdimm, label: update 'nlabel' and 'position' handling for local namespaces
libnvdimm, label: add v1.2 label checksum support
libnvdimm, label: add address abstraction identifiers
libnvdimm, label: switch to using v1.2 labels by default
drivers/acpi/nfit/core.c | 67 +++++++++--
drivers/nvdimm/btt_devs.c | 8 +
drivers/nvdimm/claim.c | 28 ++++
drivers/nvdimm/core.c | 3
drivers/nvdimm/dax_devs.c | 8 +
drivers/nvdimm/label.c | 244 +++++++++++++++++++++++++++++++++++----
drivers/nvdimm/label.h | 20 +++
drivers/nvdimm/namespace_devs.c | 211 +++++++++++++++++++++++++++++-----
drivers/nvdimm/nd.h | 13 ++
drivers/nvdimm/pfn_devs.c | 8 +
drivers/nvdimm/pmem.c | 1
drivers/nvdimm/region_devs.c | 43 ++++++-
include/linux/libnvdimm.h | 8 +
include/linux/nd.h | 12 ++
14 files changed, 596 insertions(+), 78 deletions(-)