[PATCH v3 0/2] Support ACPI 6.1 update in NFIT Control Region Structure
by Toshi Kani
ACPI 6.1, Table 5-133, updates NVDIMM Control Region Structure as
follows.
- Valid Fields, Manufacturing Location, and Manufacturing Date
are added from reserved range. No change in the structure size.
- IDs (SPD values) are stored as arrays of bytes (i.e. big-endian
format). The spec clarifies that they need to be represented
as arrays of bytes as well.
Patch 1 changes the NFIT driver to comply with ACPI 6.1.
Patch 2 adds a new sysfs file "id" to show NVDIMM ID defined in ACPI 6.1.
The patch-set applies on linux-pm.git acpica.
link: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
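To illustrate the byte-array point above, here is a minimal userspace sketch, not the driver code from this series. It assumes a simplified, hypothetical struct (nvdimm_id_fields) and helper (format_nvdimm_id); the real layout is struct acpi_nfit_control_region in ACPICA. The output format (vendor-location-date-serial, or vendor-serial when the manufacturing fields are not valid) is also an assumption of this sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified view of the ID fields; not the ACPICA layout. */
struct nvdimm_id_fields {
	uint8_t vendor_id[2];           /* stored as bytes, most significant first */
	uint8_t manufacturing_location;
	uint8_t manufacturing_date[2];
	uint8_t serial_number[4];
	int mfg_info_valid;             /* stand-in for the Valid Fields bit */
};

/*
 * Print the ID bytes in the order they are stored. Because no multi-byte
 * integer load is involved, the result does not depend on host endianness.
 * Returns the value of snprintf().
 */
static int format_nvdimm_id(const struct nvdimm_id_fields *f,
			    char *buf, size_t len)
{
	if (f->mfg_info_valid)
		return snprintf(buf, len,
				"%02x%02x-%02x-%02x%02x-%02x%02x%02x%02x",
				f->vendor_id[0], f->vendor_id[1],
				f->manufacturing_location,
				f->manufacturing_date[0],
				f->manufacturing_date[1],
				f->serial_number[0], f->serial_number[1],
				f->serial_number[2], f->serial_number[3]);

	/* Manufacturing fields not valid: vendor and serial number only. */
	return snprintf(buf, len, "%02x%02x-%02x%02x%02x%02x",
			f->vendor_id[0], f->vendor_id[1],
			f->serial_number[0], f->serial_number[1],
			f->serial_number[2], f->serial_number[3]);
}
```

By contrast, loading the two vendor ID bytes as a native 16-bit integer on a little-endian CPU would swap them, which is why a byte-wise (or explicit big-endian) conversion is needed.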
---
v3:
- Need to coordinate with ACPICA update (Bob Moore, Dan Williams)
- Integrate with ACPICA changes in struct acpi_nfit_control_region.
(commit 138a95547ab0)
v2:
- Remove 'mfg_location' and 'mfg_date'. (Dan Williams)
- Rename 'unique_id' to 'id' and make this change as a separate patch.
(Dan Williams)
---
Toshi Kani (2):
1/2 acpi/nfit: Update nfit driver to comply with ACPI 6.1
2/2 acpi/nfit: Add sysfs "id" for NVDIMM ID
---
drivers/acpi/nfit.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
3 years, 11 months
Re: [PATCH] nvdimm: Remove minimum size requirement
by Soccer Liu
Hi:
While setting up the environment for running the unit tests, I was able to work through the instructions in https://github.com/pmem/ndctl/tree/0a628fdf4fe58a283b16c1bbaa49bb28b1842b... all the way until I hit the following build error (Segmentation fault) when building libnvdimm.o.
Anyone hit this before?
root@ubuntu:/home/soccerl/nvdimm# make M=tools/testing/nvdimm
  AR tools/testing/nvdimm/built-in.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/core.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/bus.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/dimm_devs.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/dimm.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/region_devs.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/region.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/namespace_devs.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/label.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/claim.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/btt_devs.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/pfn_devs.o
  CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/dax_devs.o
  CC [M] tools/testing/nvdimm/config_check.o
  LD [M] tools/testing/nvdimm/libnvdimm.o
Segmentation fault
scripts/Makefile.build:548: recipe for target 'tools/testing/nvdimm/libnvdimm.o' failed
make[1]: *** [tools/testing/nvdimm/libnvdimm.o] Error 139
Makefile:1511: recipe for target '_module_tools/testing/nvdimm' failed
make: *** [_module_tools/testing/nvdimm] Error 2
My devbox runs Linux 4.13. I am not sure whether this has anything to do with the fact that I didn't do anything with ndctl/ndctl.spec.in (because I am not sure how to apply those dependencies to my test box).
Any idea?
Thanks
Cheng-mean
On Thursday, August 31, 2017 3:31 PM, Dan Williams <dan.j.williams(a)intel.com> wrote:
On Mon, Aug 7, 2017 at 11:13 AM, Dan Williams <dan.j.williams(a)intel.com> wrote:
> On Mon, Aug 7, 2017 at 11:09 AM, Cheng-mean Liu (SOCCER)
> <soccerl(a)microsoft.com> wrote:
>> Hi Dan:
>>
>> I am wondering if failing on those unittests is still an issue for this minimum size requirement change.
>
> Yes, I just haven't had a chance to circle back and get this fixed up.
>
> You can reproduce by running:
>
> make TESTS=dpa-alloc check
>
> ...in a checkout of the ndctl project: https://github.com/pmem/ndctl
>
> If you attempt that, note the required setup of the nfit_test modules
> documented in README.md in that same repository.
I have not had any time to fix up the unit test for this. Soccer, can
you take a look?
4 years, 4 months
[PATCH v6 0/8] libnvdimm: add DMA supported blk-mq pmem driver
by Dave Jiang
v6:
- Put all common code for pmem drivers in pmem_core per Dan's suggestion.
- Added support code to get number of available DMA chans
- Fixed up Kconfig so that when pmem is built into the kernel, pmem_dma won't
show up.
v5:
- Added support to report descriptor transfer capability limit from dmaengine.
- Fixed up scatterlist support for dma_unmap_data per Dan's comments.
- Made the driver a separate pmem blk driver per Christoph's suggestion
and also fixed up all the issues pointed out by Christoph.
- Added pmem badblock checking/handling per Robert and also made DMA op to
be used by all buffer sizes.
v4:
- Addressed kbuild test bot issues. Passed kbuild test bot, 179 configs.
v3:
- Added patch to rename DMA_SG to DMA_SG_SG to make it explicit
- Added DMA_MEMCPY_SG transaction type to dmaengine
- Misc patch to add verification of DMA_MEMSET_SG that was missing
- Addressed all nd_pmem driver comments from Ross.
v2:
- Make dma_prep_memcpy_* into one function per Dan.
- Addressed various comments from Ross with code formatting and etc.
- Replaced open code with offset_in_page() macro per Johannes.
The following series implements a blk-mq pmem driver and also adds
infrastructure code to ioatdma and dmaengine to support copying to and
from a scatterlist when processing block requests provided by blk-mq.
Using the DMA engines available on certain platforms allows us to
drastically reduce CPU utilization while maintaining performance that is
good enough. Experiments on a DRAM-backed pmem block device showed that
using the DMA engine is beneficial. By default nd_pmem.ko will be loaded.
This can be overridden through module blacklisting in order to load
nd_pmem_dma.ko.
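As a rough illustration of the scatterlist idea mentioned above, the following userspace sketch splits a byte range into segments that never cross a page boundary, using the same arithmetic as the kernel's offset_in_page() macro. Everything here (SKETCH_PAGE_SIZE, struct segment, split_by_page) is hypothetical and is not taken from the series; a real driver would build a struct scatterlist and hand it to dmaengine.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Userspace stand-in for the kernel's offset_in_page(). */
static size_t sketch_offset_in_page(uintptr_t addr)
{
	return (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
}

/* Hypothetical segment descriptor: one contiguous piece within a page. */
struct segment {
	uintptr_t addr;
	size_t len;
};

/*
 * Split [addr, addr + len) at page boundaries.
 * Returns the number of segments written, or -1 if 'max' is too small.
 */
static int split_by_page(uintptr_t addr, size_t len,
			 struct segment *out, int max)
{
	int n = 0;

	while (len > 0) {
		/* Bytes left before the end of the current page. */
		size_t room = SKETCH_PAGE_SIZE - sketch_offset_in_page(addr);
		size_t chunk = len < room ? len : room;

		if (n == max)
			return -1;
		out[n].addr = addr;
		out[n].len = chunk;
		n++;
		addr += chunk;
		len -= chunk;
	}
	return n;
}
```

For example, an 8192-byte range starting 100 bytes into a page yields three segments of 3996, 4096, and 100 bytes.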
---
Dave Jiang (8):
dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels
dmaengine: Add DMA_MEMCPY_SG transaction op
dmaengine: add verification of DMA_MEMSET_SG in dmaengine
dmaengine: ioatdma: dma_prep_memcpy_sg support
dmaengine: add function to provide per descriptor xfercap for dma engine
dmaengine: add SG support to dmaengine_unmap
dmaengine: provide number of available channels
libnvdimm: Add blk-mq pmem driver
Documentation/dmaengine/provider.txt | 3
drivers/dma/dmaengine.c | 76 ++++
drivers/dma/ioat/dma.h | 4
drivers/dma/ioat/init.c | 6
drivers/dma/ioat/prep.c | 57 +++
drivers/nvdimm/Kconfig | 21 +
drivers/nvdimm/Makefile | 6
drivers/nvdimm/pmem.c | 264 ---------------
drivers/nvdimm/pmem.h | 48 +++
drivers/nvdimm/pmem_core.c | 298 +++++++++++++++++
drivers/nvdimm/pmem_dma.c | 606 ++++++++++++++++++++++++++++++++++
include/linux/dmaengine.h | 49 +++
12 files changed, 1170 insertions(+), 268 deletions(-)
create mode 100644 drivers/nvdimm/pmem_core.c
create mode 100644 drivers/nvdimm/pmem_dma.c
4 years, 6 months
Re: KVM "fake DAX" flushing interface - discussion
by Dan Williams
On Wed, Jul 26, 2017 at 2:27 PM, Rik van Riel <riel(a)redhat.com> wrote:
> On Wed, 2017-07-26 at 09:47 -0400, Pankaj Gupta wrote:
>> >
>> Just want to summarize here(high level):
>>
>> This will require implementing new 'virtio-pmem' device which
>> presents
>> a DAX address range(like pmem) to guest with read/write(direct
>> access)
>> & device flush functionality. Also, qemu should implement
>> corresponding
>> support for flush using virtio.
>>
> Alternatively, the existing pmem code, with
> a flush-only block device on the side, which
> is somehow associated with the pmem device.
>
> I wonder which alternative leads to the least
> code duplication, and the least maintenance
> hassle going forward.
I'd much prefer to have another driver. I.e. a driver that refactors
out some common pmem details into a shared object and can attach to
ND_DEVICE_NAMESPACE_{IO,PMEM}. A control device on the side seems like
a recipe for confusion.
With a $new_driver in hand you can just do:
modprobe $new_driver
echo $namespace > /sys/bus/nd/drivers/nd_pmem/unbind
echo $namespace > /sys/bus/nd/drivers/$new_driver/new_id
echo $namespace > /sys/bus/nd/drivers/$new_driver/bind
...and the guest can arrange for $new_driver to be the default, so you
don't need to do those steps each boot of the VM, by doing:
echo "blacklist nd_pmem" > /etc/modprobe.d/virt-dax-flush.conf
echo "alias nd:t4* $new_driver" >> /etc/modprobe.d/virt-dax-flush.conf
echo "alias nd:t5* $new_driver" >> /etc/modprobe.d/virt-dax-flush.conf
4 years, 6 months
Enabling peer to peer device transactions for PCIe devices
by Deucher, Alexander
This is certainly not the first time this has been brought up, but I'd like to try and get some consensus on the best way to move this forward. Allowing devices to talk directly improves performance and reduces latency by avoiding the use of staging buffers in system memory. Also in cases where both devices are behind a switch, it avoids the CPU entirely. Most current APIs (DirectGMA, PeerDirect, CUDA, HSA) that deal with this are pointer based. Ideally we'd be able to take a CPU virtual address and be able to get to a physical address taking into account IOMMUs, etc. Having struct pages for the memory would allow it to work more generally and wouldn't require as much explicit support in drivers that wanted to use it.
Some use cases:
1. Storage devices streaming directly to GPU device memory
2. GPU device memory to GPU device memory streaming
3. DVB/V4L/SDI devices streaming directly to GPU device memory
4. DVB/V4L/SDI devices streaming directly to storage devices
Here is a relatively simple example of how this could work for testing. This is obviously not a complete solution.
- Device memory will be registered with the Linux memory sub-system by creating corresponding struct page structures for device memory
- get_user_pages_fast() will return corresponding struct pages when CPU address points to the device memory
- put_page() will deal with struct pages for device memory
Previously proposed solutions and related proposals:
1. P2P DMA
DMA-API/PCI map_peer_resource support for peer-to-peer (http://www.spinics.net/lists/linux-pci/msg44560.html)
Pros: Low impact, already largely reviewed.
Cons: requires explicit support in all drivers that want to support it, doesn't handle S/G in device memory.
2. ZONE_DEVICE IO
Direct I/O and DMA for persistent memory (https://lwn.net/Articles/672457/)
Add support for ZONE_DEVICE IO memory with struct pages. (https://patchwork.kernel.org/patch/8583221/)
Pros: Doesn't waste system memory for ZONE metadata
Cons: CPU access to ZONE metadata slow, may be lost, corrupted on device reset.
3. DMA-BUF
RDMA subsystem DMA-BUF support (http://www.spinics.net/lists/linux-rdma/msg38748.html)
Pros: uses existing dma-buf interface
Cons: dma-buf is handle based, requires explicit dma-buf support in drivers.
4. iopmem
iopmem : A block device for PCIe memory (https://lwn.net/Articles/703895/)
5. HMM
Heterogeneous Memory Management (http://lkml.iu.edu/hypermail/linux/kernel/1611.2/02473.html)
6. Some new mmap-like interface that takes a userptr and a length and returns a dma-buf and offset?
Alex
4 years, 6 months
[RFC patch 1/4] ndctl: nvdimmd: notify/monitor the features of over threshold event
by Qi, Fuli
libnvdimmd.c provides functions used by the nvdimm daemon; currently it
only supports logging.
libnvdimmd.h is the header file for libnvdimmd.c.
Since I do not use automake, I temporarily defined getenv.h so that the code compiles without it.
A better approach is needed here.
Signed-off-by: QI Fuli <qi.fuli(a)jp.fujitsu.com>
---
nvdimmd/Makefile | 7 +++
nvdimmd/getenv.h | 1 +
nvdimmd/libnvdimmd.c | 141 +++++++++++++++++++++++++++++++++++++++++++++++++++
nvdimmd/libnvdimmd.h | 31 +++++++++++
4 files changed, 180 insertions(+)
diff --git a/nvdimmd/Makefile b/nvdimmd/Makefile
new file mode 100644
index 0000000..a20a747
--- /dev/null
+++ b/nvdimmd/Makefile
@@ -0,0 +1,7 @@
+CC = gcc
+IDIR = -I../ -I../ndctl
+
+libnvdimmd.o: libnvdimmd.c
+ $(CC) -o libnvdimmd.o $(IDIR) -c libnvdimmd.c
+clean:
+ rm -rf *.o
diff --git a/nvdimmd/getenv.h b/nvdimmd/getenv.h
new file mode 100644
index 0000000..45747b4
--- /dev/null
+++ b/nvdimmd/getenv.h
@@ -0,0 +1 @@
+#define HAVE_SECURE_GETENV 1
diff --git a/nvdimmd/libnvdimmd.c b/nvdimmd/libnvdimmd.c
new file mode 100644
index 0000000..89ff701
--- /dev/null
+++ b/nvdimmd/libnvdimmd.c
@@ -0,0 +1,141 @@
+/*
+ * Copyright (c) 2017, FUJITSU LIMITED. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU Lesser General Public License,
+ * version 2.1, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT ANY
+ * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
+ * more details.
+ */
+
+/*
+ * This program is used to provide nvdimm daemon necessary functions.
+ */
+
+#include "getenv.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <syslog.h>
+#include <ndctl/libndctl.h>
+#include <ndctl/lib/libndctl-private.h>
+#include "libnvdimmd.h"
+#define BUF_SIZE 4096
+
+static int get_health_info(threshold_dimm *t_dimm)
+{
+ struct ndctl_cmd *cmd;
+ int rc;
+ unsigned int flags;
+ char *msg = "nvdimm warning: dimm over threshold notify";
+ char *err_msg;
+
+ cmd = ndctl_dimm_cmd_new_smart(t_dimm->dimm);
+ if (!cmd) {
+ err_msg = "failed to prepare command to get health info";
+ syslog(LOG_WARNING, "%s [%s], %s\n", msg, t_dimm->devname, err_msg);
+ return -1;
+ }
+
+ rc = ndctl_cmd_submit(cmd);
+ if (rc || ndctl_cmd_get_firmware_status(cmd)) {
+ err_msg = "failed to submit command to get health info";
+ syslog(LOG_WARNING, "%s [%s], %s\n", msg, t_dimm->devname, err_msg);
+ ndctl_cmd_unref(cmd);
+ return -1;
+ }
+
+ flags = ndctl_cmd_smart_get_flags(cmd);
+ if (flags & ND_SMART_HEALTH_VALID) {
+ unsigned int health = ndctl_cmd_smart_get_health(cmd);
+ if (health & ND_SMART_FATAL_HEALTH)
+ t_dimm->health_state = "fatal";
+ else if (health & ND_SMART_CRITICAL_HEALTH)
+ t_dimm->health_state = "critical";
+ else if (health & ND_SMART_NON_CRITICAL_HEALTH)
+ t_dimm->health_state = "non-critical";
+ else
+ t_dimm->health_state = "ok";
+ } else {
+ t_dimm->health_state = "failed to get data";
+ }
+ if (flags & ND_SMART_SPARES_VALID)
+ t_dimm->spares = ndctl_cmd_smart_get_spares(cmd);
+ else
+ t_dimm->spares = -1;
+
+ ndctl_cmd_unref(cmd);
+ return 0;
+}
+
+int log_notify(threshold_dimm *t_dimm, int count_dimm, fd_set fds, int count_select)
+{
+ int log_notify = 0;
+ char *msg = "nvdimm warning: dimm over threshold notify";
+
+ for (int i = 0; i < count_dimm; i++) {
+ if (log_notify >= count_select)
+ break;
+
+ if (!FD_ISSET(t_dimm[i].health_eventfd, &fds))
+ continue;
+
+ log_notify++;
+ if (get_health_info(&t_dimm[i]))
+ continue;
+
+ if (t_dimm[i].spares == -1) {
+ syslog(LOG_WARNING,
+ "%s [%s]\nhealth_state: %s\n"
+ "spares_percentage: failed to get data\n",
+ msg, t_dimm[i].devname, t_dimm[i].health_state);
+ continue;
+ }
+ syslog(LOG_WARNING,"%s [%s]\nhealth_state: %s\nspares_percentage: %d\n",
+ msg, t_dimm[i].devname, t_dimm[i].health_state, t_dimm[i].spares);
+ }
+ return log_notify;
+}
+
+static struct ndctl_dimm *is_supported_threshold_notify(struct ndctl_dimm *dimm)
+{
+ if (ndctl_dimm_is_cmd_supported(dimm, ND_CMD_SMART_THRESHOLD))
+ return dimm;
+ return NULL;
+}
+
+int
+get_threshold_dimm(struct ndctl_ctx *ctx, threshold_dimm *t_dimm, fd_set *fds, int *maxfd)
+{
+ struct ndctl_bus *bus;
+ struct ndctl_dimm *dimm;
+ char buf[BUF_SIZE];
+ int fd, count_dimm = 0;
+
+ ndctl_bus_foreach(ctx, bus) {
+ ndctl_dimm_foreach(bus, dimm) {
+
+ if (!is_supported_threshold_notify(dimm))
+ continue;
+ t_dimm[count_dimm].dimm = dimm;
+ t_dimm[count_dimm].devname = ndctl_dimm_get_devname(dimm);
+ fd = ndctl_dimm_get_health_eventfd(dimm);
+ read(fd, buf, sizeof(buf));
+ t_dimm[count_dimm].health_eventfd = fd;
+
+ if (fds)
+ FD_SET(fd, fds);
+ if (maxfd) {
+ if (*maxfd < fd)
+ *maxfd = fd;
+ }
+ count_dimm++;
+ }
+ }
+ return count_dimm;
+}
diff --git a/nvdimmd/libnvdimmd.h b/nvdimmd/libnvdimmd.h
new file mode 100644
index 0000000..78e5871
--- /dev/null
+++ b/nvdimmd/libnvdimmd.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright (c) 2017, FUJITSU LIMITED. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU Lesser General Public License,
+ * version 2.1, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT ANY
+ * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
+ * more details.
+ */
+
+#ifndef _LIBNVDIMMD_H_
+#define _LIBNVDIMMD_H_
+
+#include <stdio.h>
+#define NUM_MAX_DIMM 1024
+
+typedef struct {
+ struct ndctl_dimm *dimm;
+ const char *devname;
+ int health_eventfd;
+ int spares;
+ char *health_state;
+} threshold_dimm;
+
+int log_notify(threshold_dimm *t_dimm, int count_dimm, fd_set fds, int count_select);
+int get_threshold_dimm(struct ndctl_ctx *ctx, threshold_dimm *t_dimm, fd_set *fds, int *maxfd);
+
+#endif
--
QI Fuli <qi.fuli(a)jp.fujitsu.com>
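The fd_set and maxfd bookkeeping done by get_threshold_dimm() is the usual preparation for a select() loop. The sketch below shows that pattern in isolation, under stated assumptions: it uses generic descriptors and the read set, and the helper names (build_fd_set, wait_ready) are hypothetical and not part of the patch. A daemon watching sysfs health attributes may need to pass the set as the exceptional-condition argument of select() instead.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Add every descriptor to 'set' and return the highest one (-1 if none). */
static int build_fd_set(const int *fds, int n, fd_set *set)
{
	int maxfd = -1;

	FD_ZERO(set);
	for (int i = 0; i < n; i++) {
		FD_SET(fds[i], set);
		if (fds[i] > maxfd)
			maxfd = fds[i];
	}
	return maxfd;
}

/*
 * Wait up to timeout_ms for any descriptor to become readable.
 * On return 'ready' holds the readable descriptors; the return value is
 * their count, 0 on timeout, or -1 on error (as reported by select()).
 */
static int wait_ready(const int *fds, int n, int timeout_ms, fd_set *ready)
{
	struct timeval tv = {
		.tv_sec = timeout_ms / 1000,
		.tv_usec = (timeout_ms % 1000) * 1000,
	};
	int maxfd = build_fd_set(fds, n, ready);

	if (maxfd < 0)
		return 0;
	/* select() takes the highest descriptor plus one. */
	return select(maxfd + 1, ready, NULL, NULL, &tv);
}
```

The count returned by wait_ready() corresponds to the count_select argument of log_notify(), which stops scanning once that many signalled descriptors have been handled.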
4 years, 7 months
[PATCH v2 0/5] ext4: DAX data corruption fixes
by Ross Zwisler
This series prevents a pair of data corruptions with ext4 + DAX. The first
such corruption happens when combining the inline data feature with DAX,
and the second happens when combining data journaling with DAX.
Both can be reliably reproduced with the fstests that I have posted here:
https://patchwork.kernel.org/patch/9948377/
https://patchwork.kernel.org/patch/9948381/
My opinion is that the first three patches in this series should be applied
to the v4.14 RC series and backported to stable. The last two patches in
this series are just cleanup and can probably wait until v4.15.
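The conditions covered by this series can be summarized as a single predicate. The following is a userspace model only, with a hypothetical struct inode_model standing in for the real inode and superblock state; it is a sketch of the idea behind the ext4_should_use_dax() patch, not its actual code.

```c
#include <stdbool.h>

/* Hypothetical model of the inode properties relevant to DAX. */
struct inode_model {
	bool mounted_dax;      /* filesystem mounted with the dax option */
	bool regular_file;     /* inode is a regular file */
	bool journals_data;    /* data journaling applies to this inode */
	bool has_inline_data;  /* file data is stored inline in the inode */
	bool encrypted;        /* inode is encrypted */
};

/* DAX is used only when none of the conflicting features is active. */
static bool should_use_dax(const struct inode_model *ino)
{
	if (!ino->mounted_dax)
		return false;
	if (!ino->regular_file)
		return false;
	if (ino->journals_data)
		return false;
	if (ino->has_inline_data)
		return false;
	if (ino->encrypted)
		return false;
	return true;
}
```

Keeping all checks in one helper means every place that sets or clears the DAX flag on an inode applies the same rules.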
Ross Zwisler (5):
ext4: prevent data corruption with inline data + DAX
ext4: prevent data corruption with journaling + DAX
ext4: add sanity check for encryption + DAX
ext4: add ext4_should_use_dax()
ext4: remove duplicate extended attributes defs
fs/ext4/ext4.h | 37 -------------------------------------
fs/ext4/inline.c | 10 ----------
fs/ext4/inode.c | 24 ++++++++++++++++--------
fs/ext4/ioctl.c | 16 +++++++++++++---
fs/ext4/super.c | 8 ++++++++
5 files changed, 37 insertions(+), 58 deletions(-)
--
2.9.5
4 years, 7 months
[RFC 00/16] NOVA: a new file system for persistent memory
by Steven Swanson
This is an RFC patch series that implements NOVA (NOn-Volatile memory
Accelerated file system), a new file system built for PMEM.
NOVA's goal is to provide a high-performance, full-featured, production-ready
file system tailored for byte-addressable non-volatile memories (e.g., NVDIMMs
and Intel's soon-to-be-released 3DXpoint DIMMs). It combines design elements
from many other file systems to provide a combination of high-performance,
strong consistency guarantees, and comprehensive data protection. NOVA supports
DAX-style mmap, and making DAX perform well is a first-order priority in NOVA's
design.
NOVA was developed at the Non-Volatile Systems Laboratory in the Computer
Science and Engineering Department at the University of California, San Diego.
Its primary authors are Andiry Xu <jix024(a)eng.ucsd.edu>, Lu Zhang
<luzh(a)eng.ucsd.edu>, and Steven Swanson <swanson(a)eng.ucsd.edu>.
NOVA is stable enough to run complex applications, but there is substantial
work left to do. This RFC is intended to gather feedback to guide its
development toward eventual inclusion upstream.
The patches are relative to Linux 4.12.
Overview
========
NOVA is primarily a log-structured file system, but rather than maintain a
single global log for the entire file system, it maintains separate logs for
each file (inode). NOVA breaks the logs into 4KB pages, which need not be
contiguous in memory. The logs only contain metadata.
File data pages reside outside the log, and log entries for write operations
point to data pages they modify. File modification uses copy-on-write (COW) to
provide atomic file updates.
For file operations that involve multiple inodes, NOVA uses small, fixed-sized
redo logs to atomically append log entries to the logs of the inodes involved.
This structure keeps logs small and makes garbage collection very fast. It also
enables enormous parallelism during recovery from an unclean unmount, since
threads can scan logs in parallel.
NOVA replicates and checksums all metadata structures and protects file data
with RAID-4-style parity. It supports checkpoints to facilitate backups.
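As a generic illustration of RAID-4-style parity (not NOVA's implementation), the sketch below XORs the stripes of one block into a parity stripe and rebuilds a single lost stripe from the survivors. The stripe count and size (8 stripes of 512 bytes for a 4KB block) and the function names are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STRIPES     8	/* assumed: a 4KB block split into 8 stripes */
#define STRIPE_SIZE 512	/* assumed stripe size in bytes */

/* XOR all stripes of 'block' together into 'parity'. */
static void compute_parity(const uint8_t *block, uint8_t *parity)
{
	memset(parity, 0, STRIPE_SIZE);
	for (size_t s = 0; s < STRIPES; s++)
		for (size_t i = 0; i < STRIPE_SIZE; i++)
			parity[i] ^= block[s * STRIPE_SIZE + i];
}

/*
 * Rebuild stripe 'lost' in place: start from the parity and XOR in every
 * surviving stripe. This recovers exactly one damaged stripe per block.
 */
static void rebuild_stripe(uint8_t *block, const uint8_t *parity, size_t lost)
{
	uint8_t *dst = block + lost * STRIPE_SIZE;

	memcpy(dst, parity, STRIPE_SIZE);
	for (size_t s = 0; s < STRIPES; s++) {
		if (s == lost)
			continue;
		for (size_t i = 0; i < STRIPE_SIZE; i++)
			dst[i] ^= block[s * STRIPE_SIZE + i];
	}
}
```

Parity alone cannot tell which stripe is damaged; some separate means of detecting the bad stripe, such as checksums, is needed before rebuild_stripe() can be applied.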
Documentation/filesystems/nova.txt contains some lower-level implementation and
usage information. A more thorough discussion of NOVA's goals and design is
available in two papers:
NOVA: A Log-structured File system for Hybrid Volatile/Non-volatile Main Memories
http://cseweb.ucsd.edu/~swanson/papers/FAST2016NOVA.pdf
Jian Xu and Steven Swanson
Published in FAST 2016
Hardening the NOVA File System
http://cseweb.ucsd.edu/~swanson/papers/TechReport2017HardenedNOVA.pdf
UCSD-CSE Techreport CS2017-1018
Jian Xu, Lu Zhang, Amirsaman Memaripour, Akshatha
Gangadharaiah, Amit Borase, Tamires Brito Da Silva, Andy Rudoff, Steven
Swanson
-steve
---
Steven Swanson (16):
NOVA: Documentation
NOVA: Superblock and fs layout
NOVA: PMEM allocation system
NOVA: Inode operations and structures
NOVA: Log data structures and operations
NOVA: Lite-weight journaling for complex ops
NOVA: File and directory operations
NOVA: Garbage collection
NOVA: DAX code
NOVA: File data protection
NOVA: Snapshot support
NOVA: Recovery code
NOVA: Sysfs and ioctl
NOVA: Read-only pmem devices
NOVA: Performance measurement
NOVA: Build infrastructure
Documentation/filesystems/00-INDEX | 2
Documentation/filesystems/nova.txt | 771 +++++++++++++++++
MAINTAINERS | 8
README.md | 173 ++++
arch/x86/include/asm/io.h | 1
arch/x86/mm/fault.c | 11
arch/x86/mm/ioremap.c | 25 -
drivers/nvdimm/pmem.c | 14
fs/Kconfig | 2
fs/Makefile | 1
fs/nova/Kconfig | 15
fs/nova/Makefile | 9
fs/nova/balloc.c | 827 +++++++++++++++++++
fs/nova/balloc.h | 118 +++
fs/nova/bbuild.c | 1602 ++++++++++++++++++++++++++++++++++++
fs/nova/checksum.c | 912 ++++++++++++++++++++
fs/nova/dax.c | 1346 ++++++++++++++++++++++++++++++
fs/nova/dir.c | 760 +++++++++++++++++
fs/nova/file.c | 943 +++++++++++++++++++++
fs/nova/gc.c | 739 +++++++++++++++++
fs/nova/inode.c | 1467 +++++++++++++++++++++++++++++++++
fs/nova/inode.h | 389 +++++++++
fs/nova/ioctl.c | 185 ++++
fs/nova/journal.c | 474 +++++++++++
fs/nova/journal.h | 61 +
fs/nova/log.c | 1411 ++++++++++++++++++++++++++++++++
fs/nova/log.h | 333 +++++++
fs/nova/mprotect.c | 604 ++++++++++++++
fs/nova/mprotect.h | 190 ++++
fs/nova/namei.c | 919 +++++++++++++++++++++
fs/nova/nova.h | 1137 ++++++++++++++++++++++++++
fs/nova/nova_def.h | 154 +++
fs/nova/parity.c | 411 +++++++++
fs/nova/perf.c | 594 +++++++++++++
fs/nova/perf.h | 96 ++
fs/nova/rebuild.c | 847 +++++++++++++++++++
fs/nova/snapshot.c | 1407 ++++++++++++++++++++++++++++++++
fs/nova/snapshot.h | 98 ++
fs/nova/stats.c | 685 +++++++++++++++
fs/nova/stats.h | 218 +++++
fs/nova/super.c | 1222 +++++++++++++++++++++++++++
fs/nova/super.h | 216 +++++
fs/nova/symlink.c | 153 +++
fs/nova/sysfs.c | 543 ++++++++++++
include/linux/io.h | 2
include/linux/mm.h | 2
include/linux/mm_types.h | 3
kernel/memremap.c | 24 +
mm/memory.c | 2
mm/mmap.c | 1
mm/mprotect.c | 13
51 files changed, 22129 insertions(+), 11 deletions(-)
create mode 100644 Documentation/filesystems/nova.txt
create mode 100644 README.md
create mode 100644 fs/nova/Kconfig
create mode 100644 fs/nova/Makefile
create mode 100644 fs/nova/balloc.c
create mode 100644 fs/nova/balloc.h
create mode 100644 fs/nova/bbuild.c
create mode 100644 fs/nova/checksum.c
create mode 100644 fs/nova/dax.c
create mode 100644 fs/nova/dir.c
create mode 100644 fs/nova/file.c
create mode 100644 fs/nova/gc.c
create mode 100644 fs/nova/inode.c
create mode 100644 fs/nova/inode.h
create mode 100644 fs/nova/ioctl.c
create mode 100644 fs/nova/journal.c
create mode 100644 fs/nova/journal.h
create mode 100644 fs/nova/log.c
create mode 100644 fs/nova/log.h
create mode 100644 fs/nova/mprotect.c
create mode 100644 fs/nova/mprotect.h
create mode 100644 fs/nova/namei.c
create mode 100644 fs/nova/nova.h
create mode 100644 fs/nova/nova_def.h
create mode 100644 fs/nova/parity.c
create mode 100644 fs/nova/perf.c
create mode 100644 fs/nova/perf.h
create mode 100644 fs/nova/rebuild.c
create mode 100644 fs/nova/snapshot.c
create mode 100644 fs/nova/snapshot.h
create mode 100644 fs/nova/stats.c
create mode 100644 fs/nova/stats.h
create mode 100644 fs/nova/super.c
create mode 100644 fs/nova/super.h
create mode 100644 fs/nova/symlink.c
create mode 100644 fs/nova/sysfs.c
4 years, 7 months
[PATCH v2 0/4] dax: require 'struct page' and other fixups
by Dan Williams
Changes since v1 [1]:
* quiet bdev_dax_supported() in favor of error messages emitted by the
caller (Jeff)
* fix leftover parentheses in vma_merge (Jeff)
* improve the changelog for "dax: stop using VM_MIXEDMAP for dax"
[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-September/012645.html
---
Prompted by a recent change to add more protection around setting up
'vm_flags' for a dax vma [1], rework the implementation to remove the
requirement to set VM_MIXEDMAP and VM_HUGEPAGE.
VM_MIXEDMAP is used by dax to tell mm paths like vm_normal_page() that
the memory page they are dealing with is not typical memory from the linear
map. The get_user_pages_fast() path, since it does not resolve the vma,
is already using {pte,pmd}_devmap() as a stand-in for VM_MIXEDMAP, so we
use that as a VM_MIXEDMAP replacement in some locations. In the cases
where there is no pte to consult we fall back to using vma_is_dax() to
detect the VM_MIXEDMAP special case.
This patch series passes a run of the ndctl unit test suite and the
'mmap.sh' [2] test in particular. 'mmap.sh' tries to catch dependencies
on VM_MIXEDMAP and {pte,pmd}_devmap().
[1]: https://lkml.org/lkml/2017/9/25/638
[2]: https://github.com/pmem/ndctl/blob/master/test/mmap.sh
---
Dan Williams (4):
dax: quiet bdev_dax_supported()
dax: disable filesystem dax on devices that do not map pages
dax: stop using VM_MIXEDMAP for dax
dax: stop using VM_HUGEPAGE for dax
drivers/dax/device.c | 1 -
drivers/dax/super.c | 15 +++++++++++----
fs/ext2/file.c | 1 -
fs/ext4/file.c | 1 -
fs/xfs/xfs_file.c | 2 --
mm/huge_memory.c | 8 ++++----
mm/ksm.c | 3 +++
mm/madvise.c | 2 +-
mm/memory.c | 20 ++++++++++++++++++--
mm/migrate.c | 3 ++-
mm/mlock.c | 3 ++-
mm/mmap.c | 3 ++-
12 files changed, 43 insertions(+), 19 deletions(-)
4 years, 7 months