[PATCH v2 0/3] Add support for memcpy_mcsafe
by Balbir Singh
memcpy_mcsafe() is an API currently used by the pmem subsystem to convert
errors while doing a memcpy (machine check exception errors) to a return
value. This patchset consists of three patches:
1. The first patch is a bug fix to handle machine check errors correctly
while walking the page tables in kernel mode, due to huge pmd/pud sizes
2. The second patch adds memcpy_mcsafe() support; this is largely derived
from existing code
3. The third patch registers for callbacks on machine check exceptions and
in them uses specialized knowledge of the type of page to decide whether
to handle the MCE as is or to return to a fixup address present in
memcpy_mcsafe(). If a fixup address is used, then we return an error
value of -EFAULT to the caller.
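As a rough userspace illustration of the calling convention described in item 3, the sketch below mimics a copy routine that returns 0 on success and -EFAULT when the source range touches a "poisoned" region. The names mock_memcpy_mcsafe(), read_record(), poison_start and poison_len are hypothetical stand-ins; the real routine relies on the machine check handler and a fixup address, not on an explicit range check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical poisoned range standing in for memory that would raise an MCE. */
static uintptr_t poison_start;
static size_t poison_len;

/*
 * Mock of the calling convention only: returns 0 on success, or -EFAULT if
 * the source range overlaps the simulated poisoned range.
 */
static int mock_memcpy_mcsafe(void *dst, const void *src, size_t len)
{
	uintptr_t s = (uintptr_t)src;

	assert(dst != NULL && src != NULL);
	if (poison_len && len && s < poison_start + poison_len &&
	    poison_start < s + len)
		return -EFAULT;

	memcpy(dst, src, len);
	return 0;
}

/* Caller pattern: propagate the error instead of consuming bad data. */
static int read_record(void *out, const void *pmem_addr, size_t len)
{
	int rc = mock_memcpy_mcsafe(out, pmem_addr, len);

	if (rc)
		return rc; /* -EFAULT: the copy hit a (simulated) machine check */
	return 0;
}
```

The point of the pattern is that the caller sees an ordinary error return and can fail the request, rather than the whole system going down on the exception.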
Testing
A large part of the testing was done under a simulator by selectively
inserting machine check exceptions in a test driver doing memcpy_mcsafe
via ioctls.
Changelog v2
- Fix the logic of shifting in addr_to_pfn
- Use shift consistently instead of PAGE_SHIFT
- Fix a typo in patch1
Balbir Singh (3):
powerpc/mce: Bug fixes for MCE handling in kernel space
powerpc/memcpy: Add memcpy_mcsafe for pmem
powerpc/mce: Handle memcpy_mcsafe
arch/powerpc/include/asm/mce.h | 3 +-
arch/powerpc/include/asm/string.h | 2 +
arch/powerpc/kernel/mce.c | 77 ++++++++++++-
arch/powerpc/kernel/mce_power.c | 26 +++--
arch/powerpc/lib/Makefile | 2 +-
arch/powerpc/lib/memcpy_mcsafe_64.S | 212 ++++++++++++++++++++++++++++++++++++
6 files changed, 308 insertions(+), 14 deletions(-)
create mode 100644 arch/powerpc/lib/memcpy_mcsafe_64.S
--
2.13.6
2 years, 5 months
[PATCH 0/5] fix radix tree multi-order iteration race
by Ross Zwisler
The following series gets the radix tree test suite compiling again in
the current linux/master, adds a unit test which exposes a race in the
radix tree multi-order iteration code, and then fixes that race.
This race was initially hit on a v4.15 based kernel and results in a GP
fault. I've described the race in detail in patches 4 and 5.
The fix is simple and necessary, and I think it should be merged for
v4.17.
This tree has gotten positive build confirmation from the 0-day bot,
passes the updated radix tree test suite, xfstests, and the original
test that was hitting the race with the v4.15 based kernel.
Ross Zwisler (5):
radix tree test suite: fix mapshift build target
radix tree test suite: fix compilation issue
radix tree test suite: add item_delete_rcu()
radix tree test suite: multi-order iteration race
radix tree: fix multi-order iteration race
lib/radix-tree.c | 6 ++--
tools/include/linux/spinlock.h | 3 +-
tools/testing/radix-tree/Makefile | 6 ++--
tools/testing/radix-tree/multiorder.c | 63 +++++++++++++++++++++++++++++++++++
tools/testing/radix-tree/test.c | 19 +++++++++++
tools/testing/radix-tree/test.h | 3 ++
6 files changed, 91 insertions(+), 9 deletions(-)
--
2.14.3
2 years, 6 months
Re: Detecting NUMA per pmem
by Oren Berman
Hi Ross
Thanks for the speedy reply. I am also adding the public list to this
thread as you suggested.
We have tried to dump the SPA table and this is what we get:
/*
* Intel ACPI Component Architecture
* AML/ASL+ Disassembler version 20160108-64
* Copyright (c) 2000 - 2016 Intel Corporation
*
* Disassembly of NFIT, Sun Oct 22 10:46:19 2017
*
* ACPI Data Table [NFIT]
*
* Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue
*/
[000h 0000 4] Signature : "NFIT" [NVDIMM Firmware Interface Table]
[004h 0004 4] Table Length : 00000028
[008h 0008 1] Revision : 01
[009h 0009 1] Checksum : B2
[00Ah 0010 6] Oem ID : "SUPERM"
[010h 0016 8] Oem Table ID : "SMCI--MB"
[018h 0024 4] Oem Revision : 00000001
[01Ch 0028 4] Asl Compiler ID : " "
[020h 0032 4] Asl Compiler Revision : 00000001
[024h 0036 4] Reserved : 00000000
Raw Table Data: Length 40 (0x28)
0000: 4E 46 49 54 28 00 00 00 01 B2 53 55 50 45 52 4D // NFIT(.....SUPERM
0010: 53 4D 43 49 2D 2D 4D 42 01 00 00 00 01 00 00 00 // SMCI--MB........
0020: 01 00 00 00 00 00 00 00
As you can see the memory region info is missing.
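To make the observation concrete: the NFIT header is 40 bytes (the 36-byte ACPI table header plus 4 reserved bytes), and each subtable after it starts with a 2-byte type and a 2-byte length. A table whose length is 0x28 therefore has no room for any subtable. The following is a minimal sketch of such a walk over a raw table buffer; nfit_count_subtables() and rd16() are hypothetical helper names, not kernel or ACPICA functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NFIT_HEADER_LEN 40u /* 36-byte ACPI header + 4 reserved bytes */

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Count the NFIT subtables of the given type. Each subtable starts with a
 * 2-byte type followed by a 2-byte length. Returns -1 on a malformed table.
 */
static int nfit_count_subtables(const uint8_t *tbl, size_t len, uint16_t type)
{
	size_t off = NFIT_HEADER_LEN;
	int count = 0;

	if (len < NFIT_HEADER_LEN)
		return -1;

	while (off + 4 <= len) {
		uint16_t sub_type = rd16(tbl + off);
		uint16_t sub_len = rd16(tbl + off + 2);

		if (sub_len < 4 || off + sub_len > len)
			return -1;
		if (sub_type == type)
			count++;
		off += sub_len;
	}
	return count;
}
```

Run against the 40-byte dump above, a walk like this finds zero subtables of type 0 (System Physical Address Range), which matches the missing memory region info.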
This specific check was done on a supermicro server.
We also performed a bios update but the results were the same.
As said before, the pmem devices are detected correctly and we verified
that they correspond to different numa nodes using the PCM utility. However,
linux still reports both pmem devices to be on the same numa node (node 0).
If this information is missing, why are the pmem devices and address ranges
still detected correctly?
Is there another table that we need to check?
I also ran dmidecode and the NVDIMMs are being listed (we tested with
netlist NVDIMMs). I can also see the bank locator showing P0 and P1 which I
think indicates the numa. Here is an example:
Handle 0x002D, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x002A
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: P1-DIMMA3
Bank Locator: P0_Node0_Channel0_Dimm2
Type: DDR4
Type Detail: Synchronous
Speed: 2400 MHz
Manufacturer: Netlist
Serial Number: 66F50006
Asset Tag: P1-DIMMA3_AssetTag (date:16/42)
Part Number: NV3A74SBT20-000
Rank: 1
Configured Clock Speed: 1600 MHz
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Handle 0x003B, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0038
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: P2-DIMME3
Bank Locator: P1_Node1_Channel0_Dimm2
Type: DDR4
Type Detail: Synchronous
Speed: 2400 MHz
Manufacturer: Netlist
Serial Number: 66B50010
Asset Tag: P2-DIMME3_AssetTag (date:16/42)
Part Number: NV3A74SBT20-000
Rank: 1
Configured Clock Speed: 1600 MHz
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Did you encounter such a case? We would appreciate any insight you might
have.
BR
Oren Berman
On 20 October 2017 at 19:22, Ross Zwisler <ross.zwisler(a)linux.intel.com>
wrote:
> On Thu, Oct 19, 2017 at 06:12:24PM +0300, Oren Berman wrote:
> > Hi Ross
> > My name is Oren Berman and I am a senior developer at lightbitslabs.
> > We are working with NVDIMMs but we encountered a problem that the kernel
> > does not seem to detect the numa id per PMEM device.
> > It always reports numa 0 although we have NVDIMM devices on both nodes.
> > We checked that it always returns 0 from sysfs and also from retrieving
> > the device of pmem in the kernel and calling dev_to_node.
> > The result is always 0 for both pmem0 and pmem1.
> > In order to make sure that indeed both numa sockets are used we ran
> > intel's pcm utility. We verified that writing to pmem0 increases socket 0
> > utilization and writing to pmem1 increases socket 1 utilization so the hw
> > works properly.
> > Only the detection seems to be invalid.
> > Did you encounter such a problem?
> > We are using kernel version 4.9 - are you aware of any fix for this issue
> > or workaround that we can use?
> > Are we missing something?
> > Thanks for any help you can give us.
> > BR
> > Oren Berman
>
> Hi Oren,
>
> My first guess is that your platform isn't properly filling out the
> "proximity domain" field in the NFIT SPA table.
>
> See section 5.2.25.2 in ACPI 6.2:
> http://uefi.org/sites/default/files/resources/ACPI_6_2.pdf
>
> Here's how to check that:
>
> # cd /tmp
> # cp /sys/firmware/acpi/tables/NFIT .
> # iasl NFIT
>
> Intel ACPI Component Architecture
> ASL+ Optimizing Compiler version 20160831-64
> Copyright (c) 2000 - 2016 Intel Corporation
>
> Binary file appears to be a valid ACPI table, disassembling
> Input file NFIT, Length 0xE0 (224) bytes
> ACPI: NFIT 0x0000000000000000 0000E0 (v01 BOCHS BXPCNFIT 00000001 BXPC 00000001)
> Acpi Data Table [NFIT] decoded
> Formatted output: NFIT.dsl - 5191 bytes
>
> This will give you an NFIT.dsl file which you can look at. Here is what my
> SPA table looks like for an emulated QEMU NVDIMM:
>
> [028h 0040 2] Subtable Type : 0000 [System Physical Address Range]
> [02Ah 0042 2] Length : 0038
>
> [02Ch 0044 2] Range Index : 0002
> [02Eh 0046 2] Flags (decoded below) : 0003
> Add/Online Operation Only : 1
> Proximity Domain Valid : 1
> [030h 0048 4] Reserved : 00000000
> [034h 0052 4] Proximity Domain : 00000000
> [038h 0056 16] Address Range GUID : 66F0D379-B4F3-4074-AC43-0D3318B78CDB
> [048h 0072 8] Address Range Base : 0000000240000000
> [050h 0080 8] Address Range Length : 0000000440000000
> [058h 0088 8] Memory Map Attribute : 0000000000008008
>
> So, the "Proximity Domain" field is 0, and this lets the system know which
> NUMA node to associate with this memory region.
>
> BTW, in the future it's best to CC our public list, linux-nvdimm(a)lists.01.org,
> as a) someone else might have the same question and b) someone else might know
> the answer.
>
> Thanks,
> - Ross
>
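For reference, the two SPA fields discussed in the quoted reply can be pulled out of a raw subtable with a few lines of C. This is only a sketch based on the offsets visible in the quoted dump (flags at byte 6 of the subtable, proximity domain at byte 12, bit 1 of the flags meaning "Proximity Domain Valid"); spa_proximity_domain(), rd16() and rd32() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPA_FLAG_ADD_ONLINE_ONLY (1u << 0)
#define SPA_FLAG_PROXIMITY_VALID (1u << 1)

/* Read little-endian values from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Given a pointer to the start of an SPA subtable, store the proximity
 * domain in *domain and return true, but only if the flags mark it valid.
 */
static bool spa_proximity_domain(const uint8_t *spa, uint32_t *domain)
{
	uint16_t flags = rd16(spa + 6);

	if (!(flags & SPA_FLAG_PROXIMITY_VALID))
		return false;

	*domain = rd32(spa + 12);
	return true;
}
```

With the QEMU values from the quoted dump (flags 0003, proximity domain 00000000) this reports a valid domain of 0. If the valid bit is clear, the domain value should not be trusted at all.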
2 years, 6 months
[PATCH v2 0/7] Fix DM DAX handling
by Ross Zwisler
Changes from v1:
* Reworked patches 1 and 2 so that the __bdev_dax_supported() function
stays hidden behind the bdev_dax_supported() wrapper. This is needed
to prevent compilation errors in configs where CONFIG_FS_DAX isn't
defined. (0-day)
* Added Eric's Reviewed-by to patch 1. I did this in spite of the
bdev_dax_supported() changes because they were minor and I think
Eric's review was focused on the XFS parts.
---
This series fixes a few issues that I found with DM's handling of DAX
devices. Here are some of the issues I found:
* We can create a dm-stripe or dm-linear device which is made up of an
fsdax PMEM namespace and a raw PMEM namespace but which can hold a
filesystem mounted with the -o dax mount option. DAX operations to
the raw PMEM namespace part lack struct page and can fail in
interesting/unexpected ways when doing things like fork(), examining
memory with gdb, etc.
* We can create a dm-stripe or dm-linear device which is made up of an
fsdax PMEM namespace and a BRD ramdisk which can hold a filesystem
mounted with the -o dax mount option. All I/O to this filesystem
will fail.
* In DM you can't transition a dm target which could possibly support
DAX (mode DM_TYPE_DAX_BIO_BASED) to one which can't support DAX
(mode DM_TYPE_BIO_BASED), even if you never use DAX.
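The first two issues come down to one rule: a stacked device should only advertise DAX if every component device can really do DAX. A minimal userspace sketch of that rule follows. The struct and function names are hypothetical and are not the DM code from this series, and the reason given for the ramdisk case (no direct-access path) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one component device of a DM table. */
struct component_dev {
	const char *name;
	bool has_dax_path;    /* driver offers a direct-access path (assumed) */
	bool has_struct_page; /* true for fsdax namespaces, false for raw ones */
};

/* A single device is DAX capable only if both conditions hold. */
static bool dev_supports_dax(const struct component_dev *dev)
{
	return dev->has_dax_path && dev->has_struct_page;
}

/* The table is DAX capable only if every component device is. */
static bool table_supports_dax(const struct component_dev *devs, size_t n)
{
	if (n == 0)
		return false;

	for (size_t i = 0; i < n; i++)
		if (!dev_supports_dax(&devs[i]))
			return false;
	return true;
}
```

Under this rule a table mixing an fsdax namespace with a raw namespace or a ramdisk is rejected for -o dax mounts, instead of failing later at runtime.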
The first 2 patches in this series are prep work from Darrick and Dave
which improve bdev_dax_supported(). The last 5 patches fix the
above-mentioned problems in DM. I feel that this series simplifies the
handling of DAX devices in DM, and the last 5 DM-related patches have a
net code reduction of 50 lines.
Darrick J. Wong (1):
fs: allow per-device dax status checking for filesystems
Dave Jiang (1):
dax: change bdev_dax_supported() to support boolean returns
Ross Zwisler (5):
dm: fix test for DAX device support
dm: prevent DAX mounts if not supported
dm: remove DM_TYPE_DAX_BIO_BASED dm_queue_mode
dm-snap: remove unnecessary direct_access() stub
dm-error: remove unnecessary direct_access() stub
drivers/dax/super.c | 40 ++++++++++++++++++++--------------------
drivers/md/dm-ioctl.c | 16 ++++++----------
drivers/md/dm-snap.c | 8 --------
drivers/md/dm-table.c | 29 +++++++++++------------------
drivers/md/dm-target.c | 7 -------
drivers/md/dm.c | 7 ++-----
fs/ext2/super.c | 3 +--
fs/ext4/super.c | 3 +--
fs/xfs/xfs_ioctl.c | 3 ++-
fs/xfs/xfs_iops.c | 30 +++++++++++++++++++++++++-----
fs/xfs/xfs_super.c | 10 ++++++++--
include/linux/dax.h | 11 ++++++-----
include/linux/device-mapper.h | 8 ++++++--
13 files changed, 88 insertions(+), 87 deletions(-)
--
2.14.3
2 years, 7 months
[PATCH v3 0/2] Support ACPI 6.1 update in NFIT Control Region Structure
by Toshi Kani
ACPI 6.1, Table 5-133, updates NVDIMM Control Region Structure as
follows.
- Valid Fields, Manufacturing Location, and Manufacturing Date
are added from reserved range. No change in the structure size.
- IDs (SPD values) are stored as arrays of bytes (i.e. big-endian
format). The spec clarifies that they need to be represented
as arrays of bytes as well.
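To illustrate the byte-array point: if an ID is stored as an array of bytes, it has to be read byte by byte in array order, not loaded as a native integer, which would swap the bytes on a little-endian host. The sketch below shows this with hypothetical helpers (id16_from_bytes(), id32_from_bytes(), format_id()); the "vendor-serial" output format is made up for the example and is not necessarily what the new sysfs "id" file prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Interpret a 2-byte array in array order, independent of host endianness. */
static uint16_t id16_from_bytes(const uint8_t b[2])
{
	return (uint16_t)((b[0] << 8) | b[1]);
}

/* Same for a 4-byte array. */
static uint32_t id32_from_bytes(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Hypothetical "vendor-serial" format, for illustration only. */
static int format_id(char *buf, size_t len, const uint8_t vendor[2],
		     const uint8_t serial[4])
{
	return snprintf(buf, len, "%04x-%08x",
			(unsigned int)id16_from_bytes(vendor),
			(unsigned int)id32_from_bytes(serial));
}
```

For example, vendor bytes {0x80, 0x2c} come out as 802c, whereas loading the same two bytes as a native 16-bit integer on a little-endian machine would yield 2c80.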
Patch 1 changes the NFIT driver to comply with ACPI 6.1.
Patch 2 adds a new sysfs file "id" to show NVDIMM ID defined in ACPI 6.1.
The patch-set applies on linux-pm.git acpica.
link: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
---
v3:
- Need to coordinate with ACPICA update (Bob Moore, Dan Williams)
- Integrate with ACPICA changes in struct acpi_nfit_control_region.
(commit 138a95547ab0)
v2:
- Remove 'mfg_location' and 'mfg_date'. (Dan Williams)
- Rename 'unique_id' to 'id' and make this change as a separate patch.
(Dan Williams)
---
Toshi Kani (2):
1/2 acpi/nfit: Update nfit driver to comply with ACPI 6.1
2/2 acpi/nfit: Add sysfs "id" for NVDIMM ID
---
drivers/acpi/nfit.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
2 years, 7 months
[PATCH v11 0/7] dax: fix dma vs truncate/hole-punch
by Dan Williams
Changes since v9 [1] and v10 [2]
* Resend the full series with the reworked "mm: introduce
MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS" (Christoph)
* Move generic_dax_pagefree() into the pmem driver (Christoph)
* Cleanup __bdev_dax_supported() (Christoph)
* Cleanup some stale SRCU bits leftover from other iterations (Jan)
* Cleanup xfs_break_layouts() (Jan)
[1]: https://lists.01.org/pipermail/linux-nvdimm/2018-April/015457.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2018-May/015885.html
---
Background:
get_user_pages() in the filesystem pins file backed memory pages for
access by devices performing dma. However, it only pins the memory pages
not the page-to-file offset association. If a file is truncated the
pages are mapped out of the file and dma may continue indefinitely into
a page that is owned by a device driver. This breaks coherency of the
file vs dma, but the assumption is that if userspace wants the
file-space truncated it does not matter what data is inbound from the
device, it is not relevant anymore. The only expectation is that dma can
safely continue while the filesystem reallocates the block(s).
Problem:
This expectation that dma can safely continue while the filesystem
changes the block map is broken by dax. With dax the target dma page
*is* the filesystem block. The model of leaving the page pinned for dma,
but truncating the file block out of the file, means that the filesystem
is free to reallocate a block under active dma to another file and now
the expected data-incoherency situation has turned into active
data-corruption.
Solution:
Defer all filesystem operations (fallocate(), truncate()) on a dax mode
file while any page/block in the file is under active dma. This solution
assumes that dma is transient. Cases where dma operations are known to
not be transient, like RDMA, have been explicitly disabled via
commits like 5f1d43de5416 "IB/core: disable memory registration of
filesystem-dax vmas".
The dax_layout_busy_page() routine is called by filesystems with a lock
held against mm faults (i_mmap_lock) to find pinned / busy dax pages.
The process of looking up a busy page invalidates all mappings
to trigger any subsequent get_user_pages() to block on i_mmap_lock.
The filesystem continues to call dax_layout_busy_page() until it finally
returns no more active pages. This approach assumes that the page
pinning is transient; if that assumption is violated the system would
have likely hung from the uncompleted I/O.
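The retry pattern described above can be sketched in userspace as follows. Everything here is a mock: struct mock_mapping, mock_layout_busy_page(), mock_wait_for_dma() and mock_break_dax_layouts() are hypothetical names that only mirror the control flow (look up a busy page, wait, look again), not the locking or the real dax_layout_busy_page() interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a file mapping with some pages pinned for dma. */
struct mock_mapping {
	int pinned[8];   /* pin count per page; 0 means idle */
	size_t nr_pages; /* number of valid entries in pinned[] */
};

/* Mock lookup: index of the first pinned page, or -1 if none is busy. */
static int mock_layout_busy_page(const struct mock_mapping *m)
{
	for (size_t i = 0; i < m->nr_pages; i++)
		if (m->pinned[i] > 0)
			return (int)i;
	return -1;
}

/* Mock wait: pretend one dma on the page completes and drops its pin. */
static void mock_wait_for_dma(struct mock_mapping *m, int page)
{
	m->pinned[page]--;
}

/*
 * Keep looking up busy pages until none remain, then let the layout change
 * proceed. Returns how many waits were needed.
 */
static int mock_break_dax_layouts(struct mock_mapping *m)
{
	int waits = 0;
	int page;

	while ((page = mock_layout_busy_page(m)) >= 0) {
		mock_wait_for_dma(m, page);
		waits++;
	}
	return waits;
}
```

The loop only terminates because every pin is eventually dropped, which is the "dma is transient" assumption stated above; a pin that is never released would spin here forever.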
---
Dan Williams (7):
memremap: split devm_memremap_pages() and memremap() infrastructure
mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS
mm: fix __gup_device_huge vs unmap
mm, fs, dax: handle layout changes to pinned dax mappings
xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL
xfs: prepare xfs_break_layouts() for another layout type
xfs, dax: introduce xfs_break_dax_layouts()
drivers/dax/super.c | 14 ++-
drivers/nvdimm/pfn_devs.c | 2
drivers/nvdimm/pmem.c | 25 +++++
fs/Kconfig | 1
fs/dax.c | 97 +++++++++++++++++++++
fs/xfs/xfs_file.c | 72 ++++++++++++++--
fs/xfs/xfs_inode.h | 16 +++
fs/xfs/xfs_ioctl.c | 8 --
fs/xfs/xfs_iops.c | 16 ++-
fs/xfs/xfs_pnfs.c | 15 ++-
fs/xfs/xfs_pnfs.h | 5 +
include/linux/dax.h | 7 ++
include/linux/memremap.h | 36 ++------
include/linux/mm.h | 71 +++++++++++----
kernel/Makefile | 3 -
kernel/iomem.c | 167 ++++++++++++++++++++++++++++++++++++
kernel/memremap.c | 209 ++++++---------------------------------------
mm/Kconfig | 5 +
mm/gup.c | 36 ++++++--
mm/hmm.c | 13 ---
mm/swap.c | 3 -
21 files changed, 542 insertions(+), 279 deletions(-)
create mode 100644 kernel/iomem.c
2 years, 7 months
[PATCH v6 0/4] ndctl, monitor: add ndctl monitor daemon
by QI Fuli
This is the v6 patch for ndctl monitor daemon, a tiny daemon to monitor
the smart events of nvdimm DIMMs. Users can run a monitor as a one-shot
command or a daemon in background by using the [--daemon] option. DIMMs
to monitor can be selected by [--dimm] [--bus] [--region] [--namespace]
options; these options support multiple space-separated arguments.
When a smart event fires, the monitor daemon logs the notifications,
which include the dimm health status, to syslog or a logfile selected with
the [--logfile=<file|syslog>] option. The monitor can also output the
notifications to stderr when it runs as a one-shot command by setting
[--logfile=<stderr>]. The notifications follow json format and can be
consumed by log collectors like Fluentd. Users can change the
configuration of monitor by editing the default configuration file
/etc/ndctl/monitor.conf or by using [--config-file=<file>] option to
override the default configuration.
Users can start a monitor daemon by the following command:
# ndctl monitor --daemon --logfile /var/log/ndctl/monitor.log
Also, a monitor daemon can be started by systemd:
# systemctl start ndctl-monitor.service
In this case, monitor daemon follows the default configuration file
/etc/ndctl/monitor.conf.
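The change log notes that event waiting moved from select() to epoll(). As a generic illustration of that mechanism (not the code in ndctl/monitor.c), the sketch below waits for a single file descriptor to become readable using epoll; wait_readable() is a hypothetical helper name.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, int timeout_ms)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
	struct epoll_event out;
	int epfd, rc;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(epfd);
		return -1;
	}

	rc = epoll_wait(epfd, &out, 1, timeout_ms);
	close(epfd);
	if (rc < 0)
		return -1;
	return rc > 0 ? 1 : 0;
}
```

A long-running daemon would normally create the epoll instance once, register all descriptors of interest, and call epoll_wait() in a loop; the one-shot form here just keeps the example short.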
Signed-off-by: QI Fuli <qi.fuli(a)jp.fujitsu.com>
---
Change log since v5:
- Fixing the bug where the systemd unit file could not be installed
- Adding license to ./util/abspath.c
Change log since v4:
- Adding OPTION_FILENAME to make sure filename is correct
- Adding configuration file
- Adding [--config-file] option to override the default configuration
- Making some options support multiple space-separated arguments
- Making systemctl enable ndctl-monitor.service command work
- Making systemctl restart ndctl-monitor.service command work
- Making the directory of systemd unit file to be configurable
- Changing log_file() and log_syslog() to logreport()
- Changing date format in notification to nanoseconds since epoch
- Changing select() to epoll()
- Adding filter_bus() and filter_region()
Change log since v3:
- Removing create-monitor, show-monitor, list-monitor, destroy-monitor
- Adding [--daemon] option to run ndctl monitor as a daemon
- Using systemd to manage ndctl monitor daemon
- Replacing filter_monitor_dimm() with filter_dimm()
Change log since v2:
- Changing the interface of daemon to the ndctl command line
- Changing the name of daemon from "nvdimmd" to "monitor"
- Removing the config file, unit_file, nvdimmd dir
- Removing nvdimmd_test program
- Adding ndctl/monitor.c
Change log since v1:
- Adding a config file(/etc/nvdimmd/nvdimmd.conf)
- Using struct log_ctx instead of syslog()
- Using log_syslog() to save the notify messages to syslog
- Using log_file() to save the notify messages to special file
- Adding LOG_NOTICE level to log_priority
- Using automake instead of Makefile
- Adding a new util file(nvdimmd/util.c) including helper functions
needed for nvdimm daemon
- Adding nvdimmd_test program
QI Fuli (4):
ndctl, util: add OPTION_FILENAME to parse_opt_type
ndctl, monitor: add ndctl monitor daemon
ndctl, monitor: add default configuration file
ndctl, monitor: add the unit file of systemd for ndctl-monitor service
Makefile.am | 3 +-
autogen.sh | 3 +-
builtin.h | 1 +
configure.ac | 22 ++
ndctl/Makefile.am | 12 +-
ndctl/monitor.c | 460 ++++++++++++++++++++++++++++++++++++
ndctl/monitor.conf | 37 +++
ndctl/ndctl-monitor.service | 7 +
ndctl/ndctl.c | 1 +
util/abspath.c | 29 +++
util/help.c | 5 -
util/parse-options.c | 47 +++-
util/parse-options.h | 11 +-
util/util.h | 7 +
14 files changed, 629 insertions(+), 16 deletions(-)
create mode 100644 ndctl/monitor.c
create mode 100644 ndctl/monitor.conf
create mode 100644 ndctl/ndctl-monitor.service
create mode 100644 util/abspath.c
--
2.17.0.140.g0b0cc9f86
2 years, 7 months
[qemu PATCH v4 0/4] support NFIT platform capabilities
by Ross Zwisler
Changes since v3:
* Updated the text in docs/nvdimm.txt to make it clear that the value
being passed in on the command line is an integer made up of various
bit fields. (Rob Elliott)
* Updated the "Highest Valid Capability" byte to be dynamic based on
the highest valid bit in the user's input. (Rob Elliott)
---
The first 2 patches in this series clean up some things I noticed while
coding.
Patch 3 adds support for the new Platform Capabilities Structure, which
was added to the NFIT in ACPI 6.2 Errata A. We add a machine command
line option "nvdimm-cap":
-machine pc,accel=kvm,nvdimm,nvdimm-cap=2
which allows the user to pass in a value for this structure. When such
a value is passed in we will generate the new NFIT subtable.
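Since the nvdimm-cap value is an integer made up of bit fields, the "Highest Valid Capability" byte mentioned in the v4 changes can be thought of as the index of the highest set bit in that value. The sketch below shows one way to compute it; highest_valid_capability() is a hypothetical name and this is not the QEMU implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the index of the highest set bit in caps, or -1 if no bit is set.
 * For example, caps = 2 has only bit 1 set, so the result is 1.
 */
static int highest_valid_capability(uint32_t caps)
{
	int bit = -1;

	while (caps) {
		bit++;
		caps >>= 1;
	}
	return bit;
}
```

With nvdimm-cap=2 as in the command line above, only bit 1 is set, so the highest valid capability would be 1; with nvdimm-cap=3 both bit 0 and bit 1 are set and the result is still 1.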
Patch 4 adds code to the "make check" self test infrastructure so that
we generate the new Platform Capabilities Structure, and adds it to the
expected NFIT output so that we test for it.
Ross Zwisler (4):
nvdimm: fix typo in label-size definition
tests/.gitignore: add entry for generated file
nvdimm, acpi: support NFIT platform capabilities
ACPI testing: test NFIT platform capabilities
docs/nvdimm.txt | 27 ++++++++++++++++++++
hw/acpi/nvdimm.c | 45 +++++++++++++++++++++++++++++++---
hw/i386/pc.c | 31 +++++++++++++++++++++++
hw/mem/nvdimm.c | 2 +-
include/hw/i386/pc.h | 1 +
include/hw/mem/nvdimm.h | 7 +++++-
tests/.gitignore | 1 +
tests/acpi-test-data/pc/NFIT.dimmpxm | Bin 224 -> 240 bytes
tests/acpi-test-data/q35/NFIT.dimmpxm | Bin 224 -> 240 bytes
tests/bios-tables-test.c | 2 +-
10 files changed, 109 insertions(+), 7 deletions(-)
--
2.14.3
2 years, 7 months
Question about Experimental of Filesystem DAX.
by Yasunori Goto
Hello,
I would like to ask about the EXPERIMENTAL message printed for Filesystem DAX.
--------------------------------------------------------
DAX enabled. Warning: EXPERIMENTAL, use at your own risk
--------------------------------------------------------
AFAIK, the last remaining issue of Filesystem DAX is the metadata update problem,
and it is (will be?) solved by the great effort of MAP_SYNC and the
"fix dma vs truncate/hole-punch" patch set.
So, I suppose that the Experimental message can be removed,
but I'm not sure.
Is it possible?
Otherwise, are there any other remaining issues in Filesystem DAX?
If this is a silly question, sorry for the noise....
Thanks,
---
Yasunori Goto
2 years, 7 months
[ndctl PATCH v2] ndctl, list: display the 'map' location in listings
by Vishal Verma
For 'fsdax' and 'devdax' namespaces, a 'map' location may be specified
for page structures storage. This can be 'mem', for system RAM, or 'dev'
for using pmem as the backing storage. Once set, there was no way of
telling, using ndctl, which of the two locations a namespace was
configured for. Add this in util_namespace_to_json so that all
namespace listings contain the map location.
Reported-by: "Yigal Korman" <yigal.korman(a)netapp.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
util/json.c | 32 +++++++++++++++++++++++++++-----
1 file changed, 27 insertions(+), 5 deletions(-)
v2: Also account for memmap=ss!nn or legacy-e820 namespaces. (Dan)
diff --git a/util/json.c b/util/json.c
index c606e1c..b020300 100644
--- a/util/json.c
+++ b/util/json.c
@@ -667,11 +667,17 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
{
struct json_object *jndns = json_object_new_object();
struct json_object *jobj, *jbbs = NULL;
+ const char *locations[] = {
+ [NDCTL_PFN_LOC_NONE] = "none",
+ [NDCTL_PFN_LOC_RAM] = "mem",
+ [NDCTL_PFN_LOC_PMEM] = "dev",
+ };
unsigned long long size = ULLONG_MAX;
unsigned int sector_size = UINT_MAX;
enum ndctl_namespace_mode mode;
const char *bdev = NULL, *name;
unsigned int bb_count = 0;
+ enum ndctl_pfn_loc loc;
struct ndctl_btt *btt;
struct ndctl_pfn *pfn;
struct ndctl_dax *dax;
@@ -693,33 +699,49 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
mode = ndctl_namespace_get_mode(ndns);
switch (mode) {
case NDCTL_NS_MODE_MEMORY:
- if (pfn) /* dynamic memory mode */
+ jobj = json_object_new_string("fsdax");
+ if (jobj)
+ json_object_object_add(jndns, "mode", jobj);
+ loc = ndctl_pfn_get_location(pfn);
+ if (pfn) { /* dynamic memory mode */
size = ndctl_pfn_get_size(pfn);
- else /* native/static memory mode */
+ jobj = json_object_new_string(locations[loc]);
+ } else { /* native/static memory mode */
size = ndctl_namespace_get_size(ndns);
- jobj = json_object_new_string("fsdax");
+ jobj = json_object_new_string("mem");
+ }
+ if (jobj)
+ json_object_object_add(jndns, "map", jobj);
break;
case NDCTL_NS_MODE_DAX:
if (!dax)
goto err;
size = ndctl_dax_get_size(dax);
jobj = json_object_new_string("devdax");
+ if (jobj)
+ json_object_object_add(jndns, "mode", jobj);
+ loc = ndctl_dax_get_location(dax);
+ jobj = json_object_new_string(locations[loc]);
+ if (jobj)
+ json_object_object_add(jndns, "map", jobj);
break;
case NDCTL_NS_MODE_SAFE:
if (!btt)
goto err;
jobj = json_object_new_string("sector");
+ if (jobj)
+ json_object_object_add(jndns, "mode", jobj);
size = ndctl_btt_get_size(btt);
break;
case NDCTL_NS_MODE_RAW:
size = ndctl_namespace_get_size(ndns);
jobj = json_object_new_string("raw");
+ if (jobj)
+ json_object_object_add(jndns, "mode", jobj);
break;
default:
jobj = NULL;
}
- if (jobj)
- json_object_object_add(jndns, "mode", jobj);
if (size < ULLONG_MAX) {
jobj = util_json_object_size(size, flags);
--
2.17.0
2 years, 7 months