[PATCH v3 0/2] Support ACPI 6.1 update in NFIT Control Region Structure
by Toshi Kani
ACPI 6.1, Table 5-133, updates NVDIMM Control Region Structure as
follows.
- Valid Fields, Manufacturing Location, and Manufacturing Date
are added from reserved range. No change in the structure size.
- IDs (SPD values) are stored as arrays of bytes (i.e. big-endian
format). The spec clarifies that they need to be represented
as arrays of bytes as well.
Patch 1 changes the NFIT driver to comply with ACPI 6.1.
Patch 2 adds a new sysfs file "id" to show NVDIMM ID defined in ACPI 6.1.
The patch-set applies on linux-pm.git acpica.
link: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
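
As a rough illustration of the "arrays of bytes" point above, the sketch below decodes ID fields that are stored most-significant byte first. The helper names (id_be16, id_be32, format_id) and the "vendor-serial" output layout are assumptions made for this example; they are not taken from the patches themselves.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode a 2-byte array stored most-significant byte first. */
uint16_t id_be16(const uint8_t b[2])
{
	return (uint16_t)((b[0] << 8) | b[1]);
}

/* Decode a 4-byte array stored most-significant byte first. */
uint32_t id_be32(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/*
 * Hypothetical formatter: prints "vendor-serial" in hex.
 * The layout is assumed for illustration only.
 * Returns the number of characters written, as snprintf() does.
 */
int format_id(char *buf, size_t len, const uint8_t vendor[2],
	      const uint8_t serial[4])
{
	return snprintf(buf, len, "%04x-%08x",
			(unsigned int)id_be16(vendor),
			(unsigned int)id_be32(serial));
}
```

Because the fields are decoded byte by byte, the result does not depend on the host CPU's endianness, which is the practical benefit of treating the IDs as byte arrays rather than as native integers.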
---
v3:
- Need to coordinate with ACPICA update (Bob Moore, Dan Williams)
- Integrate with ACPICA changes in struct acpi_nfit_control_region.
(commit 138a95547ab0)
v2:
- Remove 'mfg_location' and 'mfg_date'. (Dan Williams)
- Rename 'unique_id' to 'id' and make this change as a separate patch.
(Dan Williams)
---
Toshi Kani (2):
1/2 acpi/nfit: Update nfit driver to comply with ACPI 6.1
2/2 acpi/nfit: Add sysfs "id" for NVDIMM ID
---
drivers/acpi/nfit.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
3 years, 11 months
Enabling peer to peer device transactions for PCIe devices
by Deucher, Alexander
This is certainly not the first time this has been brought up, but I'd like to try and get some consensus on the best way to move this forward. Allowing devices to talk directly improves performance and reduces latency by avoiding the use of staging buffers in system memory. Also in cases where both devices are behind a switch, it avoids the CPU entirely. Most current APIs (DirectGMA, PeerDirect, CUDA, HSA) that deal with this are pointer based. Ideally we'd be able to take a CPU virtual address and be able to get to a physical address taking into account IOMMUs, etc. Having struct pages for the memory would allow it to work more generally and wouldn't require as much explicit support in drivers that wanted to use it.
Some use cases:
1. Storage devices streaming directly to GPU device memory
2. GPU device memory to GPU device memory streaming
3. DVB/V4L/SDI devices streaming directly to GPU device memory
4. DVB/V4L/SDI devices streaming directly to storage devices
Here is a relatively simple example of how this could work for testing. This is obviously not a complete solution.
- Device memory will be registered with the Linux memory sub-system by creating corresponding struct page structures for device memory
- get_user_pages_fast() will return corresponding struct pages when CPU address points to the device memory
- put_page() will deal with struct pages for device memory
Previously proposed solutions and related proposals:
1. P2P DMA
DMA-API/PCI map_peer_resource support for peer-to-peer (http://www.spinics.net/lists/linux-pci/msg44560.html)
Pros: Low impact, already largely reviewed.
Cons: requires explicit support in all drivers that want to support it, doesn't handle S/G in device memory.
2. ZONE_DEVICE IO
Direct I/O and DMA for persistent memory (https://lwn.net/Articles/672457/)
Add support for ZONE_DEVICE IO memory with struct pages. (https://patchwork.kernel.org/patch/8583221/)
Pros: Doesn't waste system memory for ZONE metadata
Cons: CPU access to ZONE metadata is slow; metadata may be lost or corrupted on device reset.
3. DMA-BUF
RDMA subsystem DMA-BUF support (http://www.spinics.net/lists/linux-rdma/msg38748.html)
Pros: uses existing dma-buf interface
Cons: dma-buf is handle based, requires explicit dma-buf support in drivers.
4. iopmem
iopmem : A block device for PCIe memory (https://lwn.net/Articles/703895/)
5. HMM
Heterogeneous Memory Management (http://lkml.iu.edu/hypermail/linux/kernel/1611.2/02473.html)
6. Some new mmap-like interface that takes a userptr and a length and returns a dma-buf and offset?
Alex
4 years, 6 months
[RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
by Logan Gunthorpe
Hello,
As discussed at LSF/MM we'd like to present our work to enable
copy offload support in NVMe fabrics RDMA targets. We'd appreciate
some review and feedback from the community on our direction.
This series is not intended to go upstream at this point.
The concept here is to use memory that's exposed on a PCI BAR as
data buffers in the NVME target code such that data can be transferred
from an RDMA NIC to the special memory and then directly to an NVMe
device avoiding system memory entirely. The upside of this is better
QoS for applications running on the CPU utilizing memory and lower
PCI bandwidth required to the CPU (such that systems could be designed
with fewer lanes connected to the CPU). However, the trade-off is
currently a reduction in overall throughput (largely due to hardware
issues that would certainly improve in the future).
Due to these trade-offs we've designed the system to only enable using
the PCI memory in cases where the NIC, NVMe devices and memory are all
behind the same PCI switch. This will mean many setups that could likely
work well will not be supported so that we can be more confident it
will work and not place any responsibility on the user to understand
their topology. (We've chosen to go this route based on feedback we
received at LSF).
In order to enable this functionality we introduce a new p2pmem device
which can be instantiated by PCI drivers. The device will register some
PCI memory as ZONE_DEVICE and provide a genalloc-based allocator for
users of these devices to get buffers. We give an example of enabling
p2p memory with the cxgb4 driver, however currently these devices have
some hardware issues that prevent their use so we will likely be
dropping this patch in the future. Ideally, we'd want to enable this
functionality with NVME CMB buffers, however we don't have any hardware
with this feature at this time.
In nvmet-rdma, we attempt to get an appropriate p2pmem device at
queue creation time and if a suitable one is found we will use it for
all the (non-inlined) memory in the queue. An 'allow_p2pmem' configfs
attribute is also created which is required to be set before any p2pmem
is attempted.
This patchset also includes a more controversial patch which provides an
interface for userspace to obtain p2pmem buffers through an mmap call on
a cdev. This enables userspace to fairly easily use p2pmem with RDMA and
O_DIRECT interfaces. However, the user would be entirely responsible for
knowing what they're doing and inspecting sysfs to understand the PCI
topology and only using it in sane situations.
Thanks,
Logan
Logan Gunthorpe (6):
Introduce Peer-to-Peer memory (p2pmem) device
nvmet: Use p2pmem in nvme target
scatterlist: Modify SG copy functions to support io memory.
nvmet: Be careful about using iomem accesses when dealing with p2pmem
p2pmem: Support device removal
p2pmem: Added char device user interface
Steve Wise (2):
cxgb4: setup pcie memory window 4 and create p2pmem region
p2pmem: Add debugfs "stats" file
drivers/memory/Kconfig | 5 +
drivers/memory/Makefile | 2 +
drivers/memory/p2pmem.c | 697 ++++++++++++++++++++++++
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 3 +
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 97 +++-
drivers/net/ethernet/chelsio/cxgb4/t4_regs.h | 5 +
drivers/nvme/target/configfs.c | 31 ++
drivers/nvme/target/core.c | 18 +-
drivers/nvme/target/fabrics-cmd.c | 28 +-
drivers/nvme/target/nvmet.h | 2 +
drivers/nvme/target/rdma.c | 183 +++++--
drivers/scsi/scsi_debug.c | 7 +-
include/linux/p2pmem.h | 120 ++++
include/linux/scatterlist.h | 7 +-
lib/scatterlist.c | 64 ++-
15 files changed, 1189 insertions(+), 80 deletions(-)
create mode 100644 drivers/memory/p2pmem.c
create mode 100644 include/linux/p2pmem.h
--
2.1.4
5 years
[RFC PATCH] Report the Health Status Detail for the HPE1 DSM family
by Linda Knippers
Dan,
This is an RFC because I'd like some initial feedback on the
approach. I think this is what you had in mind from your last
exchanges with Brian but I wanted to check a few things before
going too far.
1) Do we want to export a library function for what could be a long
list of DSM-family-specific health information? I think there
could be some common information between the HPE1 and MSFT DSM
but much will not be common.
2) If we do export the functions, would we need to also export
the ndctl-hpe1.h include file or consolidate the information into
an already exported file?
3) Do you want json-smart.c to keep growing or should new smart
functions provide their own matching json functions?
4) The code in json-smart.c with a macro was a quick prototype
but if you have feedback on the json parts, that would be appreciated.
Right now the detail is reported as a string if all is well and an
array if there are errors. I'm not sure about that or whether
the strings should have spaces.
Anyway, here's the patch ...
This patch adds a new interface to provide Health Status Detail.
This field is reported as part of the Smart Health with the HPE1
DSM family so the function for the Intel family is NULL. If
the field is available, the ndctl --health option will decode
the bits that make up the field.
On a healthy device, the output would look something like:
{
"dev":"nmem0",
"id":"802c-01-1521-b300bdbc",
"health":{
"health_state":"ok",
"temperature_celsius":25.000000,
"spares_percentage":99,
"alarm_temperature":false,
"alarm_spares":false,
"temperature_threshold":50.000000,
"spares_threshold":20,
"life_used_percentage":2,
"shutdown_state":"clean",
"health_status_detail":"ok"
}
}
A device with every possible error could look like this:
{
"dev":"nmem0",
"id":"802c-01-1521-b300bdbc",
"health":{
"health_state":"ok",
"temperature_celsius":25.000000,
"spares_percentage":99,
"alarm_temperature":false,
"alarm_spares":false,
"temperature_threshold":50.000000,
"spares_threshold":20,
"life_used_percentage":2,
"shutdown_state":"clean",
"health_status_detail":[
"energy source error",
"controller error",
"UC ECC error",
"CE trip",
"save error",
"restore error",
"arm error",
"erase error",
"configuration error",
"firmware error",
"vendor specific error"
]
}
}
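
Regarding question 4 above, one alternative to the macro would be a table-driven decode. The sketch below is only an illustration of that idea: the bit positions are hypothetical placeholders (the real NDN_HPE1_SMART_* values live in ndctl-hpe1.h and are not reproduced here), only four of the eleven strings are listed, and detail_to_strings() is a made-up name.

```c
#include <stddef.h>

/* One table entry per status bit and its human-readable name. */
struct detail_name {
	unsigned int bit;
	const char *name;
};

/*
 * Hypothetical bit assignments, for illustration only.
 * The real masks are the NDN_HPE1_SMART_* definitions.
 */
static const struct detail_name detail_names[] = {
	{ 1u << 0, "energy source error" },
	{ 1u << 1, "controller error" },
	{ 1u << 2, "UC ECC error" },
	{ 1u << 3, "CE trip" },
};

/*
 * Store the names of all set bits in out[], in table order.
 * Returns the number of names stored (never more than max).
 */
size_t detail_to_strings(unsigned int detail, const char **out, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(detail_names) / sizeof(detail_names[0]); i++) {
		if ((detail & detail_names[i].bit) && n < max)
			out[n++] = detail_names[i].name;
	}
	return n;
}
```

With a table like this, the JSON code would loop over the returned names and add each one to the array, and a return value of zero would map to the "ok" string.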
---
ndctl/lib/libndctl-hpe1.c | 12 ++++++++++++
ndctl/lib/libndctl-private.h | 1 +
ndctl/lib/libndctl-smart.c | 2 ++
ndctl/lib/libndctl.sym | 1 +
ndctl/libndctl.h.in | 5 +++++
ndctl/ndctl.h | 1 +
ndctl/util/json-smart.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 68 insertions(+)
diff --git a/ndctl/lib/libndctl-hpe1.c b/ndctl/lib/libndctl-hpe1.c
index ec54252..23b76a4 100644
--- a/ndctl/lib/libndctl-hpe1.c
+++ b/ndctl/lib/libndctl-hpe1.c
@@ -63,6 +63,7 @@ static struct ndctl_cmd *hpe1_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
hpe1->u.smart.in_valid_flags |= NDN_HPE1_SMART_USED_VALID;
hpe1->u.smart.in_valid_flags |= NDN_HPE1_SMART_SHUTDOWN_VALID;
hpe1->u.smart.in_valid_flags |= NDN_HPE1_SMART_VENDOR_VALID;
+ hpe1->u.smart.in_valid_flags |= NDN_HPE1_SMART_DETAIL_VALID;
cmd->firmware_status = &hpe1->u.smart.status;
@@ -104,6 +105,8 @@ static unsigned int hpe1_cmd_smart_get_flags(struct ndctl_cmd *cmd)
flags |= ND_SMART_SHUTDOWN_VALID;
if (hpe1flags & NDN_HPE1_SMART_VENDOR_VALID)
flags |= ND_SMART_VENDOR_VALID;
+ if (hpe1flags & NDN_HPE1_SMART_DETAIL_VALID)
+ flags |= ND_SMART_DETAIL_VALID;
return flags;
}
@@ -282,6 +285,14 @@ static unsigned int hpe1_cmd_smart_threshold_get_spares(struct ndctl_cmd *cmd)
return CMD_HPE1_SMART_THRESH(cmd)->spare_block_threshold;
}
+static unsigned int hpe1_cmd_smart_get_detail(struct ndctl_cmd *cmd)
+{
+ if (hpe1_smart_valid(cmd) < 0)
+ return UINT_MAX;
+
+ return CMD_HPE1_SMART(cmd)->mod_hlth_stat;
+}
+
struct ndctl_smart_ops * const hpe1_smart_ops = &(struct ndctl_smart_ops) {
.new_smart = hpe1_dimm_cmd_new_smart,
@@ -298,4 +309,5 @@ struct ndctl_smart_ops * const hpe1_smart_ops = &(struct ndctl_smart_ops) {
.smart_threshold_get_alarm_control = hpe1_cmd_smart_threshold_get_alarm_control,
.smart_threshold_get_temperature = hpe1_cmd_smart_threshold_get_temperature,
.smart_threshold_get_spares = hpe1_cmd_smart_threshold_get_spares,
+ .smart_get_detail = hpe1_cmd_smart_get_detail,
};
diff --git a/ndctl/lib/libndctl-private.h b/ndctl/lib/libndctl-private.h
index 3e67db0..e379e7d 100644
--- a/ndctl/lib/libndctl-private.h
+++ b/ndctl/lib/libndctl-private.h
@@ -221,6 +221,7 @@ struct ndctl_smart_ops {
unsigned int (*smart_threshold_get_alarm_control)(struct ndctl_cmd *);
unsigned int (*smart_threshold_get_temperature)(struct ndctl_cmd *);
unsigned int (*smart_threshold_get_spares)(struct ndctl_cmd *);
+ unsigned int (*smart_get_detail)(struct ndctl_cmd *);
};
#if HAS_SMART == 1
diff --git a/ndctl/lib/libndctl-smart.c b/ndctl/lib/libndctl-smart.c
index 73a49ef..890fa47 100644
--- a/ndctl/lib/libndctl-smart.c
+++ b/ndctl/lib/libndctl-smart.c
@@ -63,6 +63,7 @@ smart_cmd_op(ndctl_cmd_smart_get_vendor_data, smart_get_vendor_data, unsigned ch
smart_cmd_op(ndctl_cmd_smart_threshold_get_alarm_control, smart_threshold_get_alarm_control, unsigned int, 0)
smart_cmd_op(ndctl_cmd_smart_threshold_get_temperature, smart_threshold_get_temperature, unsigned int, 0)
smart_cmd_op(ndctl_cmd_smart_threshold_get_spares, smart_threshold_get_spares, unsigned int, 0)
+smart_cmd_op(ndctl_cmd_smart_get_detail, smart_get_detail, unsigned int, 0)
/*
* The following intel_dimm_*() and intel_smart_*() functions implement
@@ -202,4 +203,5 @@ struct ndctl_smart_ops * const intel_smart_ops = &(struct ndctl_smart_ops) {
.smart_threshold_get_alarm_control = intel_cmd_smart_threshold_get_alarm_control,
.smart_threshold_get_temperature = intel_cmd_smart_threshold_get_temperature,
.smart_threshold_get_spares = intel_cmd_smart_threshold_get_spares,
+ .smart_get_detail = NULL,
};
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index be2e368..d3a55f4 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -110,6 +110,7 @@ global:
ndctl_cmd_smart_threshold_get_alarm_control;
ndctl_cmd_smart_threshold_get_temperature;
ndctl_cmd_smart_threshold_get_spares;
+ ndctl_cmd_smart_get_detail;
ndctl_dimm_zero_labels;
ndctl_dimm_get_available_labels;
ndctl_region_get_first;
diff --git a/ndctl/libndctl.h.in b/ndctl/libndctl.h.in
index c27581d..d215c48 100644
--- a/ndctl/libndctl.h.in
+++ b/ndctl/libndctl.h.in
@@ -280,6 +280,7 @@ struct ndctl_cmd *ndctl_dimm_cmd_new_smart_threshold(struct ndctl_dimm *dimm);
unsigned int ndctl_cmd_smart_threshold_get_alarm_control(struct ndctl_cmd *cmd);
unsigned int ndctl_cmd_smart_threshold_get_temperature(struct ndctl_cmd *cmd);
unsigned int ndctl_cmd_smart_threshold_get_spares(struct ndctl_cmd *cmd);
+unsigned int ndctl_cmd_smart_get_detail(struct ndctl_cmd *cmd);
#else
static inline struct ndctl_cmd *ndctl_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
{
@@ -341,6 +342,10 @@ static inline unsigned int ndctl_cmd_smart_threshold_get_spares(
{
return 0;
}
+static inline unsigned int ndctl_cmd_smart_get_detail(struct ndctl_cmd *cmd)
+{
+ return 0;
+}
#endif
struct ndctl_cmd *ndctl_dimm_cmd_new_vendor_specific(struct ndctl_dimm *dimm,
diff --git a/ndctl/ndctl.h b/ndctl/ndctl.h
index 3b1d703..0bdf96f 100644
--- a/ndctl/ndctl.h
+++ b/ndctl/ndctl.h
@@ -28,6 +28,7 @@ struct nd_cmd_smart {
#define ND_SMART_ALARM_VALID (1 << 9)
#define ND_SMART_SHUTDOWN_VALID (1 << 10)
#define ND_SMART_VENDOR_VALID (1 << 11)
+#define ND_SMART_DETAIL_VALID (1 << 13)
#define ND_SMART_SPARE_TRIP (1 << 0)
#define ND_SMART_TEMP_TRIP (1 << 1)
#define ND_SMART_CTEMP_TRIP (1 << 2)
diff --git a/ndctl/util/json-smart.c b/ndctl/util/json-smart.c
index 94519da..304a66a 100644
--- a/ndctl/util/json-smart.c
+++ b/ndctl/util/json-smart.c
@@ -10,6 +10,7 @@
#else
#include <ndctl.h>
#endif
+#include "lib/ndctl-hpe1.h"
static double parse_smart_temperature(unsigned int temp)
{
@@ -151,6 +152,51 @@ struct json_object *util_dimm_health_to_json(struct ndctl_dimm *dimm)
json_object_object_add(jhealth, "shutdown_state", jobj);
}
+#define json_detail(jobj,jstring,detail,bit,string) \
+{ \
+ if (detail & bit) { \
+ jstring = json_object_new_string(string); \
+ if (jstring) \
+ json_object_array_add(jobj,jstring); \
+ } \
+}
+
+ if (flags & ND_SMART_DETAIL_VALID) {
+ unsigned int detail = ndctl_cmd_smart_get_detail(cmd);
+ if (detail) {
+ jobj = json_object_new_array();
+ json_object *jstring = NULL;
+
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_ES_FAILURE, "energy source error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_CTLR_FAILURE, "controller error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_UE_TRIP, "UC ECC error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_CE_TRIP, "CE trip")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_SAVE_FAILED, "save error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_RESTORE_FAILED, "restore error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_ARM_FAILED, "arm error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_ERASE_FAILED, "erase error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_CONFIG_ERROR, "configuration error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_FW_ERROR, "firmware error")
+ json_detail(jobj, jstring, detail,
+ NDN_HPE1_SMART_VENDOR_ERROR, "vendor specific error")
+ }
+ else
+ jobj = json_object_new_string("ok");
+ if (jobj)
+ json_object_object_add(jhealth, "health_status_detail",
+ jobj);
+ }
+
ndctl_cmd_unref(cmd);
return jhealth;
err:
--
1.8.3.1
5 years, 1 month
[PATCH v4 1/5] libnvdimm: Add mechanism to publish badblocks at nd region level
by Dave Jiang
A badblocks sysfs file will be exported at the nd_region level. When the
nvdimm event notifier fires for NVDIMM_REVALIDATE_POISON, the badblocks in
the nd region will be updated. This gives user apps a way to find out
where the badblocks are in the ND region, so apps that are using device
dax can request to clear the poison.
Signed-off-by: Dave Jiang <dave.jiang(a)intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn(a)suse.de>
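
To make the range arithmetic concrete, here is a small userspace sketch of how a poisoned byte range might be clipped to a region and expressed in 512-byte sectors relative to the region start. It is a simplified stand-in, not the kernel's nvdimm_badblocks_populate(); the struct and function names are assumptions for this example. Only the "end = start + size - 1" computation mirrors the patch below.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* assumed: badblocks tracked in 512-byte sectors */

/* Inclusive byte range, like the start/end of a struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Region range as computed in the patch: end = start + size - 1. */
struct range region_range(uint64_t ndr_start, uint64_t ndr_size)
{
	struct range r = { ndr_start, ndr_start + ndr_size - 1 };

	return r;
}

/*
 * Clip a poisoned byte range to the region and report it as a sector
 * offset and sector count relative to the region start.
 * Returns 1 if the ranges overlap, 0 otherwise.
 */
int poison_to_sectors(struct range region, struct range poison,
		      uint64_t *first, uint64_t *count)
{
	uint64_t s, e;

	if (poison.end < region.start || poison.start > region.end)
		return 0;

	s = poison.start > region.start ? poison.start : region.start;
	e = poison.end < region.end ? poison.end : region.end;

	*first = (s - region.start) >> SECTOR_SHIFT;
	*count = ((e - region.start) >> SECTOR_SHIFT) - *first + 1;
	return 1;
}
```

A partially poisoned sector is reported as a whole bad sector here, which is the conservative choice for a consumer that wants to know where not to read.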
---
drivers/nvdimm/nd.h | 1 +
drivers/nvdimm/region.c | 24 ++++++++++++++++++++++++
drivers/nvdimm/region_devs.c | 19 +++++++++++++++++++
3 files changed, 44 insertions(+)
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 2a99c83..c3b33cf 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -154,6 +154,7 @@ struct nd_region {
u64 ndr_start;
int id, num_lanes, ro, numa_node;
void *provider_data;
+ struct badblocks bb;
struct nd_interleave_set *nd_set;
struct nd_percpu_lane __percpu *lane;
struct nd_mapping mapping[0];
diff --git a/drivers/nvdimm/region.c b/drivers/nvdimm/region.c
index 8f24177..869a886 100644
--- a/drivers/nvdimm/region.c
+++ b/drivers/nvdimm/region.c
@@ -14,6 +14,7 @@
#include <linux/module.h>
#include <linux/device.h>
#include <linux/nd.h>
+#include "nd-core.h"
#include "nd.h"
static int nd_region_probe(struct device *dev)
@@ -52,6 +53,17 @@ static int nd_region_probe(struct device *dev)
if (rc && err && rc == err)
return -ENODEV;
+ if (is_nd_pmem(&nd_region->dev)) {
+ struct resource ndr_res;
+
+ if (devm_init_badblocks(dev, &nd_region->bb))
+ return -ENODEV;
+ ndr_res.start = nd_region->ndr_start;
+ ndr_res.end = nd_region->ndr_start + nd_region->ndr_size - 1;
+ nvdimm_badblocks_populate(nd_region,
+ &nd_region->bb, &ndr_res);
+ }
+
nd_region->btt_seed = nd_btt_create(nd_region);
nd_region->pfn_seed = nd_pfn_create(nd_region);
nd_region->dax_seed = nd_dax_create(nd_region);
@@ -104,6 +116,18 @@ static int child_notify(struct device *dev, void *data)
static void nd_region_notify(struct device *dev, enum nvdimm_event event)
{
+ if (event == NVDIMM_REVALIDATE_POISON) {
+ struct nd_region *nd_region = to_nd_region(dev);
+ struct resource res;
+
+ if (is_nd_pmem(&nd_region->dev)) {
+ res.start = nd_region->ndr_start;
+ res.end = nd_region->ndr_start +
+ nd_region->ndr_size - 1;
+ nvdimm_badblocks_populate(nd_region,
+ &nd_region->bb, &res);
+ }
+ }
device_for_each_child(dev, &event, child_notify);
}
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index b7cb506..3500fc8 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -448,6 +448,21 @@ static ssize_t read_only_store(struct device *dev,
}
static DEVICE_ATTR_RW(read_only);
+static ssize_t nd_badblocks_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct nd_region *nd_region = to_nd_region(dev);
+
+ return badblocks_show(&nd_region->bb, buf, 0);
+}
+static struct device_attribute dev_attr_nd_badblocks = {
+ .attr = {
+ .name = "badblocks",
+ .mode = S_IRUGO
+ },
+ .show = nd_badblocks_show,
+};
+
static struct attribute *nd_region_attributes[] = {
&dev_attr_size.attr,
&dev_attr_nstype.attr,
@@ -460,6 +475,7 @@ static struct attribute *nd_region_attributes[] = {
&dev_attr_available_size.attr,
&dev_attr_namespace_seed.attr,
&dev_attr_init_namespaces.attr,
+ &dev_attr_nd_badblocks.attr,
NULL,
};
@@ -476,6 +492,9 @@ static umode_t region_visible(struct kobject *kobj, struct attribute *a, int n)
if (!is_nd_pmem(dev) && a == &dev_attr_dax_seed.attr)
return 0;
+ if (!is_nd_pmem(dev) && a == &dev_attr_nd_badblocks.attr)
+ return 0;
+
if (a != &dev_attr_set_cookie.attr
&& a != &dev_attr_available_size.attr)
return a->mode;
5 years, 1 month
[ndctl PATCH v3 0/7] Add ndctl check-namespace
by Vishal Verma
Changes in v3:
- Move the addition of ccan/bitmap to its own patch(es) (Dan)
- Drop the changelog update from the spec (Dan)
- Fix the [verse] section in the documentation text for check-namespace (Dan)
- Unify all namespace_disable paths to perform checking for a mounted
filesystem (Dan)
- Change the logging to use util/log.h (Dan)
- Use BTT_START_OFFSET for the initial offset, and store it in bttc (Jeff, Dan)
- Fix a number of lines > 80 chars (everything but strings) (Jeff)
- Fix short write error handling, add fsync (Jeff)
- Save system page size in bttc to avoid calling sysconf repeatedly (Jeff)
- In check_log_map(), loop through the entire log even in case of an error,
and if there was a saved error, fail. (Jeff)
- btt-check.sh: in the post repair test, validate that the data read back
is the same as what was written (Jeff)
- Stop playing games with pre-adding/subtracting the initial 4K offset (Jeff)
- btt_read_info doesn't need to use 'rc', return directly.
Changes in v2:
- Move checking functionality to a separate file (Dan, Jeff)
- Rename btt-structs.h to check.h (Dan)
- Don't provide a configure option for building the checker, always
build it in. (Dan, Jeff)
- Fix the Documentation example to also include disable-namespace (Linda)
- Update the description text to note the namespace needs to be disabled
before checking (Linda)
- Use util/size.h for sizes (Dan)
- Use --repair to do repairs instead of --dry-run to disable repairs (Dan)
- Fix btt_read_info short read error handling (Jeff)
- Simplify the map lookup/write routines (Jeff)
- Differentiate the use of BTT_PG_SIZE, sysconf(_SC_PAGESIZE), and SZ_4K
(for the fixed start offset) in the different places they're used (Jeff)
- Add the missing msync when copying over info2 (Jeff)
- Add unit tests to test the checker (Jeff)
- Add a missing error case check in do_xaction_namespace for check
- Add a --force option that allows running on an active namespace (Jeff)
- Add a bitmap test for checking all internal blocks are referenced exactly
once between the map and flog (Jeff)
- Remove unused #defines in check.h
- Add comments to explain what we do with raw_mode (Jeff)
- Add some sanity checking when parsing an arena's metadata (Jeff)
- Refactor some read-verify sequences into a helper that combines the two (Jeff)
- Additional bounds checking on the 'offset' in recover_first_sb attempt 3 (Jeff)
- Add a missing ACTION_DESTROY string in parse_namespace_options (Dan)
- Use uXX, and cpu_to_XX from ccan/endian (Dan)
- Move the fletcher64 routine to util/ as it is shared by builtin-dimm.c (Dan)
- Open the raw block device only once with O_EXCL instead of every time on
read/write/mmap (Dan)
- Add a new 'inform' routine in util/usage.c, and use it for some non-critical
messages (Dan)
- Remove namespace_is_offline() from builtin-check.c. Instead, use
util_namespace_active() from util/json.c
- Add a missing return value check after info block restoration in
discover_arenas
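
The bitmap test mentioned in the v2 changelog (every internal block referenced exactly once between the map and the flog) can be sketched roughly as follows. This is an illustration under simplifying assumptions: the map and flog are plain arrays of internal block numbers rather than on-media BTT structures, a hand-rolled byte bitmap stands in for ccan/bitmap, and the function names are made up.

```c
#include <stdint.h>
#include <stdlib.h>

/* Mark one block as seen; fail on out-of-range or repeated references. */
static int mark_block(uint8_t *bitmap, size_t nblocks, uint32_t blk)
{
	if (blk >= nblocks)
		return -1;	/* reference outside the arena */
	if (bitmap[blk / 8] & (1u << (blk % 8)))
		return -1;	/* block referenced twice */
	bitmap[blk / 8] |= (uint8_t)(1u << (blk % 8));
	return 0;
}

/*
 * Return 0 if every internal block in [0, nblocks) is referenced exactly
 * once across map[] and flog[], and -1 otherwise (including on
 * allocation failure).
 */
int check_referenced_once(const uint32_t *map, size_t nmap,
			  const uint32_t *flog, size_t nflog,
			  size_t nblocks)
{
	uint8_t *seen;
	size_t i;
	int rc = -1;

	if (nblocks == 0)
		return (nmap + nflog == 0) ? 0 : -1;

	seen = calloc((nblocks + 7) / 8, 1);
	if (!seen)
		return -1;

	for (i = 0; i < nmap; i++)
		if (mark_block(seen, nblocks, map[i]))
			goto out;
	for (i = 0; i < nflog; i++)
		if (mark_block(seen, nblocks, flog[i]))
			goto out;

	/* No duplicates and all in range: full coverage iff counts match. */
	if (nmap + nflog == nblocks)
		rc = 0;
out:
	free(seen);
	return rc;
}
```

Since duplicates and out-of-range references are rejected while marking, comparing the reference count with the block count at the end is enough to detect unreferenced blocks without a second pass over the bitmap.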
Vishal Verma (7):
libndctl: add a ndctl_namespace_is_active helper
libndctl: add a ndctl_namespace_disable_safe() API
ccan: Add ccan/bitmap in preparation for the BTT checker
ccan/bitmap: fix a set of gcc warnings (with -Wshadow)
ndctl: move the fletcher64 routine to util/
ndctl: add a BTT check utility
ndctl, test: Add a unit test for the BTT checker
Documentation/Makefile.am | 1 +
Documentation/ndctl-check-namespace.txt | 64 +++
Documentation/ndctl.txt | 1 +
Makefile.am | 7 +-
builtin.h | 1 +
ccan/bitmap/LICENSE | 1 +
ccan/bitmap/bitmap.c | 125 ++++
ccan/bitmap/bitmap.h | 243 ++++++++
contrib/ndctl | 3 +
licenses/LGPL-2.1 | 508 ++++++++++++++++
ndctl.spec.in | 2 +-
ndctl/Makefile.am | 1 +
ndctl/builtin-check.c | 988 ++++++++++++++++++++++++++++++++
ndctl/builtin-dimm.c | 18 +-
ndctl/builtin-list.c | 2 +-
ndctl/builtin-xaction-namespace.c | 112 ++--
ndctl/check.h | 127 ++++
ndctl/lib/libndctl.c | 59 ++
ndctl/lib/libndctl.sym | 2 +
ndctl/libndctl.h.in | 3 +
ndctl/ndctl.c | 1 +
test/Makefile.am | 5 +-
test/btt-check.sh | 170 ++++++
util/fletcher.c | 23 +
util/fletcher.h | 8 +
util/json.c | 17 +-
util/json.h | 1 -
util/util.h | 8 +
28 files changed, 2423 insertions(+), 78 deletions(-)
create mode 100644 Documentation/ndctl-check-namespace.txt
create mode 120000 ccan/bitmap/LICENSE
create mode 100644 ccan/bitmap/bitmap.c
create mode 100644 ccan/bitmap/bitmap.h
create mode 100644 licenses/LGPL-2.1
create mode 100644 ndctl/builtin-check.c
create mode 100644 ndctl/check.h
create mode 100755 test/btt-check.sh
create mode 100644 util/fletcher.c
create mode 100644 util/fletcher.h
--
2.9.3
5 years, 1 month
[PATCH v2] ndctl: Add package dependency information
by Linda Knippers
To help people avoid discovering the required packages
as a result of configure failures, add a reference to
BuildRequires information.
Signed-off-by: Linda Knippers <linda.knippers(a)hpe.com>
---
README.md | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/README.md b/README.md
index 38fc050..b4b3612 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,10 @@ Build
`make check`
`sudo make install`
+There are a number of packages required for the build steps that
+may not be installed by default. For information about the required
+packages, look at the "BuildRequires:" lines in ndctl.spec.in.
+
Documentation
=============
See the latest documentation for the NVDIMM kernel sub-system here:
--
2.5.5
5 years, 1 month
[PATCH] ndctl: Add some package dependency information to README.md
by Linda Knippers
Rather than discover the required packages as a result
of configure failures, provide some hints in advance.
Signed-off-by: Linda Knippers <linda.knippers(a)hpe.com>
---
README.md | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/README.md b/README.md
index 38fc050..e31fab4 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,12 @@ Build
`make check`
`sudo make install`
+There are a number of packages required for the build steps that may not be installed by default. These will be found during the first two steps.
+
+While not a complete list, the packages required to build on Fedora include:
+autogen, libtool, asciidoc, xmlto, kmod-devel, systemd-devel,libuuid-devel, json-c-devel
+The OpenSUSE package list is similar.
+
Documentation
=============
See the latest documentation for the NVDIMM kernel sub-system here:
--
2.5.5
5 years, 1 month
[PATCH] test: add fio test for device-dax
by Dan Williams
Jeff found that device-dax was broken with respect to falling back to
smaller fault granularities. Now that the fixes are upstream ([1], [2]),
add a test to backstop against future regressions and validate
backports.
Note that kernels without device-dax pud-mapping support will always
fail the alignment == 1GiB test. Kernels with the broken fallback
handling will fail the first alignment == 4KiB test.
The test requires an fio binary with support for the "dev-dax" ioengine.
[1]: commit 70b085b06c45 ("device-dax: fix pud fault fallback handling")
[2]: commit 0134ed4fb9e7 ("device-dax: fix pmd/pte fault fallback handling")
Cc: Dave Jiang <dave.jiang(a)intel.com>
Reported-by: Jeff Moyer <jmoyer(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
test/Makefile.am | 1 +
test/device-dax-fio.sh | 74 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 75 insertions(+)
create mode 100755 test/device-dax-fio.sh
diff --git a/test/Makefile.am b/test/Makefile.am
index 98f444231306..969fe055b35e 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -26,6 +26,7 @@ TESTS +=\
dax-dev \
dax.sh \
device-dax \
+ device-dax-fio.sh \
mmap.sh
check_PROGRAMS +=\
diff --git a/test/device-dax-fio.sh b/test/device-dax-fio.sh
new file mode 100755
index 000000000000..ab620b67027f
--- /dev/null
+++ b/test/device-dax-fio.sh
@@ -0,0 +1,74 @@
+#!/bin/bash
+NDCTL="../ndctl/ndctl"
+rc=77
+
+set -e
+
+err() {
+ echo "test/device-dax-fio.sh: failed at line $1"
+ exit $rc
+}
+
+check_min_kver()
+{
+ local ver="$1"
+ : "${KVER:=$(uname -r)}"
+
+ [ -n "$ver" ] || return 1
+ [[ "$ver" == "$(echo -e "$ver\n$KVER" | sort -V | head -1)" ]]
+}
+
+check_min_kver "4.11" || { echo "kernel $KVER may lack latest device-dax fixes"; exit $rc; }
+
+set -e
+trap 'err $LINENO' ERR
+
+if ! fio --enghelp | grep -q "dev-dax"; then
+ echo "fio lacks dev-dax engine"
+ exit 77
+fi
+
+dev=$(./dax-dev)
+for align in 4k 2m 1g
+do
+ json=$($NDCTL create-namespace -m dax -a $align -f -e $dev)
+ chardev=$(echo $json | jq -r ". | select(.mode == \"dax\") | .daxregion.devices[0].chardev")
+ if [ "$align" = "1g" ]; then
+ bs="1g"
+ else
+ bs="2m"
+ fi
+
+ cat > fio.job <<- EOF
+ [global]
+ ioengine=dev-dax
+ direct=0
+ filename=/dev/${chardev}
+ verify=crc32c
+ bs=${bs}
+
+ [write]
+ rw=write
+ runtime=5
+
+ [read]
+ stonewall
+ rw=read
+ runtime=5
+ EOF
+
+ rc=1
+ fio fio.job 2>&1 | tee fio.log
+
+ if grep -q "fio.*got signal" fio.log; then
+ echo "test/device-dax-fio.sh: failed with align: $align"
+ exit 1
+ fi
+
+ # revert namespace to raw mode
+ json=$($NDCTL create-namespace -m raw -f -e $dev)
+ mode=$(echo $json | jq -r ".mode")
+ [ $mode != "memory" ] && echo "fail: $LINENO" && exit 1
+done
+
+exit 0
5 years, 1 month
[PATCH v3 0/3] acpi, nfit: DSM improvements
by Linda Knippers
The first two patches in this series are motivated by feedback
from distribution developers, test engineers, and management tool
developers. We need the ability to test and support Linux and
management tools on systems that support more than one DSM family
for NVDIMM-N.
We also need the ability to test and support new functions without
requiring users to update their kernel. These changes will also
facilitate development as we move toward DSM standardization.
The ability to restrict DSM functions to the currently documented
set is still in place.
The third patch cleans up a cosmetic/modinfo parsing issue with one
of the existing module parameters.
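
One way to picture the override behaviour described above is the model below. It is a hypothetical sketch, not the code in drivers/acpi/nfit/core.c: the function name, the "zero means not set" convention for the override, and the final intersection with the firmware-advertised functions are all assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model of DSM function selection:
 *   builtin_mask  - functions the driver documents for the DSM family
 *   override_mask - value of an override module parameter (0 = not set)
 *   fw_mask       - functions the firmware advertises as implemented
 *
 * The override, when set, replaces the built-in mask; either way only
 * functions that the firmware actually advertises are enabled.
 */
uint64_t effective_dsm_mask(uint64_t builtin_mask, uint64_t override_mask,
			    uint64_t fw_mask)
{
	uint64_t allowed = override_mask ? override_mask : builtin_mask;

	return allowed & fw_mask;
}
```

In this model, leaving the override unset keeps the restriction to the documented set, while setting it lets new functions be tested without a kernel update.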
Changes in v3:
- Simplified dsm address family determination based on Dan's feedback.
Changes in v2:
- Switched from allowing all functions advertised by the
firmware through function 0 to an override module parameter.
Linda Knippers (3):
Allow override of built-in bitmasks for NVDIMM DSMs
Allow specifying a default DSM family
Remove unnecessary newline
drivers/acpi/nfit/core.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
--
1.8.3.1
5 years, 1 month