Yasunori Goto <y-goto(a)jp.fujitsu.com> writes:
>> > Another approach could be to integrate NVDIMM event
>> > monitoring into some other utility, like rasdaemon. I'm interested in
>> > your thoughts.
>> I'm not sure yet which utility (existing or new) is appropriate, but
>> I prefer this approach, so I'll think about it.
> My co-worker and I investigated the notification/monitoring feature for
> over-threshold events. Here is our current understanding.
> a) rasdaemon
> It is a good tool for machine check errors, and if a machine check occurs on
> an NVDIMM, I suppose it will work for NVDIMMs as well as for traditional RAM.
> However, it may not fit the purpose of notifying/monitoring threshold events.
> b) smartmontools (https://www.smartmontools.org/)
> This tool may fit the purpose of notifying/monitoring the health of NVDIMMs.
> However, it may be a bit troublesome for the following reasons.
> - smartd seems to check the SMART values of each device periodically
> with ioctl() (in other words, by polling).
> Other devices probably do not have a notification interface
> like "ndctl_dimm_get_health_eventfd()
> and poll()/select()".
> - smartmontools supports many OSes (Windows, Darwin, the various BSDs, OS/2(!)).
> I'm not sure whether other OSes have a notification interface similar to Linux's.
> So, it may need to poll NVDIMMs as it does other devices.
> c) udev
> udev can launch any program if a udev rule is created for it.
> However, there is currently no uevent for the over-threshold event.
> In addition, I'm not sure that udev fits this type of event notification.
> d) make a new tiny daemon in the ndctl tree
> This may be a simpler way.
> It can use ndctl_dimm_get_health_eventfd() and poll()/select().
> But ndctl may be included in the kernel source,
> and I don't know whether the kernel includes other daemon tools or not.
Except acpid is ACPI specific, and the event sources that libnvdimm
generates are generic. For example, we may be getting an Open Firmware
libnvdimm bus in the next merge window.
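
For reference, here is a minimal sketch of the wait-on-a-descriptor
pattern that option d) above describes, as opposed to the periodic
ioctl() polling discussed under b). It does not link against libndctl:
a plain Linux eventfd(2) stands in for the descriptor that
ndctl_dimm_get_health_eventfd() would hand back, and the helper names
(make_stand_in_fd, signal_event, consume_event, wait_for_event) are
hypothetical, chosen only for this illustration. Which poll event bits
the real health descriptor reports is not covered here and would need
to be checked against libndctl.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * ASSUMPTION: stand-in for the descriptor a real daemon would obtain from
 * ndctl_dimm_get_health_eventfd(). Here it is just an eventfd(2) counter.
 */
static int make_stand_in_fd(void)
{
	return eventfd(0, EFD_NONBLOCK);
}

/* Simulate an over-threshold notification by bumping the eventfd counter. */
static int signal_event(int fd)
{
	uint64_t one = 1;

	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Clear the pending notification (eventfd-specific).
 * Returns the number of signals consumed, or 0 if none were pending.
 */
static uint64_t consume_event(int fd)
{
	uint64_t count = 0;

	if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}

/*
 * Sleep until the descriptor becomes readable or timeout_ms expires.
 * Returns 1 if an event is pending, 0 on timeout, -1 on error.
 */
static int wait_for_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rc;

	assert(fd >= 0);
	rc = poll(&pfd, 1, timeout_ms);
	if (rc <= 0)
		return rc;
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A daemon built this way would call wait_for_event() with a timeout of
-1 and sleep until the kernel wakes it, rather than waking up on a
timer to re-read health values.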