On Fri, May 25, 2018 at 1:49 AM, Qi, Fuli <qi.fuli(a)jp.fujitsu.com> wrote:
>> As mentioned above this function seems to assume that the only DIMM events
>> to send are DIMM health events. It's ok to save other object monitoring to
>> a later patch, but let's at least support DIMM health events:
>>
>> dimm-spares-remaining
>> dimm-media-temperature
>> dimm-controller-temperature
>> dimm-health-state
>>
>> ...and DIMM detected events:
>>
>> dimm-unclean-shutdown
>> dimm-detected
>>
>> There should be an event type included in the json. Along with 'timestamp'
>> and 'pid' I think we need an 'event' field so that consumer code can make
>> assumptions about the format of the event record. I think 'dimm-health' and
>> 'dimm-detect' are the only event record types we need to support in this
>> initial version.
>
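To make the record layout discussed above concrete, here is a minimal sketch of
one event record carrying 'timestamp', 'pid' and 'event'. The helper name
format_event_record(), the "dimm" key, and the use of seconds since the epoch
for the timestamp are assumptions for illustration only; none of them comes
from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: format one monitor event record as a JSON object.
 * The "timestamp", "pid" and "event" keys come from the discussion; the
 * "dimm" key and the function name are assumptions for illustration.
 * Returns the snprintf() result, i.e. the length the full record needs.
 */
static int format_event_record(char *buf, size_t len, long long timestamp,
			       pid_t pid, const char *event, const char *dimm)
{
	assert(buf && event && dimm);
	/* The strings are assumed to need no JSON escaping. */
	return snprintf(buf, len,
			"{\"timestamp\":%lld,\"pid\":%ld,"
			"\"event\":\"%s\",\"dimm\":\"%s\"}",
			timestamp, (long)pid, event, dimm);
}
```

A consumer could then read the 'event' value first and decide how to
interpret the remaining fields of the record.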
> Hi, Dan
> I would like to confirm whether my understanding of each dimm-event is
> correct.
> a) dimm-spares-remaining
> Check the Spare Block Remaining Trip in Alarm Trips; if it is set, the
> notification is dimm-spares-remaining.

Yes.

> b) dimm-media-temperature
> Check the NVDIMM Media Temperature Trip in Alarm Trips; if it is set, the
> notification is dimm-media-temperature.

Yes.

> c) dimm-controller-temperature
> Check the NVDIMM Controller Temperature Trip in Alarm Trips; if it is set,
> the notification is dimm-controller-temperature.

Yes.

> d) dimm-health-state
> Check the Health Status; if it has changed, the notification is
> dimm-health-state.

Yes.

> e) dimm-unclean-shutdown
> Check the Last Shutdown Status; if it has changed, the notification is
> dimm-unclean-shutdown.

Yes.

> f) dimm-detected
> Check the UUID of the DIMM; if it has changed, the notification is
> dimm-detected.

No, this would fire for each DIMM detected when the daemon starts up,
and for any future DIMM that is hot plugged into the system. The
notification of hotplugged DIMM devices would be a uevent. We can save
this notification for later as it is delivered differently from the
'health_event_fd' over which these other notifications are communicated.

> Is there a possibility that a notification contains multiple
> dimm-events?

Yes, we may only get 1 notification from the kernel, but all of these
items might have changed / tripped.

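As a rough sketch of what that means for the monitor: on each notification it
can re-check every item and collect all of the event names that apply, rather
than stopping at the first match. The TRIP_* bit values, struct dimm_state and
collect_events() below are hypothetical names invented for this example; a
real monitor would take the flags and status values from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alarm-trip bits; a real monitor would use the library's. */
#define TRIP_SPARES      (1u << 0)
#define TRIP_MEDIA_TEMP  (1u << 1)
#define TRIP_CTRL_TEMP   (1u << 2)

/* Hypothetical snapshot of the values discussed above for one DIMM. */
struct dimm_state {
	unsigned int alarm_trips;	/* bitmask of TRIP_* */
	unsigned int health;		/* Health Status */
	unsigned int last_shutdown;	/* Last Shutdown Status */
};

/*
 * Compare the previous and current snapshot and collect every event name
 * that applies. Returns the number of names stored in out[] (at most max).
 */
static size_t collect_events(const struct dimm_state *prev,
			     const struct dimm_state *cur,
			     const char **out, size_t max)
{
	size_t n = 0;

	assert(prev && cur && out);

	/* Alarm trips: reported whenever the bit is set. */
	if ((cur->alarm_trips & TRIP_SPARES) && n < max)
		out[n++] = "dimm-spares-remaining";
	if ((cur->alarm_trips & TRIP_MEDIA_TEMP) && n < max)
		out[n++] = "dimm-media-temperature";
	if ((cur->alarm_trips & TRIP_CTRL_TEMP) && n < max)
		out[n++] = "dimm-controller-temperature";

	/* Status values: reported when they differ from the last snapshot. */
	if (cur->health != prev->health && n < max)
		out[n++] = "dimm-health-state";
	if (cur->last_shutdown != prev->last_shutdown && n < max)
		out[n++] = "dimm-unclean-shutdown";

	return n;
}
```

With a single notification where the spares and controller temperature trips
are set and the health status has changed, this yields three event names.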
> Should I turn off the event alarm after the notification is logged?

No, if the kernel continues to send events then the monitor should log them.
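
In other words, the loop only waits and logs; it never disarms anything. A
minimal sketch of that shape follows, assuming a generic descriptor that
becomes readable (POLLIN) and delivers one byte per event. That signalling
model and the name monitor_loop() are assumptions for illustration; the real
'health_event_fd' may signal readiness differently.

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical monitor loop: wait on an event descriptor and log every
 * notification that arrives. Nothing is turned off after logging, so
 * repeated events keep producing log lines. Returns the number of events
 * logged once no event arrives within timeout_ms, or -1 if poll() fails.
 */
static int monitor_loop(int fd, FILE *log, int timeout_ms)
{
	int logged = 0;

	assert(log);
	for (;;) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		char token;
		int rc = poll(&pfd, 1, timeout_ms);

		if (rc < 0)
			return -1;	/* includes EINTR in this sketch */
		if (rc == 0)
			return logged;	/* quiet period: end the sketch */
		if (!(pfd.revents & POLLIN))
			return logged;	/* hangup or error on the fd */
		if (read(fd, &token, 1) != 1)
			return logged;
		fprintf(log, "event %d received\n", ++logged);
	}
}
```

A long-running daemon would normally pass a timeout of -1 so that it blocks
until the next event instead of returning after a quiet period.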