I'm no LNET expert, but for debugging you can run:
lctl set_param debug=-1
to enable all Lustre debugging (CDEBUG) messages. Later, after reproducing the problem, dump the debug log to a file with:
lctl dk /tmp/debug
Your error message should probably print the value as a signed number instead of unsigned,
since it is just a small negative number.
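To make that concrete, here is a minimal Python sketch (not Lustre code; the helper name as_signed64 is made up for illustration) that reinterprets the printed value as a signed 64-bit integer. It comes out as -110, and on Linux errno 110 is normally ETIMEDOUT, so the RPC most likely timed out. The errno lookup is platform-dependent, so treat the symbolic name as an assumption on other systems.

```python
import errno
import os


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    value &= (1 << 64) - 1  # keep only the low 64 bits
    return value - (1 << 64) if value >= (1 << 63) else value


# Value copied from the "Unknown error ..." console message.
rc = as_signed64(18446744073709551506)
print(rc)  # -110

# Map the positive errno to its symbolic name and description.
# The result depends on the platform's errno table.
print(errno.errorcode.get(-rc, "unknown"), "-", os.strerror(-rc))
```

Printing the return code with a signed format in the error message would show the negative errno directly and make this conversion unnecessary.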
Cheers, Andreas
On Jan 30, 2014, at 8:35, "Tobias Groschup"
<groschup(a)stud.uni-heidelberg.de> wrote:
Hello,
I am still struggling with the GET message on the LND level. After adding a test to the
lnet selftest, there is one GET going through the LND, and after that nothing happens for
some time, until this error message is dumped to the console:
add test RPC failed on 12345-1@ex: Unknown error 18446744073709551506
Is there any way to find out what caused this error? That would be a great help in
finding what the LND does wrong.
I consulted the different log files like /var/log/dmesg and /var/log/messages. On my
system, there is no file log-lustre under /tmp. So, I do not know how to investigate this
error further. Any help on this matter would be very much appreciated!
Thanks and kind regards
Tobias Groschup
_______________________________________________
HPDD-discuss mailing list
HPDD-discuss(a)lists.01.org
https://lists.01.org/mailman/listinfo/hpdd-discuss