We have experienced an LBUG on one of our OSS servers that I have not
seen before:
LustreError: 16066:0:(filter_io_26.c:344:filter_do_bio()) ASSERTION(rw ==
OBD_BRW_READ) failed
LustreError: 16066:0:(filter_io_26.c:344:filter_do_bio()) LBUG
Pid: 16066, comm: ll_ost_io_126
Now, after rebooting that OSS, the same LBUG is triggered as soon as the
OSTs finish recovery and start serving their data. Has anyone seen this
before?
Our environment:
servers: RHEL6 2.6.32-220.17.1.el6_lustre.x86_64 Lustre-2.1.2
clients: RHEL6 2.6.32-358.6.2.el6.x86_64 Lustre-2.1.5 patchless
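For anyone triaging a similar log, here is a minimal Python sketch that splits a LustreError line into PID, source file, line number, function and message, so the same thread can be searched for in syslog. The regular expression is an assumption based only on the lines quoted above, and the function name parse_lustre_error is hypothetical. Other Lustre versions may format these messages differently. It does not identify the affected OST by itself.

```python
import re

# Pattern derived only from the LustreError lines quoted in this post
# (an assumption; other Lustre releases may use a different layout).
LUSTRE_ERROR_RE = re.compile(
    r"LustreError:\s+(?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<message>.*)"
)


def parse_lustre_error(text):
    """Return the fields of one LustreError line, or None if it does not match."""
    match = LUSTRE_ERROR_RE.search(text)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "file": match.group("file"),
        "line": int(match.group("line")),
        "func": match.group("func"),
        "message": match.group("message").strip(),
    }


if __name__ == "__main__":
    sample = "LustreError: 16066:0:(filter_io_26.c:344:filter_do_bio()) LBUG"
    print(parse_lustre_error(sample))
```

Searching the OSS syslog for the extracted PID may turn up neighbouring messages that name the OST involved.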
We resolved this by first identifying which OST was involved in the
LBUG and then running fsck on that OST. The file system check found an
incorrect inode number and repaired it. It is very worrying, though, that
this corruption crept into the filesystem. We did not experience any
hardware problems or unexpected storage crashes that could have caused
it.
--
Wojciech Turek