On 2013-07-12, at 15:01, "Vicker, Darby (JSC-EG311)" wrote:
I've got a question about locating files on a given OST. First
We recently had one hard drive fail, and two other drives developed SMART
errors in short order, on one of our OSSs. This OSS is running hardware RAID6, so we
didn't lose the RAID array or the OSTs (there are 4 OSTs on this OSS), but
the controller is now reporting that there are some bad stripes on the array. The only
way to clear the bad stripes is to reformat the array, so we are now in the process of
migrating the data off of these OSTs. So we got a list of the files on those OSTs:
lfs find /lustre/ -obd hpfs-eg3-OST002c > oss11ost1
lfs find /lustre/ -obd hpfs-eg3-OST002d > oss11ost2
lfs find /lustre/ -obd hpfs-eg3-OST002e > oss11ost3
lfs find /lustre/ -obd hpfs-eg3-OST002f > oss11ost4
You can pass multiple OSTs to "lfs find" at once, so it needs to make only
a single pass through the filesystem.
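
A rough sketch of the single-pass form follows. The four OST names are the ones
from the commands above. The comma-separated list given to --obd and the helper
names join_osts and find_on_osts are assumptions made for illustration, so check
"lfs help find" on your Lustre version before relying on it. The helper only
prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Join all arguments into one comma-separated string.
join_osts() {
    old_ifs=$IFS
    IFS=,
    joined="$*"
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

# $1 = Lustre mount point, remaining arguments = OST names.
# Assumption: "lfs find --obd" accepts a comma-separated OST list.
find_on_osts() {
    mnt=$1
    shift
    osts=$(join_osts "$@")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: show the command that would be executed.
        printf 'lfs find %s --obd %s\n' "$mnt" "$osts"
    else
        lfs find "$mnt" --obd "$osts"
    fi
}

find_on_osts /lustre hpfs-eg3-OST002c hpfs-eg3-OST002d \
    hpfs-eg3-OST002e hpfs-eg3-OST002f
```

Note that a combined run produces one list for all four OSTs, so it does not
say which OST each file lives on.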
And now we're using lfs_migrate to move the files to other
OSTs. The migration is ongoing and, fortunately, the data corruption due to the bad
stripes seems to be minimal. Once we think the migration is done, we intend
to repeat the lfs find commands to verify the OSTs are indeed empty. My question: is
there a faster way to get the file list? We ran those lfs find commands in parallel but
it took over 12 hours for them to complete and found about 28 million files. The MDT is
built on top of LVM so I was imagining something along the lines of taking an LVM
snapshot, mounting that read-only and extracting the info. But I know nothing about the
file system structure so this may not even be possible. Any advice would be appreciated -
either about this approach or something else that would be faster.
What you describe is also possible, and I think some of the large Lustre sites have tools
to do this. It is also possible (with Lustre 2.x) to use a tool like RobinHood that
maintains an index of the filesystem continuously using the Lustre ChangeLog.
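
For the re-check after migration mentioned above, a minimal sketch of an
"is the list empty" test is shown here. It only counts non-blank lines on
stdin; the function name check_empty is hypothetical, and the lfs find
invocation in the comment is an assumption to adapt to your own setup.

```shell
#!/bin/sh
# Read a file list on stdin and report whether anything is left.
check_empty() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the count without treating that as a failure.
    remaining=$(grep -c . || true)
    if [ "$remaining" -eq 0 ]; then
        echo "empty"
    else
        echo "$remaining files remaining"
        return 1
    fi
}

# Intended use against a real filesystem (not run here):
#   lfs find /lustre --obd hpfs-eg3-OST002c | check_empty
printf '' | check_empty
```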