[Lustre-discuss] Failed OST Cleanup

Scott Barber scott at imemories.com
Wed Jun 2 10:54:50 PDT 2010


[Lustre v1.8.3 Servers on CentOS 5.4]

We had a piece of hardware fail and it killed all data on an OST. On
the MDS I ran:
lctl --device 25 deactivate
lctl conf_param sanvol06-OST0013.osc.active=0

Lustre is back up and running and now we're in cleanup mode.

I'm trying to get a list of the files that are now corrupt. On one of
the Lustre clients I'm running:
lfs find --obd sanvol06-OST0013_UUID  <my lustre mount point>

It starts to list files and then a few minutes later it runs into an
error and stops:
cb_find_init: IOC_LOV_GETINFO on <filename> failed: Input/output error.

In dmesg I see:
LustreError: 13926:0:(file.c:1053:ll_glimpse_size()) obd_enqueue returned rc -5, returning -EIO

The file that gets that "Input/output error" cannot be deleted or
removed from the file system. How can I get around this?

At the end of the day I need to get a list of the files that were on the
bad OST and then be able to remove them.
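
To make the goal concrete, here is a rough sketch of the two steps: save the
list of affected files, then remove each one. The function names and the list
file are hypothetical. The first function uses only the lfs find form shown
above. The second calls unlink(1) instead of rm, on the untested assumption
that a plain unlink may succeed where rm hits the I/O error; nothing in this
report confirms that it does.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the list file are illustrative only.

# Write the paths that lfs find reports for one OST into a list file.
# Uses only the `lfs find --obd <uuid> <mount point>` form shown above.
list_files_on_ost() {
    ost_uuid=$1
    mount_point=$2
    list_file=$3
    lfs find --obd "$ost_uuid" "$mount_point" > "$list_file"
}

# Remove every path named in the list file, one path per line.
# Assumption: unlink(1) may work on files whose OST objects are gone,
# where rm fails; this is not verified here.
remove_listed_files() {
    list_file=$1
    while IFS= read -r file_path; do
        # Skip blank lines so unlink is never called with an empty name.
        [ -n "$file_path" ] || continue
        unlink "$file_path" || echo "could not unlink: $file_path" >&2
    done < "$list_file"
}
```

Usage would look something like `list_files_on_ost sanvol06-OST0013_UUID
<my lustre mount point> bad_files.txt`, a review of bad_files.txt, and then
`remove_listed_files bad_files.txt`. Keeping the list in a file means it can
be checked before anything is deleted.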

Thanks for your help,
Scott Barber
iMemories.com
Senior Systems Administrator


