[lustre-discuss] Attempting to recover zfs ost after file corruption
mohrrf at ornl.gov
Wed Mar 3 14:03:50 PST 2021
I have a file system running Lustre 2.10.4 on CentOS 7.5 with ZFS 0.7.9 that I am attempting to keep functional until we can move the data to a new Lustre file system. A couple of OSTs recently suffered data corruption. After getting them imported and running a scrub, it seems the errors may be confined to two directories on the OST's underlying ZFS file system: CONFIGS/ and oi.10/.
Is it possible to simply remove these files and have them rebuilt automatically when the OST is remounted? My hope is that any files under CONFIGS/ would be repopulated when the OST connects to the MGS. If needed, I can always extract the files directly from the MGT. The one thing I am not sure about is how to handle the oi.10/ directory.
I reviewed the procedure in the Lustre manual for restoring an OST from a file-level backup. Since it looks like all the user files are still intact, my thought was that I could skip the actual file restoration step and just proceed with the steps that remove CATALOGS, oi.*, LFSCK, etc. The main difference is that, since I am not reformatting the OST, I wouldn't be able to add the "--replace" flag, which sounds like it is used to trigger some of the recovery steps.
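To make the cleanup step I have in mind concrete, here is a rough dry-run sketch that only lists what would be removed and deletes nothing. The function name and the mount point in the usage note are placeholders of mine, and the patterns are just the three names mentioned above, so the list is incomplete compared to whatever the manual's "etc." covers.

```python
import fnmatch
import os

# Only the names mentioned in this message; the manual's list may be longer,
# so treat this as deliberately incomplete.
PATTERNS = ["CATALOGS", "oi.*", "LFSCK"]


def find_cleanup_candidates(root, patterns=PATTERNS):
    """Return the sorted top-level entries under `root` that match any pattern.

    This is a dry run: it only inspects directory entry names and never
    removes or modifies anything.
    """
    matches = []
    for name in os.listdir(root):
        # Case-sensitive match, so "oi.10" matches "oi.*" but "O" does not.
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns):
            matches.append(os.path.join(root, name))
    return sorted(matches)
```

Calling find_cleanup_candidates("/mnt/ost_zfs") (a hypothetical path where the OST dataset would be mounted as plain ZFS) returns the paths I would review by hand before removing anything.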
Any help is greatly appreciated.