andreas.dilger at intel.com
Wed Jan 24 13:29:17 PST 2018
On Jan 24, 2018, at 13:59, E.S. Rosenberg <esr at cs.huji.ac.il> wrote:
> On Wed, Jan 24, 2018 at 1:05 PM, Dilger, Andreas <andreas.dilger at intel.com> wrote:
>> On Jan 22, 2018, at 19:03, E.S. Rosenberg <esr+lustre at mail.hebrew.edu> wrote:
>> > Dragging the old discussion back up....
>> > First off, thanks for all the replies last time!
>> > Last time in the end we didn't need to recover but now another user made
>> > a bigger mistake and we do need to recover data.
>> Sounds like it is time for backups and/or snapshots to avoid these issues in the future. If you don't have space for a full filesystem backup, doing daily backups of the MDT simplifies such data recovery significantly. Preferably the backup is done from a snapshot using "dd" or "e2image", but even without a snapshot it is better to do a backup from the raw device than not at all.
> Yeah, I think our next Lustre system is going to be ZFS-based, so we should have at least 1 snapshot at all times (more than that will probably be too expensive).
> OTOH this whole saga is also excellent user education that will genuinely drive the point home that they should only store reproducible data on Lustre, which as defined by us is scratch and not backed up.
>> > I have shut down our Lustre filesystem and am going to do some simulations
>> > on a test system trying various undelete tools.
>> > autopsy (sleuthkit) on the metadata shows that at least the structure is
>> > still there and hopefully we'll be able to recover more.
>> You will need to be able to recover the file layout from the deleted MDT inode (which existing ext4 recovery tools might help with), including the "lov" xattr, which is typically stored inside the inode itself unless the file was widely striped.
>> Secondly, you will also need to recover the matching OST inodes/objects that were deleted. There may be deleted entries in the OST object directories (O/0/d*/) that tell you which inodes the objects were using. Failing that, you may be able to tell from the "fid" xattr of deleted inodes which object they were. Using the Lustre debugfs "stat <inode>" command may help on the OST.
>> You would need to undelete all of the objects in a multi-stripe file for that to be very useful.
>> > Has anyone ever done true recovery of Lustre or is it all just theoretical
>> > knowledge at the moment?
>> > What are the consequences of say undeleting data on OSTs that is then not
>> > referenced on the MDS? Could I cause corruption of the whole filesystem by
>> > doing stuff like that?
>> As long as you are not corrupting the actual OST or MDT filesystems by undeleting an inode whose blocks were reallocated to another file, it won't harm things. At worst it would mean OST objects that are not reachable through the MDT namespace. Running an lfsck namespace scan (2.7+) would link such OST objects into the $MOUNT/.lustre/lost+found directory if they are not referenced from any MDT inode.
>> > (As far as the files themselves go they are most likely all single striped
>> > since that is our default and we are pre PFL so that should be easier I
>> > think).
>> That definitely simplifies things significantly.
> Some of what I wrote before was due to my hope to do in-place recovery and make stuff 'visible' again on lustre.
> I actually ran into a different interesting issue: it seems extundelete balks at huge ext filesystems (33T). It considers some of the superblock values to be out-of-domain (a quick look at the source suggests to me that they assumed INT32, but 32T is also the limit of ext3).
This seems like it wouldn't be too hard for you to fix?
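
As a rough check of the INT32 theory, the block count of a 33T filesystem can be compared against the 32-bit limits. This is only a sketch: the 4 KiB block size and the reading of "33T" as 33 TiB are assumptions (neither is stated in this thread), and the helper names are made up for illustration.

```python
TIB = 1 << 40  # bytes in one TiB


def block_count(size_bytes, block_size=4096):
    """Number of filesystem blocks for a given size (block size is assumed)."""
    return size_bytes // block_size


def fits(value, bits, signed):
    """True if value is representable in an integer of the given width."""
    limit = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    return 0 <= value <= limit


blocks = block_count(33 * TIB)
print("blocks:", blocks)
print("fits in signed 32-bit:  ", fits(blocks, 32, signed=True))
print("fits in unsigned 32-bit:", fits(blocks, 32, signed=False))

# Largest size addressable with 32-bit block numbers at this block size.
print("32-bit block number limit (TiB):", ((1 << 32) * 4096) // TIB)
```

Under those assumptions the block count overflows both signed and unsigned 32-bit fields, so any tool that stores block numbers in 32 bits would have to be widened to 64 bits to handle this filesystem.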
> ext4magic returns the error 2133571465 from e2fsprogs, which according to the source maps to EXT2_ET_CANT_USE_LEGACY_BITMAPS; not sure what to make of that.
> And bringing up the rear is ext3grep, which doesn't know xattrs and therefore stops.
It would be possible for you to update these tools to support the new features.
Look at the e2fsprogs git history for when EXT2_ET_CANT_USE_LEGACY_BITMAPS was
added. IIRC the tool needs to pass an extra flag to ext2fs_open(), and possibly
use wrapper functions if it is accessing bitmaps.
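
For reading numeric e2fsprogs errors like the one above: com_err error codes keep the index within an error table in the low 8 bits, and the remaining bits identify the table. A minimal sketch of splitting the reported code (the function name is made up for illustration; look up the resulting index in the generated ext2_err.h of your e2fsprogs version to confirm the symbol):

```python
ERRCODE_RANGE = 8  # com_err keeps the per-table error index in the low 8 bits


def split_com_err(code):
    """Split a com_err code into (table base, index within that table)."""
    mask = (1 << ERRCODE_RANGE) - 1
    return code & ~mask, code & mask


base, index = split_com_err(2133571465)
print("table base:", base, hex(base))
print("index in table:", index)
```

The base that comes out should correspond to the ext2 error table, which supports reading the number as a libext2fs error rather than an errno value.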
Lustre Principal Architect