[lustre-discuss] Data recovery with lost MDT data

Andreas Dilger adilger at whamcloud.com
Thu Sep 21 13:31:34 PDT 2023


In the absence of backups, you could try LFSCK to link all of the orphan OST objects into .lustre/lost+found (see the lctl-lfsck_start.8 man page for details).
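A minimal sketch of that step, run on the MDS (double-check the options against the lctl-lfsck_start.8 page for your release):

    # Start a layout LFSCK; the -o/--orphan option links orphan
    # OST objects into .lustre/lost+found on the client mount.
    lctl lfsck_start -t layout -o

    # Watch the status field until it reports "completed":
    lctl get_param mdd.*.lfsck_layout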

The data is still in the objects, and they should have UID/GID/PRJID assigned (if used), but they have no filenames.  It would be up to you to create, e.g., per-user lost+found directories in the users' home directories, move the files there so the owners can access them, and let them decide whether to keep or delete the files.
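A hypothetical sketch of that sorting step (the client mount point and the /home layout here are assumptions):

    # Move orphans into per-user lost+found directories by owner UID.
    # Recovered objects have FID-based names, e.g. [0x200000401:0x1:0x0]-R-0,
    # so users will have to identify and rename them afterwards.
    cd /mnt/lustre/.lustre/lost+found/MDT0000
    for f in *; do
        uid=$(stat -c %u "$f")
        user=$(id -nu "$uid" 2>/dev/null) || continue   # skip unknown UIDs
        mkdir -p "/home/$user/lost+found"
        mv "$f" "/home/$user/lost+found/"
    done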

How easy or hard this is depends on whether the files have any content that can help identify them.
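For a first pass, even file(1) can help classify what the recovered objects contain:

    # Group recovered files by detected content type (paths illustrative):
    file /home/*/lost+found/* | sort -t: -k2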

There was a Lustre hackathon project to save the Lustre JobID in a "user.job" xattr on every object, precisely to help identify the provenance of files after the fact (regardless of whether there is corruption), but it only just landed on master and will be in 2.16.  That is cold comfort now, but it would help in the future.
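Once that feature is available, reading it back should be possible with standard xattr tools, presumably something like:

    # Read the saved JobID from a file's "user.job" xattr
    # (Lustre 2.16+; getfattr(1) is from the standard attr package):
    getfattr -n user.job /mnt/lustre/path/to/file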

Cheers, Andreas

On Sep 20, 2023, at 15:34, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:


Hello,

We have recently accidentally deleted some of our MDT data.  I think it's gone for good, but I'm looking for advice in case there is any way to recover it.  Thoughts appreciated.

We run two Lustre filesystems on the same set of hardware.  We didn't set out to do this, but it kind of evolved.  The original setup was a single filesystem and was all ZFS, both the MDT and the OSTs.  Eventually, we had some small-file workflows that we wanted better performance on, so we stood up a second filesystem on the same hardware with an ldiskfs MDT.  However, since we were already using ZFS, the block device we built that ldiskfs MDT on is, under the hood, a ZFS zvol, which gets presented to the OS as /dev/zd0.

We do a nightly backup of the MDT by cloning the ZFS dataset (this creates /dev/zd16, for whatever reason), snapshotting the clone, mounting that as ldiskfs, tarring up the data, and then destroying the snapshot and clone.  Occasionally this process gets interrupted, leaving the ZFS snapshot and clone hanging around.  This is where things go south: something happens that swaps the clone with the primary dataset, so ZFS says you're working with the primary when it's really the clone, and vice versa.  This happened about a year ago; we caught it, were able to "zfs promote" to swap them back, and moved on.  More details are in these earlier threads on the ZFS and Lustre mailing lists (a rough sketch of the backup sequence follows the links):

https://zfsonlinux.topicbox.com/groups/zfs-discuss/Tcb8a3ef663db0031-M5a79e71768b20b2389efc4a4

http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2022-June/018154.html
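For reference, the backup sequence is roughly the following; the dataset names, mount points, and tar options are illustrative rather than our exact script:

    # Illustrative nightly MDT backup sequence (names hypothetical):
    zfs snapshot mdtpool/mdt@nightly              # snapshot the zvol
    zfs clone mdtpool/mdt@nightly mdtpool/mdt-bk  # clone appears as /dev/zd16
    mount -t ldiskfs -o ro /dev/zd16 /mnt/mdt-bk
    tar -C /mnt/mdt-bk --xattrs --xattrs-include="trusted.*" --sparse \
        -czf /backups/mdt-$(date +%F).tgz .
    umount /mnt/mdt-bk
    zfs destroy mdtpool/mdt-bk                    # remove the clone first,
    zfs destroy mdtpool/mdt@nightly               # then its origin snapshot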

It happened again earlier this week, but we didn't remember to check this and, in an effort to get the backups going again, destroyed what we thought were the snapshot and clone.  In reality, we destroyed the primary dataset.  Even more unfortunately, the stale "snapshot" was about 3 months old.  This stale snapshot was also preventing our MDT backups from running, so we don't have those to restore from either.  (I know, we need better monitoring and alerting on this; we learned that lesson the hard way.  We had it in place after the June 2022 incident, it just wasn't working properly.)  So at the end of the day, the data lives on the OSTs, we just can't access it due to the lost metadata.  Is there any chance of data recovery?  I don't think so, but I want to explore any options.
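In case it helps anyone else watching for this failure mode: the swap should be detectable from the "origin" property, since a dataset that is secretly the clone reports a non-empty origin.  Something like:

    # A true primary reports origin "-"; a clone reports the snapshot
    # it was created from.  Alert on any unexpected origin:
    zfs list -H -o name,origin -t filesystem,volume | awk '$2 != "-"'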

Darby
