[lustre-discuss] corrupt FID on zfs?
Dilger, Andreas
andreas.dilger at intel.com
Mon Apr 9 18:49:33 PDT 2018
On Apr 9, 2018, at 02:10, Stu Midgley <sdm900 at gmail.com> wrote:
>
> Afternoon
>
> We have copied all the files off an OST (lfs find identifies no files on the OST), but the OST still has some leftover files
>
> eg.
>
> 9.6G O/0/d22/1277942
>
> when I get the FID of this file using zfsobj2fid it appears to get a corrupt FID
>
> [0x200000a48:0x1e86e:0x1]
>
> which then returns
>
> bad FID format '[0x200000a48:0x1e86e:0x1]', should be [seq:oid:ver] (e.g. [0x200000400:0x2:0x0])
>
> fid2path: error on FID [0x200000a48:0x1e86e:0x1]: Invalid argument
>
> when I check it with lfs fid2path
Try it with the last field as 0x0, like "[0x200000a48:0x1e86e:0x0]".
On the OST, we use the last field to store the stripe index for the file,
so that LFSCK can reconstruct the file layout even if the MDT inode is
corrupted.
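
A minimal sketch of that workaround, in case it helps with scripting: it parses the FID string printed by zfsobj2fid, treats the last field as the stripe index, and emits the same FID with the last field forced to 0x0 for use with "lfs fid2path". The function name split_ost_fid and the regular expression are illustrative only; they are not part of zfsobj2fid or any Lustre tool.

```python
import re

# Matches "[seq:oid:last]" with hex fields; the brackets are optional.
FID_RE = re.compile(
    r"^\[?(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)\]?$"
)


def split_ost_fid(ost_fid):
    """Split an OST-side FID into (fid_for_fid2path, stripe_index).

    Assumption (from the explanation above): on the OST the last field
    holds the stripe index, so it is replaced with 0x0 before the FID
    is handed to "lfs fid2path".
    """
    match = FID_RE.match(ost_fid.strip())
    if match is None:
        raise ValueError("unrecognised FID format: %r" % ost_fid)
    seq, oid, last = (int(field, 16) for field in match.groups())
    return "[{:#x}:{:#x}:0x0]".format(seq, oid), last


if __name__ == "__main__":
    fid, stripe = split_ost_fid("[0x200000a48:0x1e86e:0x1]")
    print(fid, "stripe index", stripe)
```

Under the same reading, the 0x7 and 0x3 values reported further down would be stripe indices 7 and 3 rather than corruption.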
> WTF?
>
> Checking a few OSTs, this isn't isolated. I've seen a few different corruptions, e.g.
>
> [0x200000a48:0x1e86e:0x7]
> [0x200000a48:0x1e684:0x3]
>
>
> Also, quite a few files under the O/0/ directory didn't have trusted.fid set... which seemed strange.
That is not unusual, since the parent (MDT inode) FID is only stored into the
object if it is modified by a client, or if an LFSCK layout check is run.
> So a few questions.
> How did the FID type get corrupt?
> How did this file get orphaned?
>
> I had to modify zfsobj2fid to work with a mounted snapshot of the ZFS volume
>
> # diff ../zfsobj2fid /sbin/zfsobj2fid
> 38c38
> < p = subprocess.Popen(["zdb", "-O", "-vvv", sys.argv[1], sys.argv[2]],
> ---
> > p = subprocess.Popen(["zdb", "-e", "-vvv", sys.argv[1], sys.argv[2]],
It would be great if you could submit this as a patch to Gerrit.
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation