[lustre-discuss] recovery MDT ".." directory entries (LU-5626)

Martin Hecht hecht at hlrs.de
Wed Nov 4 01:42:46 PST 2015


On 11/04/2015 03:23 AM, Patrick Farrell wrote:
> PAF: Remember, the specific conditions are pretty tight.  Created under 1.8, not empty (if it's empty, the .. dentry is not misplaced when moved) but also non-htree, then moved with dirdata enabled, and then grown to this larger size.  How many existing (small) directories do you move and then add a bunch of files to?  It's a pretty rare operation.  We only hit it at Martin's site because of an automated tool they have to re-arrange user/job directories.
Well, not only because of the tool. In fact, once the tool has moved the
directories, no files are added to them anymore.
However, our mechanism gives users a reason to move their data
from time to time (that's not the intention of the mechanism, but that's
how some users react).

But I'm no longer sure that moving the directories is really a
precondition for running into LU-5626.
We have run the background lfsck, which adds the FID to the existing
dentries. This might be an important detail, because in our case the
second '..' entry containing the FID was presumably created by lfsck (in
the wrong place), and not by moving the directory. As far as I currently
understand, the user then only has to add some files to trigger the LBUG.
A subsequent e2fsck will find not only this particular directory but also
all other small directories with a '..' entry in the wrong place. When
e2fsck tries to fix these directories, some entries are overwritten by
the FID, and the corresponding files are then moved to lost+found.
If one of these first entries happens to be a small subdirectory, I
believe there is a chance of running into the same issue again when you
move everything back to the original location after the e2fsck and
someone starts adding files to these subdirectories.

However, the preconditions are still quite narrow: the directory must be
small, not empty, created without a FID, and then converted by lfsck (or
alternatively moved to a different place, which would also create the
second '..' entry). To trigger the LBUG, files need to be added to one of
these directories. For a second occurrence of the LBUG, the same
conditions must hold for a subdirectory whose entry was at the very
beginning of its parent directory.
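
The preconditions listed above can be written down as a simple checklist.
What follows is only a rough Python sketch, not Lustre code: the DirState
class, its field names, and both helper functions are hypothetical and
exist purely for illustration. The logic just restates the conditions as
described in this thread and has not been checked against the actual
osd-ldiskfs or lfsck implementation.

```python
from dataclasses import dataclass


@dataclass
class DirState:
    """Hypothetical summary of one MDT directory (illustration only)."""
    created_without_fid: bool     # dentries originally lack the FID (e.g. created under 1.8)
    is_empty: bool                # empty directories are not affected
    is_small: bool                # small, i.e. non-htree, per the quoted message
    converted_by_lfsck: bool      # background lfsck added FIDs to the dentries
    moved_with_dirdata: bool      # alternative path: moved with dirdata enabled
    files_added_afterwards: bool  # directory grew after the conversion or move


def has_misplaced_dotdot(d: DirState) -> bool:
    """True if, per this thread, a second '..' entry may sit in the wrong place."""
    return (
        d.created_without_fid
        and not d.is_empty
        and d.is_small
        and (d.converted_by_lfsck or d.moved_with_dirdata)
    )


def may_trigger_lbug(d: DirState) -> bool:
    """The LBUG additionally requires files to be added afterwards."""
    return has_misplaced_dotdot(d) and d.files_added_afterwards


# Example: an old, small, non-empty directory converted by lfsck and then grown.
example = DirState(
    created_without_fid=True,
    is_empty=False,
    is_small=True,
    converted_by_lfsck=True,
    moved_with_dirdata=False,
    files_added_afterwards=True,
)
print(may_trigger_lbug(example))
```

Note that converted_by_lfsck and moved_with_dirdata are treated as
alternatives here, which reflects my current understanding rather than a
confirmed fact.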

Martin

