[lustre-devel] LDISKFS-fs error: osd_iget: special inode unallocated, Remounting filesystem read-only

Tue Nov 28 12:03:47 PST 2023

I would suggest patching the ext4 iget() to print the requested inode number, and lu_object_find_at() to print the FID.

I suspect that the fix would be to make lu_object_find_at() just handle the -EFSCORRUPTED error like -ENOENT, and consider the FID bad.
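
A minimal userspace sketch of that mapping, not the actual lu_object_find_at() code: the helper name fid_lookup_rc() is hypothetical, and EFSCORRUPTED is defined locally as 117 to match the rc seen in the quoted LFSCK log (the kernel aliases it to EUCLEAN).

```c
#include <errno.h>

/*
 * Assumption: EFSCORRUPTED is not exported to userspace; the kernel
 * defines it as EUCLEAN, which is 117 on Linux (rc = -117 in the log).
 */
#ifndef EFSCORRUPTED
#define EFSCORRUPTED 117
#endif

/*
 * Hypothetical helper: fold "inode is corrupted/unallocated" into
 * "no such object", so the caller treats the FID as bad instead of
 * propagating a corruption error. All other codes pass through.
 */
static inline int fid_lookup_rc(int rc)
{
	if (rc == -EFSCORRUPTED)
		return -ENOENT;
	return rc;
}
```

This only covers the return-code side; it does nothing about the LDISKFS-fs error that remounts the filesystem read-only.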

The iget() error needs to be avoided as well (ideally with flags instead of a patch), so the bad inode lookup doesn't cause the filesystem to go read-only.  

AFAIK, it is "legal" for knfsd to do inode lookups with bad inode numbers, so possibly we need to filter "special" inode numbers (except root) in osd-ldiskfs to avoid the error?  Knowing which inode number is being accessed would help here.
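
One possible shape for such a filter, as a userspace sketch only: osd_ino_is_filtered() is a hypothetical name, the constants mirror ext4's EXT4_ROOT_INO (2) and EXT4_GOOD_OLD_FIRST_INO (11), and a real version would presumably read the first non-reserved inode number from the superblock rather than take it as a parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror ext4's EXT4_ROOT_INO and EXT4_GOOD_OLD_FIRST_INO. */
#define ROOT_INO           2U
#define GOOD_OLD_FIRST_INO 11U

/*
 * Hypothetical filter: reject lookups of reserved ("special") inode
 * numbers, except the root inode, before they reach iget().
 * first_ino is the first non-reserved inode number of the filesystem.
 */
static inline bool osd_ino_is_filtered(uint32_t ino, uint32_t first_ino)
{
	return ino < first_ino && ino != ROOT_INO;
}
```

Assuming the default first inode of 11, inode #195 from the quoted log falls outside the reserved range, so a filter like this would not catch that particular lookup by itself.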

Cheers, Andreas

> On Nov 28, 2023, at 12:47, Cyrus Ramavarapu via lustre-devel <lustre-devel at lists.lustre.org> wrote:
> 
> Hello,
> 
> I have recently started seeing sanity-lfsck failures in tests 18g, 23b, and 23c on Ubuntu 20.04 5.15.0-1051-azure due to the MDT filesystem going read-only, which prevents either the start of LFSCK or LFSCK operations. In all cases, logs on the MDS show the following:
> 
> Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: LDISKFS-fs error (device dm-0): osd_iget:500: inode #195: comm mdt03_003: iget: special inode unallocated
> Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: Aborting journal on device dm-0-8.
> Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: LustreError: 29024:0:(osd_handler.c:1787:osd_trans_commit_cb()) transaction @0x00000000a2d278af commit error: 2
> Nov 20 20:06:59 e72f0907-59ba-4ffd-9528-2e3ad47050e4-mdsmgs-a0-vm kernel: LDISKFS-fs (dm-0): Remounting filesystem read-only
> 
> LFSCK operations, if they start, fail with error code 117 (EFSCORRUPTED):
> 
> 00000020:00000001:8.0:1700166322.660540:0:43212:0:(lu_object.c:908:lu_object_find_at()) Process leaving (rc=18446744073709551499 : -117 : ffffffffffffff8b)
> 00100000:00000001:8.0:1700166322.660541:0:43212:0:(lfsck_layout.c:3241:lfsck_layout_scan_orphan_one()) Process leaving via out (rc=18446744073709551499 : -117 : 0xffffffffffffff8b)
> 
> In both cases, the error comes from an ldiskfs_iget operation which passes the LDISKFS_IGET_SPECIAL flag to __ext4_iget. A recent ext4 patch started checking for this flag and will return EFSCORRUPTED if the inode is unallocated (https://lkml.kernel.org/stable/20230320145452.175177331@linuxfoundation.org/ ).
> 
> Always passing LDISKFS_IGET_SPECIAL to ldiskfs_iget was introduced as part of LU-13166 (https://review.whamcloud.com/c/fs/lustre-release/+/37421 ) and feels too broad to me in the context of the upstream ext4 change. At the moment I am investigating removing the LDISKFS_IGET_SPECIAL flag from ldiskfs_iget to see how it impacts the LFSCK tests and to determine whether a more targeted change can satisfy the intent of LU-13166.
> 
> Any suggestions or thoughts on how to approach this problem would be greatly appreciated. Additional logs or debugging information can be provided if needed.
> 
> Thank you and best,
> Cyrus Ramavarapu