[Lustre-discuss] fsck of OST problems - endless loop restarting pass 1

Andreas Dilger adilger at sun.com
Wed Dec 2 14:27:32 PST 2009


On 2009-12-02, at 11:51, Craig Prescott wrote:
>> You may want to disable the group descriptor checksums with:
>>
>> debugfs -R "feature ^uninit_bg" {dev}
>>
>> and then retry the mount and/or e2fsck.  This feature is making it more
>> difficult to use the backup descriptors for some reason.
>
> The debugfs command didn't take - uninit_bg still showed up in
> "filesystem features" if I ran 'stats' under debugfs interactively.
>
> But 'tune2fs -O ^uninit_bg /dev/F3P1L0/T2-F3P1L0' did work.
>
> Unfortunately, mounting the device as ldiskfs still didn't work; from
> the syslog:
>
> LDISKFS-fs error (device dm-7): ldiskfs_check_descriptors: Checksum for group 0 failed (0!=29388)
>
> LDISKFS-fs: group descriptors corrupted!
>
> Note that the group descriptor checksum inequality message in the
> syslog is changed - (0!=29388) is what we get now, versus (18306!=0)
> when group descriptor checksums were enabled.
>
> I still haven't had any luck with fsck.
>
> Do you have any other ideas?


Hmm, the code shouldn't be verifying the checksums if the uninit_bg
feature is not enabled.  I believe this was already fixed in ext4.

Change ldiskfs_group_desc_csum_verify() to:

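/*
 * Return 1 if the group descriptor is acceptable, 0 on checksum mismatch.
 * The checksum is only compared when the GDT_CSUM (uninit_bg) feature
 * is set in the superblock.
 */
int ldiskfs_group_desc_csum_verify(struct ext4_sb_info *sbi,
                                    __u32 block_group,
                                    struct ext4_group_desc *gdp)
{
         if ((sbi->s_es->s_feature_ro_compat &
              cpu_to_le32(LDISKFS_FEATURE_RO_COMPAT_GDT_CSUM)) &&
             (gdp->bg_checksum !=
              ldiskfs_group_desc_csum(sbi, block_group, gdp)))
                 return 0;
         return 1;
}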

This should allow you to mount the filesystem.
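
To see why the feature test matters, here is a minimal userspace sketch of
the same decision logic.  It is not the ldiskfs code: the names
sketch_desc_csum_ok(), sketch_demo() and SKETCH_RO_COMPAT_GDT_CSUM are
hypothetical, the flag value is only illustrative, and plain integers stand
in for the superblock and descriptor fields (endianness is ignored).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for LDISKFS_FEATURE_RO_COMPAT_GDT_CSUM
 * (the value here is illustrative only). */
#define SKETCH_RO_COMPAT_GDT_CSUM 0x0010u

/*
 * Return 1 if the descriptor passes, 0 on a checksum mismatch.
 * The two checksums are compared only when the feature bit is set.
 */
int sketch_desc_csum_ok(uint32_t ro_compat, uint16_t stored,
                        uint16_t computed)
{
        if ((ro_compat & SKETCH_RO_COMPAT_GDT_CSUM) && stored != computed)
                return 0;
        return 1;
}

/* Exercise the three cases using the two values from the syslog line. */
void sketch_demo(void)
{
        /* Feature cleared: differing checksums are ignored. */
        assert(sketch_desc_csum_ok(0, 29388, 0) == 1);
        /* Feature set: differing checksums are rejected. */
        assert(sketch_desc_csum_ok(SKETCH_RO_COMPAT_GDT_CSUM, 29388, 0) == 0);
        /* Feature set: matching checksums are accepted. */
        assert(sketch_desc_csum_ok(SKETCH_RO_COMPAT_GDT_CSUM, 29388, 29388) == 1);
}
```

With the feature bit cleared, as after the tune2fs command above, the
mismatch (0 versus 29388) no longer causes a failure, which is the
behaviour the patched function is meant to give.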

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



