[Lustre-discuss] fsck of OST problems - endless loop restarting pass 1
Craig Prescott
prescott at hpc.ufl.edu
Wed Dec 2 16:16:35 PST 2009
Andreas Dilger wrote:
> Hmm, the code shouldn't be checking the checksums if the uninit_bg
> feature is not enabled. I believe this was fixed in ext4 already:
>
> in ldiskfs_group_desc_csum_verify() change it to be:
>
> int ldiskfs_group_desc_csum_verify(struct ext4_sb_info *sbi,
>                                    __u32 block_group,
>                                    struct ext4_group_desc *gdp)
> {
>         if ((sbi->s_es->s_feature_ro_compat &
>              cpu_to_le32(LDISKFS_FEATURE_RO_COMPAT_GDT_CSUM)) &&
>             (gdp->bg_checksum != ldiskfs_group_desc_csum(sbi,
>                                                          block_group, gdp)))
>                 return 0;
>         return 1;
> }
Ok, thanks. I'll try that.
Here's what the 1.8.1.1 ldiskfs_group_desc_csum_verify() looks like
(from lustre-ldiskfs-3.0.9/ldiskfs/super.c):
int ldiskfs_group_desc_csum_verify(struct ldiskfs_sb_info *sbi,
                                   __u32 block_group,
                                   struct ldiskfs_group_desc *gdp)
{
        return (gdp->bg_checksum ==
                ldiskfs_group_desc_csum(sbi, block_group, gdp));
}
(This is what I get after running 'rpmbuild -bc lustre-ldiskfs.spec' on
lustre-ldiskfs-3.0.9-2.6.18_128.7.1.el5_lustre.1.8.1.1.src.rpm.)
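To make the difference between the two versions concrete, here is a rough
standalone sketch. It is not the ldiskfs code: the struct layouts, the
sketch_* names and the toy checksum are hypothetical stand-ins, the flag
value 0x0010 is assumed to match the GDT_CSUM (uninit_bg) ro_compat bit, and
the cpu_to_le32() byte-order handling is left out. It only illustrates that
the unconditional compare rejects a descriptor whose checksum field does not
match (for example because it was never filled in), while the gated version
accepts it when the feature is off.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of the GDT_CSUM (uninit_bg) ro_compat feature bit. */
#define SKETCH_RO_COMPAT_GDT_CSUM 0x0010u

/* Hypothetical, stripped-down stand-ins for the superblock info and the
 * group descriptor; the real structures have many more fields. */
struct sketch_sb {
        uint32_t feature_ro_compat;
};

struct sketch_gd {
        uint32_t block_bitmap;
        uint16_t checksum;
};

/* Toy checksum for illustration only; the real code computes a crc16. */
uint16_t sketch_csum(uint32_t group, const struct sketch_gd *gd)
{
        return (uint16_t)((group * 31u) ^ gd->block_bitmap);
}

/* Shape of the 1.8.1.1 function: always compare, feature flag ignored. */
int sketch_verify_unconditional(const struct sketch_sb *sb, uint32_t group,
                                const struct sketch_gd *gd)
{
        (void)sb;
        return gd->checksum == sketch_csum(group, gd);
}

/* Shape of the suggested change: a mismatch only counts when the
 * feature bit is set in the superblock. */
int sketch_verify_gated(const struct sketch_sb *sb, uint32_t group,
                        const struct sketch_gd *gd)
{
        if ((sb->feature_ro_compat & SKETCH_RO_COMPAT_GDT_CSUM) &&
            gd->checksum != sketch_csum(group, gd))
                return 0;
        return 1;
}

/* Walks through the three cases of interest. */
void sketch_self_check(void)
{
        struct sketch_sb sb = { 0 };           /* feature not enabled */
        struct sketch_gd gd = { 0x1234u, 0 };  /* checksum never written */

        assert(!sketch_verify_unconditional(&sb, 7, &gd));
        assert(sketch_verify_gated(&sb, 7, &gd));

        sb.feature_ro_compat = SKETCH_RO_COMPAT_GDT_CSUM;
        assert(!sketch_verify_gated(&sb, 7, &gd)); /* now a real mismatch */

        gd.checksum = sketch_csum(7, &gd);
        assert(sketch_verify_gated(&sb, 7, &gd));  /* valid checksum passes */
}
```

Calling sketch_self_check() from a small main() exercises all three cases;
with the feature bit clear, only the unconditional version reports a failure.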
The problematic OST is direct-attached to a running OSS that has ldiskfs.ko
loaded (the OST itself is marked inactive). I'll have to wait at least
until tomorrow for an opportunity to deploy and reload an updated
ldiskfs.ko.
Again, I really appreciate the help, and will let the list know how it goes.
Thanks,
Craig Prescott
UF HPC Center