[Lustre-discuss] OST crash with group descriptors corrupted
Brian J. Murrell
Brian.Murrell at Sun.COM
Mon Mar 9 11:13:15 PDT 2009
On Mon, 2009-03-09 at 19:39 +0800, thhsieh wrote:
> Dear All,
>
> We have an emergency on the Lustre filesystem.
>
> But today
> we encountered a disk array hardware problem (one of the hard disks
> in the RAID 6 disk array failed), and soon after that the Lustre
> filesystem crashed, too.
> The dmesg message shows:
>
> [ 3314.530762] LDISKFS-fs error (device sdb1): ldiskfs_check_descriptors: Block bitmap for group 11152 not in group (block 3407085568)!
> [ 3314.531701] LDISKFS-fs: group descriptors corrupted!
It looks like your disk error has resulted in on-disk corruption.
AFAIK, RAID is supposed to prevent this. No idea why it didn't in this
case. Maybe check with your RAID vendor.
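
For reference, the dmesg line quoted above comes from a sanity check on the
group descriptors: the block bitmap of a group is expected to live inside that
group's own block range. The sketch below redoes that arithmetic for the
reported values. The geometry is an assumption, not something stated in this
thread: 32768 blocks per group and first data block 0, which is what a 4 KiB
block size would give. The helper names are hypothetical, and the real values
should be read from the superblock (for example with dumpe2fs).

```python
import re

# ASSUMED geometry (not given in the thread): 4 KiB blocks,
# hence 32768 blocks per group and first data block 0.
BLOCKS_PER_GROUP = 32768
FIRST_DATA_BLOCK = 0

# The error line exactly as quoted from dmesg (timestamp dropped).
LINE = ("LDISKFS-fs error (device sdb1): ldiskfs_check_descriptors: "
        "Block bitmap for group 11152 not in group (block 3407085568)!")

PATTERN = re.compile(r"Block bitmap for group (\d+) not in group \(block (\d+)\)")


def group_block_range(group):
    """Return (first, last) block of a group under the assumed geometry.

    The last group of a real filesystem may be shorter; that is ignored here.
    """
    first = FIRST_DATA_BLOCK + group * BLOCKS_PER_GROUP
    return first, first + BLOCKS_PER_GROUP - 1


def check_line(line):
    """Parse the error line and report whether the bitmap block is in range."""
    match = PATTERN.search(line)
    if match is None:
        return None
    group, block = int(match.group(1)), int(match.group(2))
    first, last = group_block_range(group)
    return {
        "group": group,
        "block": block,
        "first": first,
        "last": last,
        "in_group": first <= block <= last,
    }


if __name__ == "__main__":
    print(check_line(LINE))
```

Under these assumptions the recorded bitmap block lies far outside the range
of group 11152, which is consistent with a damaged descriptor rather than a
misplaced but valid bitmap.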
> It seems that the backend ext3 file system is still there, but has
> errors.
Indeed.
> Could anyone suggest a way to recover the OST partitions? Can I use
> e2fsck to fix the problems of the OST partitions?
Yes, e2fsck should correct the problem(s). Be aware that there is a
possibility that the only way for e2fsck to correct the state of the
filesystem is to (re-)move data from the filesystem. To what extent
will depend completely on how much on-disk corruption has taken place.
You can get an idea of what e2fsck will do without actually doing
anything to the disk data by giving it the "-n" argument. You can
decide based on that "dry-run" e2fsck output whether the corrective
action it will take is acceptable to you.
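
A minimal sketch of such a dry run, wrapped in Python so the exit status can
be decoded afterwards. The device path /dev/sdb1 is taken from the dmesg
output in this thread; the "-f" flag (force a check even if the filesystem
looks clean) is an addition of this sketch, and the helper names are
hypothetical. It assumes the OST is unmounted and that the e2fsck in use is
the one from the Lustre-patched e2fsprogs. The exit status bits are those
documented in the e2fsck man page.

```python
import subprocess

# Exit status bits as documented in the e2fsck man page.
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def dry_run_command(device):
    """Build a read-only check: -n answers 'no' to every question,
    -f forces the check even if the filesystem is marked clean."""
    return ["e2fsck", "-f", "-n", device]


def describe_exit_status(status):
    """Translate an e2fsck exit status into human-readable messages."""
    if status == 0:
        return ["no errors"]
    return [msg for bit, msg in E2FSCK_EXIT_BITS.items() if status & bit]


def dry_run(device):
    """Run the dry-run check and return (exit status, combined output).

    Nothing is written to the device because of -n.
    """
    proc = subprocess.run(dry_run_command(device),
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


if __name__ == "__main__":
    status, output = dry_run("/dev/sdb1")  # device name from the dmesg line
    print(output)
    print(describe_exit_status(status))
```

Since "-n" never repairs anything, a corrupted filesystem would be expected to
report errors left uncorrected; the interesting part is the list of proposed
fixes in the output, which is what to review before running a real repair.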
b.