[Lustre-discuss] recover borked mds

Andreas Dilger adilger at sun.com
Wed Aug 19 20:04:46 PDT 2009


On Aug 19, 2009  12:57 -0400, Brock Palen wrote:
> After a network event (switches bouncing) looks like our mds got  
> borked somewhere, from all the random failovers (switches came up and  
> down rapidly over a few hours).
> 
> Now we cannot mount the MDS; when we try, we get the following errors:
>
> Aug 19 12:37:43 mds2 kernel: LustreError: 7525:0:(llog_lvfs.c:612:llog_lvfs_create())
> error looking up logfile 0xf150010:0x80d24629: rc -2

Looks like a problem initializing the orphan object cleanup log.
This shouldn't be fatal, in that a new log file could be created.
But, it dutifully reports the error up the (long) chain and fails
the mount.
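
As a side note on reading the message: "rc -2" is a negated Linux errno
value, and errno 2 is ENOENT ("No such file or directory"), i.e. the log
file being looked up was not found. A minimal sketch of decoding it is
below. The function name decode_rc is made up for illustration, and the
log line is the one quoted above with the mail wrapping undone.

```python
import errno
import os
import re

# Log line quoted in the original report, with the mail wrapping undone.
LOG_LINE = (
    "Aug 19 12:37:43 mds2 kernel: LustreError: 7525:0:(llog_lvfs.c:"
    "612:llog_lvfs_create()) error looking up logfile "
    "0xf150010:0x80d24629: rc -2"
)


def decode_rc(line):
    """Return (rc, errno name, description) for the trailing 'rc N' field.

    Hypothetical helper: assumes rc is a negated Linux errno value.
    """
    match = re.search(r"rc (-?\d+)\s*$", line)
    if match is None:
        raise ValueError("no 'rc N' field at end of line")
    rc = int(match.group(1))
    code = abs(rc)
    return rc, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    rc, name, text = decode_rc(LOG_LINE)
    print(f"rc {rc}: {name} ({text})")
```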

> We have run e2fsck on the volume, found a few errors and corrected them.
> But the problem persists.  We also tried mounting with -o abort_recov;
> this resulted in an assertion (LBUG) and does not work.

That shouldn't happen either...

> Any thoughts?  The lines:

You can delete the "CATALOGS" file on the MDS; it should then start up OK.
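
One possible way to do that is sketched below, as a dry run that only
prints the commands. It rests on assumptions not stated above: that the
MDS service is stopped, that the MDT backing filesystem is ldiskfs and can
be mounted directly, and that CATALOGS sits at the top level of that
filesystem. The device and mount point are placeholders, and moving the
file aside instead of deleting it is a cautious variation rather than part
of the advice above.

```python
import shlex


def catalogs_removal_commands(device, mountpoint):
    """Build the commands to move CATALOGS aside on a stopped MDT.

    Hypothetical helper: both arguments are site-specific, and nothing
    is executed here.
    """
    catalogs = f"{mountpoint}/CATALOGS"
    return [
        # Mount the backing filesystem directly, not as Lustre
        # (assumption: the MDT is ldiskfs).
        ["mount", "-t", "ldiskfs", device, mountpoint],
        # Keep a copy rather than deleting outright (cautious variation).
        ["mv", catalogs, catalogs + ".bak"],
        # Unmount so the MDT can be mounted as Lustre again afterwards.
        ["umount", mountpoint],
    ]


if __name__ == "__main__":
    # Placeholder device and mount point; substitute your own.
    for cmd in catalogs_removal_commands("/dev/mdtdev", "/mnt/mdt"):
        print(shlex.join(cmd))
```

After checking the printed commands against your own setup, run them by
hand as root, then try mounting the MDS normally.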

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



