[Lustre-discuss] Lustre Mount Crashing

Andreas Dilger adilger at sun.com
Tue Jun 3 13:20:05 PDT 2008


On Jun 02, 2008  19:51 -0400, Charles Taylor wrote:
> Wow, you are one powerful witch doctor.  So we rebuilt our system disk
> (just to be sure) and that made no difference; we still panicked as soon as
> we mounted the MDT.  The "-o abort_recov" did not help either.  However,
> your recipe below worked wonders... almost.  Now we can mount the MDT
> but it does not go into recovery.  It just shows as "inactive".  We
> are so close I can taste it, but what are we doing wrong now?
>
>
> [root@hpcmds lustre]# cat /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status
> status: INACTIVE
>
>
> Which tire do we kick now?   :)

Well, deleting the tail of the last_rcvd file is the "hard" way to tell
the MDT/OST it is no longer in recovery...  The deleted part of the file
is where the per-client state is kept, so when it is removed the MDT
decides no recovery is needed.
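
To make the effect of the dd step in the quoted recipe concrete, here is a
minimal sketch that runs the same copy-then-truncate sequence on a scratch
file rather than on a real MDT. The 64 KiB file of zeros and the variable
names (workdir, before, after) are assumptions for illustration only; a real
last_rcvd has its own contents and size.

```shell
#!/bin/sh
# Sketch only: works on a scratch file in a temp directory, NOT on a real MDT.
set -eu

workdir=$(mktemp -d)

# Stand-in for last_rcvd: 8 blocks of 8 KiB (the size is an arbitrary assumption).
dd if=/dev/zero of="$workdir/last_rcvd" bs=8k count=8 2>/dev/null

# Save a copy first, then rewrite the file from the copy, keeping only
# the first 8 KiB block, as in the recipe. dd truncates the output file
# by default, so everything past the first block is dropped.
cp "$workdir/last_rcvd" "$workdir/last_rcvd.sav"
dd if="$workdir/last_rcvd.sav" of="$workdir/last_rcvd" bs=8k count=1 2>/dev/null

# Arithmetic expansion strips any padding that wc may add.
before=$(($(wc -c < "$workdir/last_rcvd.sav")))
after=$(($(wc -c < "$workdir/last_rcvd")))
echo "backup: $before bytes, truncated: $after bytes"
```

The saved copy is left untouched, so the original file can be put back if
the truncated one does not help.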

The "recovery_status" being "INACTIVE" is somewhat misleading.  It means
"no recovery is currently active", but the MDT is up and you should be
able to use it, with the caveat that clients previously doing operations
will get an I/O error for in-flight operations before they start afresh...
However, you said the clients are powered off, so they probably aren't
busy doing anything...
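
If you want to check this from a script, something along these lines should
do. It is only a sketch: it assumes the file has a "status:" line as in your
output above, and the STATUS_FILE variable and recovery_state function are
names made up for this example.

```shell
#!/bin/sh
# Print the value of the "status:" line from a recovery_status file.
recovery_state() {
    # $1: path to a recovery_status file
    awk '$1 == "status:" { print $2; exit }' "$1"
}

# Default path taken from the output quoted above; STATUS_FILE is a
# hypothetical override for other targets.
status_file=${STATUS_FILE:-/proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status}

if [ -r "$status_file" ]; then
    state=$(recovery_state "$status_file")
    if [ "$state" = "INACTIVE" ]; then
        # INACTIVE means no recovery is currently active, not that the MDT is down.
        echo "no recovery in progress; target should be usable"
    else
        echo "recovery status: $state"
    fi
fi
```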

If you had a more complete stack trace it would be useful to determine
what is actually going wrong with the mount.

> On Jun 2, 2008, at 3:36 PM, Andreas Dilger wrote:
>> If mounting with "-o abort_recov" doesn't solve the problem,
>> are you able to mount the MDT filesystem as "-t ldiskfs" instead of
>> "-t lustre"?  Try that, then copy and truncate the last_rcvd file:
>>
>> 	mount -t ldiskfs /dev/MDSDEV /mnt/mds
>> 	cp /mnt/mds/last_rcvd /mnt/mds/last_rcvd.sav
>> 	cp /mnt/mds/last_rcvd /tmp/last_rcvd.sav
>> 	dd if=/mnt/mds/last_rcvd.sav of=/mnt/mds/last_rcvd bs=8k count=1
>> 	umount /mnt/mds
>>
>> 	mount -t lustre /dev/MDSDEV /mnt/mds
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>>

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



