[Lustre-discuss] OSS misconfig and client connect

James Robnett jrobnett at aoc.nrao.edu
Wed Jul 31 12:44:06 PDT 2013

Many thanks Cliff.  One other question ...

The manual is fairly insistent that the filesystem be unmounted
on the clients as well as on the OSTs, MDT, etc.

The vast majority of our clients are Infiniband and working fine.

Given the nature of the problem (i.e. the new OSS only knew about
IB), do you think it's critical that we unmount the IB clients?

The 1gbit and 10gbit clients are naturally unmounted due to the
kernel panic.

My preference would be to simply unmount each OST on all the OSSes,
unmount the MDT, run writeconf on the MDS/MGS, remount the MDT
and then mount the OSTs.  I'd leave the IB-connected clients
alone.  They would restore connectivity after the MDS and OSSes
came back up.
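
A rough sketch of that sequence, written as a dry run that only prints
the commands in order and executes nothing.  The host names (oss1,
oss2), the mount points, and the assumption that each OST has an
/etc/fstab entry are placeholders for illustration, not a real
configuration; only /dev/md0 comes from the earlier message.  It
follows the proposal above of running writeconf on the MDS/MGS only,
so check it against the manual's procedure before relying on it.

```shell
#!/bin/sh
# Dry-run sketch: prints the planned commands, runs none of them.
# Everything marked "placeholder" is hypothetical.
MDT_DEV=/dev/md0                  # MDT device from the earlier message
MDT_MNT=/mnt/mdt                  # placeholder mount point
OSS_HOSTS="oss1 oss2"             # placeholder OSS host names
OST_MNTS="/mnt/ost0 /mnt/ost1"    # placeholder OST mount points per OSS

plan_writeconf() {
    # 1. Unmount every OST on every OSS.
    for host in $OSS_HOSTS; do
        for mnt in $OST_MNTS; do
            echo "ssh $host umount $mnt"
        done
    done
    # 2. Unmount the MDT on the combined MDS/MGS.
    echo "umount $MDT_MNT"
    # 3. Regenerate the configuration logs on the MDS/MGS.
    echo "tunefs.lustre --writeconf $MDT_DEV"
    # 4. Remount the MDT first, so the MGS is up before the OSTs return.
    echo "mount -t lustre $MDT_DEV $MDT_MNT"
    # 5. Remount the OSTs (assumes an fstab entry for each mount point).
    for host in $OSS_HOSTS; do
        for mnt in $OST_MNTS; do
            echo "ssh $host mount $mnt"
        done
    done
}

plan_writeconf
```

Printing the plan first makes it easy to confirm the ordering (OSTs
down, MDT down, writeconf, MDT up, OSTs up) before touching anything.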

The whole process would only take a few minutes, less than the
recovery time.

Or do you think I'm just asking for trouble and should shut
everything down?  That's a painful process for the clients,
but doable.

PS: I assume I have to actually unmount the OSTs.  I could believe,
given this instance, it might be ok/safe to just unmount the
MDS, run writeconf on it, and remount.

Resending since I failed to reply to the list.

On 07/31/2013 12:43 PM, White, Cliff wrote:
> On 7/31/13 10:37 AM, "James Robnett" <jrobnett at aoc.nrao.edu> wrote:
>> I'm now suspicious that I need to unmount all the OSSes (for
>> correctness), unmount the MDS and run
>> tunefs.lustre --writeconf /dev/md0
>> on it to clear the logs and then remount.
>> Note we have a combined MDS/MGS.
> Yes. Since the configuration is held on the MDS, you need to do the
> --writeconf, then remount the servers.
> Procedure should be in the Lustre Manual
> Cliffw
>> James
